Since the launch of the Sharpe Centuri platform in early December 2017, our users have provided over a million predictions on the direction of S&P500 stock and cryptocurrency prices. We have performed an analysis of the first 5 months of predictions and will briefly highlight some insights we have gained into the behaviour of users and the quality of their work.
Sentiment data matured by the 26th of April 2018 (748k predictions) was filtered for active users who had provided a minimum of 500 predictions for at least 4 weeks. The following measurements were observed for this dataset:
Predictions included after the cutoff criteria: 648k
Unique user accounts: 258
Median delay between predictions across all users: 1.0 seconds
Reputation average ± standard deviation: 49.31 ± 1.42
Lowest reputation: 43.47
Highest reputation: 53.43
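As a rough illustration of how measurements like these could be derived, the sketch below applies an activity cutoff and computes the median delay between consecutive predictions. It is not the platform's actual pipeline. The function names and the `(user_id, timestamp)` input layout are hypothetical, and the cutoff is interpreted here as "at least 500 predictions in total, spread over at least 4 distinct calendar weeks", which is one possible reading of the criterion above.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, median, stdev


def active_users(predictions, min_predictions=500, min_weeks=4):
    """Return the set of user ids passing the (assumed) activity cutoff.

    predictions: list of (user_id, timestamp) pairs, timestamp as datetime.
    """
    counts = defaultdict(int)
    weeks = defaultdict(set)
    for user_id, ts in predictions:
        counts[user_id] += 1
        weeks[user_id].add(tuple(ts.isocalendar())[:2])  # (ISO year, ISO week)
    return {
        user_id
        for user_id in counts
        if counts[user_id] >= min_predictions and len(weeks[user_id]) >= min_weeks
    }


def median_delay_seconds(predictions, users):
    """Median gap between consecutive predictions, pooled over the given users."""
    by_user = defaultdict(list)
    for user_id, ts in predictions:
        if user_id in users:
            by_user[user_id].append(ts)
    delays = []
    for stamps in by_user.values():
        stamps.sort()
        delays.extend((b - a).total_seconds() for a, b in zip(stamps, stamps[1:]))
    return median(delays) if delays else None


def reputation_summary(reputations):
    """Mean, sample standard deviation, lowest and highest reputation."""
    return mean(reputations), stdev(reputations), min(reputations), max(reputations)
```

Because the delays are pooled across users, long pauses between sessions sit in the same list as rapid-fire clicks. The median is therefore a more robust summary than the mean. Whether the reported standard deviation is the sample or the population estimate is not stated in the text, and the sample version is assumed here.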
Whilst the majority of predictions (87%) have been provided by users who appear to actively use the platform, a large proportion of users have indeed focused on getting through predictions at a rather brisk pace, spending approximately a second per prediction. Below we show how the delay between predictions relates to the number of predictions and user reputation.
Looking at the number of predictions performed by the users and the resulting accuracy, we found that overall the users have some room for improvement. As users make more predictions, their overall ability to predict 500 different stocks or 200 different cryptocurrency assets approaches a mean accuracy of 50%, suggesting that it is difficult for users to improve the reputation score.
With approximately 3000 predictions per month available to users, it is evident that the vast majority of users (but not all) have just about enough time to read the title of the asset before casting a vote on its direction. Moreover, this appears to have little impact on the resulting reputation score, which to many will suggest that proof-of-work is the easiest way of maximising payouts. As a result, we have been looking for ways to increase the importance of the reputation score without altering the underlying calculations. Many factors could be at play here and we will be considering all of the following scenarios to encourage users to focus on improving their reputation score:
(1) Users do not yet have an overview of their own performance on different assets and the quality of the individual asset sentiment they have been providing. This will be addressed by allowing users to look back through the history of their predictions and learn from their past performance.
(2) Users have too many predictions to do and cannot afford to spend much time per prediction, but want to maximise payout through proof-of-work. We will reduce the number of predictions that need to be completed to alleviate the rush to the finish.
(3) All users currently get the same assets, with no control over which assets are presented to them. From a user experience perspective, we understand that being able to navigate the possible prediction lists by sector or the prediction time scales would encourage users to seek out better opportunities to increase the number of correct predictions, and therefore improve their reputation score over time.
The dataset was further grouped by asset in order to determine whether some assets are better predicted by all users taken together. This approach assumes that we can trust all users to give sentiment on all assets, and that the majority rule would suggest the consensus direction of the price. The matured sentiment was grouped into weekly bins (approx. 100 matured sentiment per asset per week), quantifying the number of users that predicted the asset direction correctly. The top 10 assets ordered by the average sentiment accuracy over a 3-month period, presented below, suggest that there are certain assets whose direction all users taken together can predict by consensus. The average correctness over the 3-month period for those assets is ~60%, possibly giving an edge in predicting the market, but not without false positive predictions made for certain weeks on some assets.
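The weekly binning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code itself: the `(asset, week, is_correct)` record layout and both function names are hypothetical, and each record is taken to be one matured prediction.

```python
from collections import defaultdict
from statistics import mean


def weekly_accuracy(records):
    """Fraction of correct predictions per asset per week.

    records: iterable of (asset, week, is_correct) tuples.
    Returns {asset: {week: fraction_correct}}.
    """
    tallies = defaultdict(lambda: [0, 0])  # (asset, week) -> [correct, total]
    for asset, week, is_correct in records:
        tally = tallies[(asset, week)]
        tally[0] += 1 if is_correct else 0
        tally[1] += 1
    result = defaultdict(dict)
    for (asset, week), (correct, total) in tallies.items():
        result[asset][week] = correct / total
    return dict(result)


def top_assets(accuracy, n=10):
    """Rank assets by their average weekly accuracy, highest first."""
    averages = {asset: mean(weeks.values()) for asset, weeks in accuracy.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]
```

A weekly fraction above 0.5 means the majority vote for that asset and week pointed in the right direction. Averaging the weekly fractions (rather than pooling all predictions) gives every week equal weight regardless of how many predictions matured in it.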
On the other hand, looking into assets that were incorrectly predicted by consensus user sentiment (average <40% over the period), we find that all of them are cryptocurrency assets. It appeared that users were especially good at predicting the direction of the market during the mid-April bull run. Understanding why there are stark differences between the ability of users to predict cryptocurrency and stock markets would require further observation of user predictions over time, including user behaviour when making predictions of these markets against USD.
Interestingly, whilst the majority of users are correct 50% of the time across all assets, as evidenced by the mean reputation score of ~50%, looking deeper into individual user data we found that we can identify single users that may be exceptionally good at predicting specific stocks or crypto assets maturing in the same week. Specifically, we found that approximately a third (~90) of our users have consistently predicted the direction of one or more assets correctly at least 80% of the time over at least 6 independent events. These could be seen as trusted users in predicting their respective assets.
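The "trusted user" rule above (at least 80% correct over at least 6 independent events on a given asset) can be expressed as a simple per-user, per-asset tally. The sketch below is illustrative only: the function name, the `(user_id, asset, is_correct)` input layout and the treatment of each tuple as one independent matured event are assumptions.

```python
from collections import defaultdict


def trusted_pairs(outcomes, min_events=6, min_accuracy=0.8):
    """Find the assets each user predicts consistently well.

    outcomes: iterable of (user_id, asset, is_correct), one per matured event.
    Returns {user_id: set of assets meeting both thresholds}.
    """
    tallies = defaultdict(lambda: [0, 0])  # (user, asset) -> [correct, total]
    for user_id, asset, is_correct in outcomes:
        tally = tallies[(user_id, asset)]
        tally[0] += 1 if is_correct else 0
        tally[1] += 1
    trusted = defaultdict(set)
    for (user_id, asset), (correct, total) in tallies.items():
        if total >= min_events and correct / total >= min_accuracy:
            trusted[user_id].add(asset)
    return dict(trusted)
```

With a threshold of only 6 events, some pairs will pass by chance when many users and assets are screened, which is consistent with the caution about small sample sizes expressed in this analysis.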
We found that relying on data provided by trusted users gives a significant boost in determining the direction of the market, with over 80% of a sample of 5–10 of those trusted users agreeing on the price direction for these assets each week. It is worth noting that the sample size for interpreting the quality of predictions at the individual asset level is small for most assets, and we will revisit this in a couple of months once more data has been collected. This approach does however suggest that certain users excel in particular areas of knowledge, and we can capture this by monitoring their consistency across different assets in order to get a better understanding of the direction of each asset across such users. In the future, this gives us the opportunity to present users with assets that they predict consistently better.
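To make the notion of agreement among a small sample of users concrete, the following sketch computes the majority direction and the share of votes backing it for one asset in one week. The function name and the `"up"`/`"down"` labels are hypothetical, and the votes are assumed to come only from users already identified as trusted for that asset.

```python
from collections import Counter


def consensus(votes):
    """Majority direction and the share of votes supporting it.

    votes: list of direction labels (e.g. "up" or "down").
    Returns (direction, share), or None when there are no votes.
    On a tie, the direction that appears first in the list is returned.
    """
    if not votes:
        return None
    direction, count = Counter(votes).most_common(1)[0]
    return direction, count / len(votes)
```

For a sample of 5 to 10 users, a share above 0.8 corresponds to at most one dissenting vote, so a single additional vote can move an asset across that threshold.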
During the coming months we will be rolling out updates that address the first scenario, targeting the significance of the reputation score (alongside other UI/UX updates). However, we believe the last two scenarios cannot be addressed separately from one another and are likely to be implemented together in a single update towards the end of summer. The ultimate goal is to bring an engaging experience for the users and improve the overall quality of individual sentiment provided.
Overall, there are fascinating consistencies arising among approximately a dozen users over a handful of assets on the platform. With the marketing activity kicking off this summer we will expand the user base and make improvements to give users more opportunities to learn and improve their quality of work. The product is still in the early stages of its life cycle and we will continue to monitor the performance of users to better understand their interactions with the platform in order to make Sharpe Centuri a rewarding experience for everyone.