The power and risks of alternative data



To some, it may seem that you need psychic powers to predict whether an investment will be profitable. These days, however, you don’t need to be a fortune teller to make accurate predictions; the key to good investments and profitable financial decisions lies in making sound predictions through the use of data.

With the advent of new technologies, traditional data sources such as financial statements, management presentations, SEC filings, analysts' forecasts and press releases are no longer sufficient to gain a competitive edge. In this endless race to outperform others, the importance of alternative data has increased tremendously.

Alternative data refers to non-traditional insights that investors obtain from numerous data sources, ranging from user behaviour to feedback from internet of things (IoT) sensors, to make investment decisions about a company's future performance. The demand for and adoption of alternative data have been growing over the past few years, and more so in 2020. Its use is likely to increase in the coming years, especially as interest in the investment decisions of retail investors and stay-at-home traders is growing and becoming increasingly difficult to ignore.

Sources of Alternative Data (Alt-Data)

One of the most prevalent sources of alt-data is information on individuals and their behavioural patterns. Alt-data is essentially data derived from external sources, and it can provide up-to-date information that fills the gaps left by traditional data. According to Grand View Research, the global alternative data market is expected to reach $17.35 billion by 2027, up from last year’s $1.64 billion, at a compound annual growth rate (CAGR) of 40.1 per cent.
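As a quick sanity check on those figures, the short Python sketch below computes the growth rate implied by the two market sizes. It assumes seven compounding years between the $1.64 billion and $17.35 billion figures, which the article does not state explicitly; the function name `cagr` is ours and purely illustrative.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.40 means 40 per cent)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes quoted in the article, in billions of US dollars.
start, end = 1.64, 17.35

# Assumption: seven compounding years separate the two figures.
years = 7

# Growth rate implied by the two market sizes.
rate = cagr(start, end, years)

# Reverse check: compound the starting figure at the quoted 40.1 per cent.
projected = start * (1 + 0.401) ** years

print(f"Implied CAGR: {rate:.1%}")
print(f"$1.64bn compounded at 40.1% for {years} years: ${projected:.2f}bn")
```

Under that seven-year assumption, the quoted growth rate and the two market sizes are consistent with each other to within rounding.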

We, as individuals, provide alternative data across various platforms. For instance, our usage of smartphone applications and IoT-enabled gadgets is monitored, alongside data from sensors and satellites, and data on user engagement is continuously collected and sold to interested third parties. In addition, the transaction data generated by our credit cards, debit cards and other monetary exchanges provides deep insights into our spending patterns and hobbies.

In capital markets, too, there are numerous alternative data categories, but the most impactful is data that gauges (and can alter) sentiment on social media platforms, open communities and crowdsourcing platforms. Applying analytics to the gathered data gives investors access to knowledge that was previously unavailable. This allows them to better evaluate investment opportunities, making alt-data a valuable tool for investment management firms seeking alpha, that is, excess returns.

The impact of alternative data has been highlighted recently by the GameStop stock fiasco. The stock price of GameStop, a video game retailer based in Grapevine, Texas, rose from $20 to $483 in just two weeks in January 2021.

The price skyrocketed after a group of amateur day traders spotted an opportunity and capitalised on it: hedge funds were short-selling GameStop stock, with more shares shorted than were available in the market. Together with other traders on Reddit, the day traders deliberately accumulated as much GameStop stock as possible to push the price upwards. Amplified by social media, including Elon Musk’s “Gamestonk” tweet, GameStop’s stock price soared. Sure enough, everyone jumped on the bandwagon, the price spiked, and the hedge funds lost significant sums covering their short positions. A paper by the Institute for New Economic Thinking at Oxford found that users who comment on a discussion involving a particular asset are four times more likely to start a new discussion on that asset in the future, suggesting that investment strategies are formed through social interaction.

The influence of alternative data and social media was also demonstrated in January, when Musk tweeted “use Signal”, referring to the centralised encrypted messaging app, and the share price of Signal Advance Inc. (an unrelated medical device company) soared. He has moved prices again and again with Clubhouse, Dogecoin and Bitcoin.

Data firm Thinknum saw GameStop mentions increase days before the price went up and launched its Reddit-focused data set. Another data firm, Stockpulse, saw the buzz in early December 2020. According to Quiver Quantitative’s backtest, a strategy of buying and holding the five companies most mentioned on WallStreetBets in the previous week could have returned 61 per cent in 2020.
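To make the idea of a mention-based strategy concrete, here is a minimal Python sketch of how such a backtest might be structured. It is not Quiver Quantitative's method or data: the tickers, mention lists and weekly returns are invented placeholders, the portfolio is assumed to be equal-weighted and rebalanced weekly, and trading costs are ignored.

```python
from collections import Counter


def top_mentioned(mentions: list[str], n: int = 5) -> list[str]:
    """Return the n most-mentioned tickers; ties are broken alphabetically."""
    counts = Counter(mentions)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [ticker for ticker, _ in ranked[:n]]


def backtest(weekly_mentions: list[list[str]],
             weekly_returns: list[dict[str, float]],
             n: int = 5) -> float:
    """Compound the return of holding, each week, last week's top-n tickers."""
    growth = 1.0
    # Pair each week's mentions with the FOLLOWING week's returns.
    for mentions, returns in zip(weekly_mentions, weekly_returns[1:]):
        picks = [t for t in top_mentioned(mentions, n) if t in returns]
        if picks:
            # Assumption: equal weight across the picked tickers.
            growth *= 1.0 + sum(returns[t] for t in picks) / len(picks)
    return growth - 1.0


# Hypothetical data: three weeks of forum mentions and weekly returns.
weekly_mentions = [
    ["AAA", "AAA", "BBB", "CCC"],
    ["BBB", "BBB", "CCC", "AAA"],
    ["CCC"],
]
weekly_returns = [
    {"AAA": 0.00, "BBB": 0.00, "CCC": 0.00},
    {"AAA": 0.10, "BBB": -0.05, "CCC": 0.02},
    {"AAA": -0.02, "BBB": 0.04, "CCC": 0.01},
]

total = backtest(weekly_mentions, weekly_returns, n=2)
print(f"Total return over the sample: {total:.2%}")
```

The key design point is the one-week lag: each week's holdings are chosen only from mentions observed in the previous week, so the sketch never uses information that would not have been available at the time of the trade.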

There is also the “wisdom of crowds”, a concept which holds that a group or crowd is more rational and intelligent in making predictions than an individual. This raises the question of whether what happened with GameStop was the result of such collective thinking.

Well, not really. In the case of GameStop, the crowd turned into a mob: it became irrational and started betting outright against the hedge funds. Numerous day traders took advantage of the situation and gambled on the stock without basing their decisions on fundamental analysis, while others believed the stock was worth more than $5-$10 given the changes under way at the company, such as its partnership with Microsoft. Qualitative evidence from WallStreetBets shows that people on the forum entered high-risk positions posted by others for fear of missing out on the profits. For any platform built on collective wisdom to succeed, four things are necessary: first, opinions should be diverse; second, people should be independent and pay attention to their own information; third, the experience must be decentralised; and finally, the method of collecting opinions should be appropriate. By contrast, an Imperial College paper concluded that reward-based crowdfunding platforms such as Kickstarter and Indiegogo show patterns consistent with the wisdom of the crowd.


Individuals are now generally more aware and empowered than before when it comes to understanding what constitutes personal data and why companies want access to it. Privacy has been a hot topic, with regulators worldwide placing heavy emphasis on how data aggregation can piece together an individual's identity. Alternative data is usually more anonymised than traditional data, but privacy remains at risk. Alternative data can contain personally identifiable information (PII), which could be used to identify a living individual in the data set; this risk should be actively managed. Possessing material non-public information (MNPI) is also a risk; just because data is accessible to tech-savvy programmers does not mean it is public information.

Alternative data should be handled with care. Another challenge is identifying and separating the useful insights from the noise and non-compliant data on different platforms. Since each data set may be unique, investment teams might find it difficult to confirm data accuracy, which raises concerns about inaccurate trading signals. It also means that the machine dictionaries used to interpret text will need constant updating as new slang and emojis continuously emerge.
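The point about dictionaries is easiest to see with a toy example. The Python sketch below scores a post against a small sentiment dictionary and reports the tokens it does not recognise. The dictionary entries and weights are hypothetical, chosen only for illustration, and real sentiment systems are considerably more sophisticated.

```python
import re

# Hypothetical dictionary: terms and weights are illustrative only.
LEXICON = {"buy": 1.0, "sell": -1.0, "moon": 1.5, "bagholder": -1.5, "🚀": 2.0}

# A token is either a run of word characters or a single symbol such as an emoji.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def score(text: str, lexicon: dict[str, float]) -> tuple[float, list[str]]:
    """Return the summed sentiment weight and the sorted unrecognised tokens."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    unknown = sorted({token for token in tokens if token not in lexicon})
    return total, unknown


post = "Stonk to the moon 🚀🚀"

# With the original dictionary, the slang term "stonk" contributes nothing.
before, unknown = score(post, LEXICON)

# After the dictionary is updated with the new slang, the score changes.
updated = {**LEXICON, "stonk": 1.0}
after, _ = score(post, updated)

print(f"Score before update: {before}, unrecognised tokens: {unknown}")
print(f"Score after update:  {after}")
```

Tracking unrecognised tokens in this way is one simple means of spotting new slang or emojis that the dictionary has yet to cover.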

There is no doubt that alternative data has proved itself to be quite powerful. That power needs to be harnessed to benefit both investors and the financial system at large, while taking the risks into account.
