How analyzing Wikipedia page views could help you make money
from GigaOM
Plenty of companies have been looking at software that analyzes large private data sets and combines them with external streams such as tweets to make predictions that could boost revenue or cut expenses. Walmart, for instance, has come up with a way for company buyers to cross-reference sales data with tweets about products and categories and thereby determine which products to stock. Here’s another possible data source to consider checking: Wikipedia.
No, this doesn’t mean a company that wants to predict the future should take a guess based on what a person or company’s Wikipedia page says. However, researchers have found value in page views on certain English-language Wikipedia pages. The results were published Wednesday in the online journal Scientific Reports.
The researchers looked at page views and edits for Wikipedia entries on public companies that are part of the Dow Jones Industrial Average (DJIA), such as Cisco, Intel, and Pfizer, as well as entries on economic topics such as capitalism and debt. Changes in the average number of page views and edits per week informed decisions on whether to buy or sell the DJIA. In other words, a major increase in page views could have prompted a sale, followed by a buy to close out the deal, or vice versa (a decrease in page views, say, would prompt a buy, followed by a sale).
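The trading rule described above can be sketched in a few lines of Python. This is a rough illustration, not the researchers' code: the function name, the three-week averaging window, and the choice to open a position one week after the signal and close it a week later are all assumptions made here for the sake of the example.

```python
from statistics import mean


def page_view_trades(views, prices, window=3):
    """Return a list of (week, action, trade_return) tuples.

    views  -- weekly page view counts (hypothetical input data)
    prices -- weekly closing prices of the index, same length as views
    window -- number of earlier weeks averaged as the baseline (assumed)
    """
    if len(views) != len(prices):
        raise ValueError("views and prices must cover the same weeks")

    trades = []
    for t in range(window, len(prices) - 2):
        # Compare this week's views with the average of the previous weeks.
        baseline = mean(views[t - window:t])
        change = views[t] - baseline

        # Assumed timing: open the position next week, close it the week after.
        entry, exit_ = prices[t + 1], prices[t + 2]

        if change > 0:
            # More views than usual: sell first, buy back later.
            trades.append((t, "sell", (entry - exit_) / entry))
        elif change < 0:
            # Fewer views than usual: buy first, sell later.
            trades.append((t, "buy", (exit_ - entry) / entry))
        # No change in views: no trade in this sketch.
    return trades
```

Feeding the function a list of weekly view counts and a matching list of weekly index prices yields one trade per week in which views moved away from their recent average. Transaction costs are ignored.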
The researchers compared this investment strategy with a random investing strategy. What they found is that returns based on views of the DJIA company Wikipedia pages “are significantly higher than the returns of the random strategies,” to the tune of a 141 percent return, according to a news release.
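For a sense of what a comparison against random strategies might look like, here is a small Python sketch. It is an assumption-laden illustration rather than the study's method: the function names, the coin-flip choice between buying and selling each week, the compounding of weekly returns, and the number of trials are all hypothetical.

```python
import random


def random_strategy_totals(prices, trials=1000, seed=0):
    """Cumulative return of each of `trials` coin-flip strategies.

    Each week the strategy buys or sells with equal probability and
    closes the position one week later (an assumed setup).
    """
    rng = random.Random(seed)  # seeded so the runs are repeatable
    totals = []
    for _ in range(trials):
        growth = 1.0
        for t in range(len(prices) - 1):
            change = (prices[t + 1] - prices[t]) / prices[t]
            action = rng.choice(("buy", "sell"))
            # A buy gains when the price rises; a sell gains when it falls.
            growth *= 1 + (change if action == "buy" else -change)
        totals.append(growth - 1.0)
    return totals


def share_beaten(strategy_total, random_totals):
    """Fraction of random strategies with a lower cumulative return."""
    beaten = sum(r < strategy_total for r in random_totals)
    return beaten / len(random_totals)
```

A strategy that beats nearly all of the random runs on the same price series is at least unlikely to owe its return to luck alone, though a sketch like this says nothing about whether the pattern would hold in the future.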
Returns on strategies based on view and edit data for Wikipedia entries on companies in the Dow Jones Industrial Average, courtesy of Scientific Reports
Returns on strategies based on view and edit data for Wikipedia entries on economic topics, via Scientific Reports
Incidentally, some of the researchers behind this project have also investigated connections between the Dow Jones and the use of certain financial search terms on Google. Other researchers have previously found connections between Google search patterns on stocks and stock price changes over time.
While predictive analytics has become a hot area, with applications ranging from social media conversations to crime and from the flu to retweets, data scientists often acknowledge that people need to be sure the data they want to use for analysis is solid and reliable. Edit data from Wikipedia isn’t inherently reliable, since anyone can edit a page, and in this study it turned out not to be statistically significant. Page views could perhaps be manipulated by a computer pinging Wikipedia again and again, which could throw off an algorithm pulling page view data in real time.
And tweets can be all over the place: there’s no style guide or fact checking for Twitter. So getting a good read on sentiment based on tweets from, say, Stocktwits can be hit or miss. And Google’s Flu Trends feature, heralded as an early use of crowdsourced data, reportedly overestimated flu outbreaks late last year.
Clearly, there are caveats to these data sets. Still, it’s neat to see new models emerging for the use of public data, and some people who want to make money off Wikipedia metadata might want to experiment with it. Just don’t blame us if the experiments backfire.