There’s an old joke about a someone who has lost his car keys and keeps looking for them under a street light, but with no success. After a while, a policeman finally asks why he doesn’t extend his search elsewhere. “Because that’s where the light is,” answers the man.
The current obsession with Big Data is somewhat reminiscent of this so-called “street light effect” – the tendency to look for answers where they are easiest to look for, not most likely to be found.
In fact, whether or not a big-data search party is likely to discover something useful really depends on the kinds of data that are at hand. Computers are really good at processing data that are well structured: digital, clean, explicit and unambiguous. But when the data are unstructured – analog, noisy, implicit or ambiguous – human brains are better at making sense of them.
Whereas a single human brain, or a modest personal computer, may deal with small data sets of the preferred kind, the “bigger” the data is, the more computing power has to be brought to bear. In the case of structured data, bigger computers will come in handy, but in the case of unstructured data – the kind computers can’t properly deal with – there’s also a hard limit on how much computing power a single human brain can deliver. So the best way to make sense of big unstructured data sets is to tap into the collective intelligence of a multitude of brains.
The best kind of computing power to bring to bear on big data depends on the kind of data that has to be processed. Collective intelligence delivers the best performance when dealing with big unstructured data sets.
When the goal is to peer into the future, statistical big-data approaches are especially brittle, because the data at hand are necessarily rooted in the past. That’s ok when what you are trying to forecast is extremely similar to what has already happened – like a mature product line in a stable market – but it breaks down disgracefully when you are dealing with brand new products or disrupted markets.
Here are just a few examples of situations we have encountered where collective forecasting proves superior to data-driven projections:
Disrupted market: When in the mid-2000 the world-wide demand for dairy products suddenly increased three-fold in the space of a few months, after a decade of stability, dairy product producers could not rely any more on their data-driven forecasting models. Instead, they tapped into the collective forecasting insights of their people on the ground, closest to the customers, to better understand and model the new demand drivers.
New products: A few years ago Lumenogic collaborated with a team of marketing researchers to run a prediction market within a Fortune 100 consumer packaged firm, focusing on new products. When compared to the forecasts issued from the classic data-driven methods, the researchers found that the collective forecasts provide superior results in 67% of the cases, reduce average error by approximately 15 percentage points, and reduce the error range by over 40%.
Political elections: In the past 20 years, prediction markets have become famous for their ability to outperform polls as a means to forecast electoral outcomes. So much so that a skewer of distinguished economists eventually petitioned the U.S. government to legalize political betting for the benefit of society – which it did recently, to some extent, as evidenced by the recent launch of PredictIt. The big-data camp fought back in the form of poll aggregators, as popularized by statistical wizard Nate Silver, and further enriched by other non-poll data sets such as campaign contributions, ad spend, etc. To no avail. In last november’s U.S. Midterm elections, the collective intelligence of Hypermind’s few hundred (elite) traders outperformed all the big data-driven statistical prediction models put forth by major media organizations. That’s because the wisdom of crowds is able to aggregate a lot of information – unstructured data – about what makes each election unique, whereas this data lies out of the reach of statistical algorithm, however sophisticated.
Despite the current and growing flood digital data – the kind computers and algorithms can deal with – we should not lose sight that the world offers magnitudes more unstructured data – the kind only human brains can collectively make sense of. So if you ever find yourself searching fruitlessly under that big-data street light, remember that collective intelligence may provide just the night goggles you need to extend your search.