Polls are dead, long live markets


The polling fiasco in the 2015 UK general election is just the latest in a string of high-profile failures over the last few months. This contrasts with the good performance of prediction markets, and Hypermind in particular.

Let’s start with the referendum on Scottish independence in September 2014. In the final weeks before the vote, the polls consistently announced a cliffhanger, with Yes and No tied within the margin of error. Yet the actual result gave “No” a clear majority of 55%, 10 points ahead of “Yes” (45%).

The betting markets, on the other hand, clearly favored the “No” vote throughout. Witness, for instance, how the “Yes” vote on Hypermind always stayed below the 50% likelihood threshold, and was given a low probability just before the referendum took place on September 18th.

[Chart: Hypermind’s probability of a “Yes” vote in the Scottish independence referendum]

Then came the midterm congressional elections in the U.S. in November 2014. The big question was whether the Republicans would recapture control of the Senate, which they did. The polls mostly saw this coming, but they were much more timid in their forecasts than the betting markets.

In fact, as discussed earlier in this blog, Hypermind out-predicted all the poll-aggregation models operated by the biggest U.S. media, as well as Nate Silver’s FiveThirtyEight. (Only the Washington Post model ended up out-predicting Hypermind at the very end, but its predictions were all over the place beforehand, as can be seen in the chart below.)

[Chart: 2014 Senate-control forecasts, Hypermind vs. the statistical models]

The Israeli elections in March 2015 again stumped the polls and the pundits. The closer election day came, the more Benjamin Netanyahu was given up for dead, politically. The latest polls even predicted that his Likud party would finish 4 seats behind its leftist rival, and pondered how difficult it would be for him to assemble a 61-seat majority coalition in the Knesset. Instead, Likud scored 6 more seats than its closest rival, and Bibi was able to remain prime minister for a 4th term.

What about the betting markets? On the day before the election, while noting that it was a rare instance of an “actual tossup”, the New York Times also noted that Hypermind was giving Netanyahu a 55% chance of staying prime minister. In fact, Hypermind had clearly kept Netanyahu in the favorite’s seat all through the campaign.

[Chart: Hypermind’s probability of Netanyahu remaining prime minister]

Which brings us now to the 2015 UK general election. It concluded yesterday with a big win for David Cameron’s Conservative Party, a hair’s breadth away from an absolute majority in parliament. This contrasted with all the polling data, which had Labour tied with the Conservatives, both very far from a majority. Based on the poll projections of a hung parliament, the pundits could not see how Cameron could gather a governing coalition, even after adding up UKIP and the LibDems. Everyone gave Labour’s Miliband a much better chance of forming a government, with tacit support from the Scottish National Party. In fact, the polls gave a Labour+SNP combination a clear majority in the House…

The story was different in the betting markets. At worst, Cameron’s chances of forming the next government remained close to 50%, tied with Miliband’s, a far cry from the large Labour advantage everyone assumed from the parliamentary arithmetic based on poll projections. On Hypermind, a Cameron rebound even occurred just before election day.

[Chart: Hypermind’s forecasts for the next UK prime minister]

It will take some time to understand why election polls, which had served the media so well for so long, suddenly seem to be experiencing a global meltdown. Perhaps the simple, powerful idea of the “representative panel” just no longer works well when individualism is pushed to the extreme in modern societies…

What is encouraging, though, is that betting markets – an approach that predates polls by decades – are proving more reliable, especially when the going gets tough. This is probably related to the idea, explored earlier in this blog, that predicting human affairs is in general better left to human brains than to algorithms and statistics.

Hypermind out-predicts big-data models in the 2014 U.S. midterm elections

With Intrade gone and the rise of sophisticated statistical models à la FiveThirtyEight operated by various U.S. media, we haven’t heard much about prediction markets during the 2014 U.S. midterm election cycle. It was as if the allure of big data and statistical rock stars like Nate Silver had eclipsed the robust and well-documented success of collective human intelligence. Are prediction markets doomed to be roadkill on the big-data superhighway?

Not so fast.

In head-to-head comparisons, Hypermind offers evidence that the aggregated brainpower of a prediction market can still out-predict the much-hyped statistical machines.

Hypermind listed several stocks on the 2014 U.S. midterm elections, focusing on control of the Senate and the 5 most undecided individual races, in Kansas, Iowa, North Carolina, Colorado, and Georgia. This allows comparisons between Hypermind’s predictions and those of the 7 major statistical models: FiveThirtyEight (Nate Silver), the Washington Post, the New York Times, the Huffington Post, the Princeton Election Consortium, PredictWise, and Daily Kos.

In the analysis below, we compare the predictions of each model against Hypermind’s, against each other’s, and against the average prediction of the 7 models. Importantly, we are not just comparing predictions made on election day, but throughout the weeks or months – depending on the question – during which the market and all the models were simultaneously spewing predictions.*

Accuracy is measured using Brier scores, which compute the squared error between the predictions and the true outcomes. The smaller the Brier score, the better the prediction: a perfect prediction has a Brier score of 0, a chance prediction – think 50/50 – has a Brier score of 0.5, and a totally wrong prediction scores 2.
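
For concreteness, here is a minimal Python sketch of that two-category Brier score; the function name and example values are ours, for illustration only.

```python
def brier_score(prob_yes: float, happened: bool) -> float:
    """Two-category Brier score: sum of squared errors over both outcomes.

    0 = perfect, 0.5 = a 50/50 chance prediction, 2 = maximally wrong.
    """
    forecast = (prob_yes, 1.0 - prob_yes)
    truth = (1.0, 0.0) if happened else (0.0, 1.0)
    return sum((f - t) ** 2 for f, t in zip(forecast, truth))

# Sanity checks against the values quoted above:
assert brier_score(1.0, True) == 0.0  # perfect prediction
assert brier_score(0.5, True) == 0.5  # chance (50/50) prediction
assert brier_score(0.0, True) == 2.0  # totally wrong prediction
```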

To get a sense of how the methods compared overall, we computed, for each question, the Brier score of each method on every day of the comparison period. We then averaged those daily Brier scores into a mean daily Brier score per method and per question, and finally averaged those across the 6 questions to get an overall mean daily Brier score for each method.**
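
In code, this two-step averaging might look like the sketch below, where each method’s record for a question is assumed to be a list of daily probabilities assigned to the outcome that eventually occurred (the data layout is our assumption, not Hypermind’s).

```python
def mean_daily_brier(daily_probs_on_winner: list[float]) -> float:
    """Mean of the daily two-category Brier scores for one question.

    When p is the probability assigned to the outcome that actually
    occurred, the two-category Brier score reduces to 2 * (1 - p)**2.
    """
    scores = [2.0 * (1.0 - p) ** 2 for p in daily_probs_on_winner]
    return sum(scores) / len(scores)


def overall_mean_daily_brier(questions: dict[str, list[float]]) -> float:
    """Average the per-question mean daily Brier scores across questions."""
    per_question = [mean_daily_brier(probs) for probs in questions.values()]
    return sum(per_question) / len(per_question)
```

With the 6 questions above, `questions` would hold one entry each for Senate control and the 5 individual races, covering their respective comparison periods.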

The chart below plots the results. By this measure, all models except Princeton’s did slightly better than chance, but Hypermind out-predicted all of them, including the average prediction of all the models (“Models Mean”).

[Chart: overall mean daily Brier scores, Hypermind vs. the 7 statistical models]

We then took a closer look at these elections’ most important question: would the Republicans win control of the Senate? In this case, Hypermind again out-performed all the models, as can be seen in the chart below. Except for the Washington Post’s, all the models remained, throughout the comparison period, much less confident than Hypermind in the Republicans’ ultimate control of the Senate.

[Chart: Senate-control forecasts over time, Hypermind vs. the statistical models]

The Washington Post model, although more unstable than any other – notice the large dip to around 50% from late August to mid-September – did particularly well at the end of the campaign, so Hypermind’s advantage isn’t as visually obvious as it is against the other models. However, if we compute the average daily Brier scores over the entire period during which the Washington Post and Hypermind operated in parallel – from early July to election day – we find a 36% accuracy advantage for Hypermind (.096) over the Washington Post (.150): (.150 − .096) / .150 ≈ 36%.

There is an important lesson to be learned here: even in this age of big data and supercomputers, human collective intelligence is still our best means of predicting the future. Isn’t that reassuring?

Notes
(*) The periods of comparison for each question were as follows: Senate Control [Sept. 3 to Nov. 4]; IA, KS, CO, NC [Oct. 9 to Nov. 4]; GA [Oct. 20 to Nov. 4].
(**) Computing mean daily Brier scores over entire forecasting periods, as we do here, is also how the geopolitical predictions of the IARPA-sponsored Good Judgment Project are scored by the U.S. Government.
Data Sources
Hypermind’s daily closing prices for each contract are available for download in Excel format.
Models’ data were recorded by the New York Times here and here. Available for download courtesy of The Upshot’s Josh Katz.
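
For readers who want to replay these numbers, here is a minimal sketch of loading such an Excel export with pandas; the filename below is hypothetical and should be replaced with the actual download.

```python
import pandas as pd

# Hypothetical filename; substitute the actual Hypermind Excel export.
prices = pd.read_excel("hypermind_midterms2014_daily_closes.xlsx")
print(prices.head())  # assumed layout: one row per day, one column per contract
```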