Bayesian Combination of State Polls and Election Forecasts
September 21st, 2008, by Andrew
National elections are predictable from fundamentals (see, for example, the research of Steven Rosenstone, James Campbell, Robert Erikson, and Chris Wlezien, along with many others), but this doesn’t stop political scientists, let alone journalists, from obsessively tracking swings in the polls. The next level of sophistication–afforded us by the combination of ubiquitous telephone polling and internet dissemination of results–is to track the trends in state polls, a practice that was led in 2004 by the Republican-leaning realclearpolitics.com and is led in 2008 by fivethirtyeight.com, a website maintained by Democrat (and professional baseball statistician) Nate Silver.
Presidential elections are decided in swing states, so it makes sense to look at state-by-state polls. On the other hand, the relative positions of the states are highly predictable from previous elections. So what is to be done? Is there a point of balance between the frenzy of daily or weekly polling on one hand and the supine acceptance of forecasts on the other? The answer is yes: a Bayesian analysis can do partial pooling between these extremes. We use historical election results by state and campaign-season polls from 2000 and 2004 to estimate the appropriate weighting to use when combining surveys and forecasts.
For more on what we did, see the research article by Kari Lock and myself.
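To make the partial pooling idea concrete, here is a minimal sketch of a precision-weighted combination of a forecast and a poll under a simple normal model. This is an illustration, not the model from the article: the function name `pool`, the normal assumption, and all the numbers are assumptions made for this example. In particular, `poll_sd` is meant to stand for the total uncertainty in a poll as a predictor of the election outcome, which is larger than its sampling error alone.

```python
def pool(forecast_mean, forecast_sd, poll_mean, poll_sd):
    """Combine a forecast and a poll, weighting each by its precision (1 / variance)."""
    w_forecast = 1.0 / forecast_sd ** 2
    w_poll = 1.0 / poll_sd ** 2
    pooled_var = 1.0 / (w_forecast + w_poll)
    pooled_mean = pooled_var * (w_forecast * forecast_mean + w_poll * poll_mean)
    return pooled_mean, pooled_var ** 0.5

# Invented numbers, for illustration only: a forecast of 52% with sd 3 points,
# and an early poll at 46% whose effective sd (sampling error plus the chance
# that opinion shifts before election day) is taken to be 6 points.
pooled_mean, pooled_sd = pool(0.52, 0.03, 0.46, 0.06)
print(round(pooled_mean, 3), round(pooled_sd, 4))
```

With these made-up inputs the forecast is twice as precise in sd terms, so it gets four times the weight of the poll, and the pooled estimate sits much closer to the forecast than to the poll. The weights here are fixed by hand; in the analysis described above they are estimated from historical data.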
And here’s an illustration of the method, based on the February SurveyUSA polls of the Clinton vs. McCain and Obama vs. McCain matchups in each of the 50 states:

The short answer is that the polls in individual states–even those large SurveyUSA polls–have a lot less information than you might think. In some cases the polls are probably telling you something real–for example, in Arkansas, Clinton would do better against McCain than Obama would. But those maps people were making showing which states Clinton or Obama would win–those were drastic overinterpretations of transient poll data.
You can’t just take the state polls, slap on standard errors, and think you’re capturing the uncertainty about the election outcome.
The key idea is to separate the forecasting information at the national level from the information about the relative positions of the states. These are really two different things. The forecasts (and, to a lesser extent, the polls) tell you about Obama and McCain’s strength nationally, and about each candidate’s strength in Ohio (say) relative to his national strength. It’s not statistically efficient to look at Ohio, or any other state, in isolation.
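The separation described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the state labels, vote shares, and function names are all hypothetical, and each state's relative position is simply carried over unchanged from one previous election.

```python
# All numbers below are invented for illustration; they are not real results.
previous_state_share = {"state_A": 0.49, "state_B": 0.55, "state_C": 0.44}
previous_national_share = 0.50

def relative_positions(state_share, national_share):
    """How far each state sat from the national vote share last time."""
    return {s: share - national_share for s, share in state_share.items()}

def state_forecasts(national_forecast, offsets):
    """Shift every state by the same national forecast, keeping its offset."""
    return {s: national_forecast + off for s, off in offsets.items()}

offsets = relative_positions(previous_state_share, previous_national_share)
forecasts = state_forecasts(0.53, offsets)  # 0.53 is a hypothetical national forecast

for state, share in forecasts.items():
    print(state, round(share, 3))
```

Because every state forecast shares the same national term, an error in the national forecast moves all the states together, which is one reason a single state's poll should not be read in isolation.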
