New Hampshire: So What Happened?
There is obviously one and only one topic on the minds of those who follow polls today. What happened in New Hampshire? Why did every poll fail to predict Hillary Clinton's victory?
Let's begin by acknowledging the obvious: there is a problem here. Even if the discrepancy between the final polls and the results turns out to be explained by a big last-minute shift to Hillary Clinton that the polls somehow missed (and that certainly sounds like a strong possibility), just about every consumer of the polling data got the impression that a Barack Obama victory was inevitable. One way or another, that's a problem.
For the best summary of the error itself, I highly recommend the graphics and summary Charles Franklin posted earlier today. Here's a highlight of how the result compared to our trend estimates:
What we see for the Democrats is quite stunning. The polls actually spread very evenly around the actual Obama vote. Whatever went wrong, it was NOT an overestimate of Obama's support. The standard trend estimate for Obama was 36.7%, the sensitive estimate was 39.0% and the last five poll average was 38.4%, all reasonably close to his actual 36.4%.
It is the Clinton vote that was massively underestimated... Clinton's trend estimate was 30.4%, with the sensitive estimate even worse at 29.9% and the 5-poll average at 31.0%, compared to her actual vote of 39.1%.
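To make the size and direction of the misses concrete, here is a minimal Python sketch tabulating the errors, using only the figures quoted above (the numbers are Franklin's; the tabulation is my own):

```python
# Final estimates vs. actual vote, from the figures quoted above.
# Tuples are (trend estimate, sensitive estimate, last-5-poll average).
estimates = {
    "Obama":   (36.7, 39.0, 38.4),
    "Clinton": (30.4, 29.9, 31.0),
}
actual = {"Obama": 36.4, "Clinton": 39.1}

for name, (trend, sensitive, last5) in estimates.items():
    errs = [round(x - actual[name], 1) for x in (trend, sensitive, last5)]
    print(f"{name}: errors (trend, sensitive, 5-poll) = {errs}")

# Obama's errors are small (all under 3 points); every Clinton
# estimate falls roughly 8 to 9 points short of her actual vote.
```

The asymmetry is the point: whatever went wrong, it was concentrated almost entirely on the Clinton side of the ledger.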
So what went wrong? We certainly have no shortage of theories. See Ambinder, Halperin, Kaus, and, for the conspiratorially minded, Friedman. The pollsters that have weighed in so far (that I've seen at least) are ABC's Gary Langer (also on video), Gallup's Frank Newport, Scott Rasmussen and John Zogby. Also, Nancy Mathiowetz, president of the American Association for Public Opinion Research (AAPOR) has blogged her thoughts on Huffington Post. [For the links to all the above, see original article - nimh]
Figuring out what happened and sorting through the possibilities is obviously a much bigger task than one blog post the morning after the election. But let me quickly review some of the more plausible or widely repeated theories and review what hard evidence we have, for the moment, regarding each.
1) A last-minute shift? - Perhaps the polls had things about "right" as of the rolling snapshot taken from Saturday to Monday, but missed a final swing to Hillary Clinton that occurred over the last 24 hours and even as voters made their final decisions in the voting booth. After all, we knew that a big chunk of the Democratic electorate remained uncertain and conflicted, with strong positive impressions of all three Democratic front-runners. The final CNN/WMUR/UNH poll showed 21% of Democrats "still trying to decide" which candidate they would support, and the exit poll showed 17% reporting that they decided on Election Day, with another 21% deciding within the last three days. Polls showed Clinton in the mid to upper 30s during the late fall and early winter, before a decline in December. Perhaps some supporters simply came home in the final hours of the campaign.
I did a quick comparison late last night of the crosstabs from the exit polls and the final CNN/WMUR/UNH survey. Clinton's gains looked greatest among women and college-educated voters. That pattern, if it also holds for other polls (a big if), seems suggestive of a late shift tied to the intense focus on Clinton's passionate and emotional remarks, especially over the last 24 hours of the campaign.
2) Too Many Independents? - One popular theory is that polls over-sampled independent voters who ultimately opted for a Republican ballot to vote for John McCain. I have not yet seen any hard turnout data on independents from the New Hampshire Secretary of State, but the exit poll does not offer promising evidence for this theory. As I blogged yesterday, final Democratic polls put the percentage of registered independents (technically "undeclared" voters) at between 26% and 44% (on the four polls that released results of a party registration question). The exit poll reported the registered independent number as 42%, with another 6% reporting they were new registrants. So if anything, polls may have had the independent share among Democrats too low.
On Republican samples, pre-election pollsters reported the registered independent numbers ranging between 21% and 34%. The exit poll put it at 34%, with 5% previously unregistered. So here too, the percentage of independents may have been too low.
Apply those percentages to the actual turnout, do a little math, and you get an estimate of how the undeclared voters split: roughly 60% took a Democratic ballot and 40% a Republican. That is precisely the split that CNN/WMUR/UNH found in their last poll.
Keep in mind that the overall turnout was 526,671 (or 53.3% of eligible adults). Eight years ago (the last time both parties had contested primaries) it was 396,385 (or 44.4% of eligible adults at the time). That helps explain why we may have seen an increase in independents in both parties.
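The "little math" above can be made explicit. Here is a rough sketch, assuming the Republican turnout is simply the overall total minus the Democratic total (i.e., ignoring any other ballots) and using the exit-poll independent shares quoted earlier:

```python
# Turnout figures from this post.
total_turnout = 526_671    # all ballots cast
dem_turnout = 287_849      # Democratic primary ballots
rep_turnout = total_turnout - dem_turnout  # assumption: no other ballots

# Exit-poll shares of registered independents ("undeclared" voters).
dem_indep = 0.42 * dem_turnout   # independents taking a Democratic ballot
rep_indep = 0.34 * rep_turnout   # independents taking a Republican ballot

dem_share = dem_indep / (dem_indep + rep_indep)
print(f"{dem_share:.0%} Democratic / {1 - dem_share:.0%} Republican")
# → 60% Democratic / 40% Republican
```

That reproduces the roughly 60/40 split described above; the new-registrant percentages (6% and 5%) would nudge the numbers only slightly.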
Of course, we are missing a lot of data here: Nothing yet on undeclared voter participation from the Secretary of State, and roughly half the pollsters never released a result for party registration.
3) Wrong Likely Voters? OK, so maybe they had the independent share right, but perhaps pollsters still sampled the wrong "likely voters" by some other measure. The turnout above means that pollsters had to select (or model) a likely electorate amounting to roughly half the adults in New Hampshire that they reached with a random-digit-dial sample.
Getting the right mix is always challenging, possibly more so because the Democratic turnout was so much higher than in previous elections. That's an argument blogged today by Allan McCutcheon of Edison Research:
In 2004, a (then) record of 219,787 voters turned out to vote--the previous record for the Democratic primary was in 1992, when 167,819 voters participated. This year, a record-shattering 287,849 voters participated in the New Hampshire Democratic primary--including nearly two thirds (66.3%) of the state's registered Democrats (up from 43.3% in 2004). Simply stated, the 2008 New Hampshire Democratic primary had a voter turnout rate that resembled a November presidential election, not a usual party primary, and the likely voter models for the polling organizations were focused on a primary--this time, that simply did not work.
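A quick check of the turnout jump McCutcheon describes, using only the figures in his quote:

```python
# Record Democratic primary turnouts cited in the quote above.
turnout = {1992: 167_819, 2004: 219_787, 2008: 287_849}

growth = turnout[2008] / turnout[2004] - 1
print(f"2008 vs. 2004: {growth:.0%} more voters")  # → 31% more voters

# Participation among registered Democrats jumped from 43.3% (2004)
# to 66.3% (2008) -- a 23-point rise, closer to general-election
# turnout than to a typical primary, which is McCutcheon's point
# about likely-voter models.
```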
One way to assess whether polls sampled the wrong kinds of voters would be to look carefully at their demographics (gender, age, education, region) and see how they compared to the exit poll and vote return data. Unfortunately, as is so often the case, only a handful of New Hampshire pollsters reported demographic composition.
4) The Bradley/Wilder effect? The term, as Wikipedia tells us, derives from the 1982 gubernatorial campaign of Tom Bradley, then the longtime African-American mayor of Los Angeles. Bradley led in pre-election polls but lost narrowly. A similar effect, in which polls understated support for the opponents of African-American candidates, seemed to hold in various instances during the 1980s. Consider this summary of polls compiled by the Pew Research Center for a 1998 report, which they updated in February 2007:
Note that, in almost every instance, the polls were generally about right in their estimate for the African-American candidate but tended to underestimate the percentage won by the white opponent. The theory is that some respondents are reluctant to share an opinion that might create "social discomfort" between the respondent and the interviewer, such as telling a stranger on the telephone that you intend to oppose an African-American candidate.
Of course, the Pew Center also looked at six races for Senate and Governor in 2006 that featured an African-American candidate and did not see a similar effect. Also keep in mind that all of the reports mentioned above that show the effect were from general election contests, not primaries.
What other evidence might suggest the Bradley/Wilder effect operating in New Hampshire in 2008? We might want to consider whether the race of interviewer or the use of an automated (interviewer-free) methodology would have an effect, although these kinds of analyses are difficult, because other variables can confound the analysis. For what it's worth, the final Rasmussen automated survey had Obama leading by seven points (37% to 30%), roughly the same margin as the other pollsters. We might also look at whether pushing undecided voters harder helped Clinton more than other candidates.
Update: My colleagues at AAPOR have made three relevant articles from Public Opinion Quarterly available to non-subscribers on the AAPOR web site.
5) Non-response bias? We would be crazy to rule it out, since even the best surveys are getting response rates in the low twenty percent range. If Clinton supporters were less willing to be interviewed last weekend than Obama supporters, it might contribute to the error. Unfortunately, it is next to impossible to investigate, since we have little or no data on the non-respondents. However, if pollsters were willing to be completely transparent, we might compare the results among those with relatively high response rates to those with lower rates. We might also check to see if response rates declined significantly over the final weekend.
6) Ballot Placement? Gary Langer's review points to a theory offered by Stanford University professor Jon Krosnick that Clinton's placement near the top of the New Hampshire ballot boosted her vote total. Krosnick believes that ballot order netted Clinton "at least 3 percent more votes than Obama."
7) Weekend Interviewing? I blogged my concerns on Sunday. Hard data on whether this might be a factor are difficult to come by, but it is certainly an issue worth pursuing.
8) Fraud? As Marc Ambinder puts it, some are ready to believe "[t]here was a conspiracy, somehow, because pre-election polls are just so much more valid than actual vote counts." Put me down as dubious, but Brad Friedman's Brad Blog has the relevant Diebold connections for those who are interested.
Again, no one should interpret any of the above as the last word on what happened in New Hampshire. Most of these theories deserve more scrutiny, and I agree with Gary Langer that "it is incumbent on us - and particularly on the producers of the New Hampshire pre-election polls - to look at the data, and to look closely, and to do it without prejudging." This is just a quick review, offering what information is most easily accessible. I am certain I will have more to say about this in coming days.
-- Mark Blumenthal