Poststratification of 2016 American National Election Study data

The following data file was created in an earlier script / notebook.

load("anes-2016-prevote.RData")

There must not be any missing values in the stratifying variables.

anes_2016_vprevote <- subset(anes_2016_prevote,
                                    vote16 != "Inap" &
                                    recall12 != "Inap"
                                    )

In order to make poststratification possible, we need to make sure that the levels of the stratification variables match the population information. Therefore we relabel the variables “recall12” and “vote16”.

The following makes use of the memisc package. You may need to install it from CRAN using the code install.packages("memisc") if you want to run this on your computer. (The package is already installed on the notebook container, however.)

library(memisc)
Loading required package: lattice
Loading required package: MASS

Attaching package: 'memisc'

The following objects are masked from 'package:stats':

    contr.sum, contr.treatment, contrasts

The following object is masked from 'package:base':

    as.array
anes_2016_vprevote <- within(anes_2016_vprevote,{
    recall12 <- recall12[,drop=TRUE]
    vote16 <- vote16[,drop=TRUE]
    recall12 <- relabel(recall12,"Did not vote"="No vote")
    vote16 <- relabel(vote16,
                      "Will not vote/Not registered"="No vote")
})
save(anes_2016_vprevote,file="anes-2016-vprevote.RData")

Finally, we set up a survey design object. The following makes use of the survey package. You may need to install it from CRAN using the code install.packages("survey") if you want to run this on your computer. (The package is already installed on the notebook container, however.)

library(survey)
Loading required package: grid
Loading required package: Matrix
Loading required package: survival

Attaching package: 'survey'

The following object is masked from 'package:graphics':

    dotchart
anes_2016_vprevote_desgn <- svydesign(id = ~psu_f2f,
                                       strata = ~strat_f2f,
                                       weights = ~pre_w_f2f,
                                       data = anes_2016_vprevote,
                                       nest = TRUE)
save(anes_2016_vprevote_desgn,file="anes-2016-vprevote-design.RData")

We collect the electoral results of 2012 to prepare poststratification.

result.2012 = c(Obama  = 65915795,
                Romney = 60933504,
                # Other candidates are combined
                Other = sum(c(
                    Johson = 1275971,
                    Stein  =  469627,
                    Others =  490510
                )))

The number of non-voters is computed from the sum of the results and census data on the population in voting age.

result.2012 <- c(result.2012,
                 "No vote" = 235248000 - sum(result.2012)) 
# Here we collect the results for 2016
result.2016 <- c(Clinton = 65853514,
                 Trump   = 62984828,
                 Other   = sum(c(
                    Johnson  = 4489341, 
                    Stein    = 1457218,
                    McMullin =  731991,
                    Others   = 1154084 
                 )))
result.2016 <- c(result.2016,
                 "No vote" = 250056000 - sum(result.2016))                 

The poststratification function expects population data to be in the form of data frames:

pop.vote16 <- data.frame(
    vote16=names(result.2016),
    Freq=result.2016)
pop.recall12 <- data.frame(
    recall12=names(result.2012),
    Freq=result.2012/sum(result.2012)*sum(result.2016)
)
save(pop.recall12,pop.vote16,file="popl-results.RData")

We poststratify the sample design object by recalled vote in 2012

anes_2016_prevote_desgn_post <- postStratify(
    anes_2016_vprevote_desgn,~recall12,population=pop.recall12)

We compare the estimated percentages of 2012 votes:

100*svymean(~recall12,design=anes_2016_vprevote_desgn)
                   mean     SE
recall12Obama   39.8844 0.0233
recall12Romney  29.4551 0.0198
recall12Other    1.1429 0.0035
recall12No vote 29.5176 0.0222
100*svymean(~recall12,design=anes_2016_prevote_desgn_post)
                    mean SE
recall12Obama   28.01970  0
recall12Romney  25.90182  0
recall12Other    0.95053  0
recall12No vote 45.12795  0

As should be expected, post-stratification eliminates the uncertainty about 2012 votes. It also corrects for turnout overreporting.

We now compare the estimated percentages of 2016 votes

100*svymean(~vote16,design=anes_2016_vprevote_desgn)
                 mean     SE
vote16Clinton 38.3334 0.0291
vote16Trump   32.8720 0.0222
vote16Other    7.5954 0.0104
vote16No vote 21.1992 0.0200
100*svymean(~vote16,design=anes_2016_prevote_desgn_post)
                 mean     SE
vote16Clinton 32.6932 0.0222
vote16Trump   32.8370 0.0152
vote16Other    6.9345 0.0101
vote16No vote 27.5352 0.0228

The percentages of Clinton voters and Trump voters are closer after poststratification.

We save the poststratified data for later use.

save(anes_2016_prevote_desgn_post,file="anes-2016-prevote-desgn-post.RData")