=======================================================================================
An Extended Workflow Example - Analysing the American National Election Study of 1948
=======================================================================================

Introduction
============

This vignette gives an example of the analysis of a typical social science data set: the data file of the *American National Election Study* of 1948 [1]_, available from the American National Election Studies website. The data file contains data from two USA-wide surveys conducted in October and November 1948 by the Survey Research Centre, University of Michigan (principal investigators: Angus Campbell and Robert L. Kahn). The total number of cases in the data set is 662 and the number of variables is 65 (more details about this data set can be found at http://www.electionstudies.org/studypages/1948prepost/1948prepost.htm).

With 662 cases and 65 variables, the 1948 ANES data set is relatively small compared to current social science data sets. Such larger data sets can be processed along the same lines as in this vignette. However, their size and, in some cases, legal restrictions prohibit including such data sets in the package.

This vignette starts with a demonstration of how a data file can be examined before loading it and how a subset of the data can be loaded into memory. After this subset is loaded, some descriptive analyses are conducted that showcase the construction of contingency tables and of general tables of descriptive statistics using the ``genTable`` function. In addition, a logit analysis is demonstrated, together with the collection of several sets of logit coefficients into a comprehensive table by the ``mtable`` function.

It should be noted that the analyses reported in the following are conducted only for the purpose of demonstrating the features of the package and are not to be considered conclusive scientific evidence of any kind.

This vignette is run with the help of the *knitr* package. This allows us to showcase not only the data management facilities provided by *memisc*: the following code also demonstrates how output created with some of the facilities of *memisc* can be neatly integrated into reports generated with *knitr*.

Before we start, we adjust *knitr*'s output (with which this vignette is formatted) to produce HTML where possible.

.. code-block:: r

   knit_print.codebook <-function(x,...)
     knitr::asis_output(format_html(x,...))

   knit_print.descriptions <-function(x,...)
     knitr::asis_output(format_html(x,...))

   knit_print.ftable <-function(x,options,...)
     knitr::asis_output(
       format_html(x,
                   digits=if(length(options$ftable.digits))
                             options$ftable.digits
                          else 0,
                   ...))
   # We can now adjust the number of digits after the comma
   # for each column e.g. by adding an `ftable.digits` option
   # to an R chunk, as in ```{r,ftable.digits=c(2,2,0)}

   knit_print.mtable <-function(x,...)
     knitr::asis_output(format_html(x,...))

Reading In a "Portable" SPSS Data File
======================================

We start by importing the data into R. The following code extracts the SPSS portable file ``NES1948.POR`` from the zip file ``NES1948.ZIP`` delivered with the *memisc* package.

.. code-block:: r

   library(memisc)
   options(digits=3)
   nes1948.por <- unzip(system.file("anes/NES1948.ZIP",package="memisc"),
                        "NES1948.POR",exdir=tempfile())

Now the portable file is in a temporary directory and the path to the file is contained in the string variable ``nes1948.por``. In the next step, the file is declared as an SPSS/PSPP portable file using the function ``spss.portable.file``, which takes the path to the file as its first argument.
``spss.portable.file`` reads in the information about the variables contained in the data set and counts the number of cases in the file. That is, standard I/O operations are used on the file, but the data read in are simply discarded without allocating core memory for them. This counting of cases can, of course, be suppressed if it would take too long.

.. code-block:: r

   nes1948 <- spss.portable.file(nes1948.por)
   print(nes1948)

::

   SPSS portable file '/tmp/RtmpZgFPXu/file3e972ed49d6/NES1948.POR'
       with 67 variables and 662 observations

At this stage, the data are not yet loaded into memory. But we can see which variables exist in the data set:

.. code-block:: r

   names(nes1948)

::

    [1] "vversion" "vdsetno"  "v480001"  "v480002"  "v480003"  "v480004"
    [7] "v480005"  "v480006"  "v480007"  "v480008"  "v480009"  "v480010"
   [13] "v480011"  "v480012"  "v480013"  "v480014a" "v480014b" "v480015a"
   [19] "v480015b" "v480016a" "v480016b" "v480017a" "v480017b" "v480018"
   [25] "v480019"  "v480020"  "v480021a" "v480021b" "v480022a" "v480022b"
   [31] "v480023"  "v480024"  "v480025a" "v480025b" "v480026"  "v480027"
   [37] "v480028"  "v480029"  "v480030"  "v480031a" "v480031b" "v480031c"
   [43] "v480032a" "v480032b" "v480032c" "v480033a" "v480033b" "v480034a"
   [49] "v480034b" "v480035a" "v480035b" "v480036a" "v480036b" "v480037"
   [55] "v480038"  "v480039"  "v480040"  "v480041"  "v480042"  "v480043"
   [61] "v480044"  "v480045"  "v480046"  "v480047"  "v480048"  "v480049"
   [67] "v480050"

Note that the variable names are all changed from uppercase to lowercase (SPSS does not distinguish between uppercase and lowercase variable names, and uppercase looks like shouting). Case folding could have been suppressed by the call ``spss.portable.file(nes1948.por,tolower=FALSE)``.

We can also ask for a description ("variable label") for each variable:

.. code-block:: r

   description(nes1948)

.. raw:: html

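   <table>
   <tr><th>Variable</th><th>Description</th></tr>
   <tr><td>vversion</td><td>NES VERSION NUMBER</td></tr>
   <tr><td>vdsetno</td><td>NES DATASET NUMBER</td></tr>
   <tr><td>v480001</td><td>ICPSR ARCHIVE NUMBER</td></tr>
   <tr><td>v480002</td><td>INTERVIEW NUMBER</td></tr>
   <tr><td>v480003</td><td>POP CLASSIFICATION</td></tr>
   <tr><td>v480004</td><td>CODER</td></tr>
   <tr><td>v480005</td><td>NUMBER OF CALLS TO R</td></tr>
   <tr><td>v480006</td><td>R REMEMBER PREVIOUS INT</td></tr>
   <tr><td>v480007</td><td>INTR INTERVIEW THIS R</td></tr>
   <tr><td>v480008</td><td>PRVS PRE-ELCTN R REINT</td></tr>
   <tr><td>v480009</td><td>R INT IN PRE/POSTELCTN</td></tr>
   <tr><td>v480010</td><td>RENT CNTRL KEPT/DROPPED</td></tr>
   <tr><td>v480011</td><td>GOVT CONTROL PRICES</td></tr>
   <tr><td>v480012</td><td>WHAT TO DO W TFT-HT ACT</td></tr>
   <tr><td>v480013</td><td>PRESLELCTN OTCM SURPRISE</td></tr>
   <tr><td>v480014a</td><td>WHY PPL VTD FOR TRUMAN 1</td></tr>
   <tr><td>v480014b</td><td>WHY PPL VTD FOR TRUMAN 2</td></tr>
   <tr><td>v480015a</td><td>WHY PPL VTD AGNST TRUMAN 1</td></tr>
   <tr><td>v480015b</td><td>WHY PPL VTD AGNST TRUMAN 2</td></tr>
   <tr><td>v480016a</td><td>WHY PPL VTD FOR DEWEY 1</td></tr>
   <tr><td>v480016b</td><td>WHY PPL VTD FOR DEWEY 2</td></tr>
   <tr><td>v480017a</td><td>WHY PPL VTD AGNST DEWEY 1</td></tr>
   <tr><td>v480017b</td><td>WHY PPL VTD AGNST DEWEY 2</td></tr>
   <tr><td>v480018</td><td>DID R VOTE/FOR WHOM</td></tr>
   <tr><td>v480019</td><td>WN DECIDE FOR WHOM TO VT</td></tr>
   <tr><td>v480020</td><td>CNSD VT FOR SOMEONE ELSE</td></tr>
   <tr><td>v480021a</td><td>XWHY DID NOT VT FOR HIM 1</td></tr>
   <tr><td>v480021b</td><td>XWHY DID NOT VT FOR HIM 2</td></tr>
   <tr><td>v480022a</td><td>WHY VT THE WAY YOU DID 1</td></tr>
   <tr><td>v480022b</td><td>WHY VT THE WAY YOU DID 2</td></tr>
   <tr><td>v480023</td><td>VOTED STRAIGHT TICKET</td></tr>
   <tr><td>v480024</td><td>R NOT VT-IF VT,FOR WHOM</td></tr>
   <tr><td>v480025a</td><td>R NOT VT-WHY DID NOT VT 1</td></tr>
   <tr><td>v480025b</td><td>R NOT VT-WHY DID NOT VT 2</td></tr>
   <tr><td>v480026</td><td>R NOT VT-WAS R REG TO VT</td></tr>
   <tr><td>v480027</td><td>VTD IN PRVS PRESL ELCTN</td></tr>
   <tr><td>v480028</td><td>VTD FOR WHOM IN 1944</td></tr>
   <tr><td>v480029</td><td>OCCUPATION OF HEAD</td></tr>
   <tr><td>v480030</td><td>HEAD BELONG TO LBR UN</td></tr>
   <tr><td>v480031a</td><td>GRPS IDENTIFIED W TRUMAN 1</td></tr>
   <tr><td>v480031b</td><td>GRPS IDENTIFIED W TRUMAN 2</td></tr>
   <tr><td>v480031c</td><td>GRPS IDENTIFIED W TRUMAN 3</td></tr>
   <tr><td>v480032a</td><td>GRPS IDENTIFIED W DEWEY 1</td></tr>
   <tr><td>v480032b</td><td>GRPS IDENTIFIED W DEWEY 2</td></tr>
   <tr><td>v480032c</td><td>GRPS IDENTIFIED W DEWEY 3</td></tr>
   <tr><td>v480033a</td><td>ISSUES CONNECTED W TRMN 1</td></tr>
   <tr><td>v480033b</td><td>ISSUES CONNECTED W TRMN 2</td></tr>
   <tr><td>v480034a</td><td>ISSUES CONNECTED W DEWEY 1</td></tr>
   <tr><td>v480034b</td><td>ISSUES CONNECTED W DEWEY 2</td></tr>
   <tr><td>v480035a</td><td>PERSONAL ATTRIBUTE TRMN 1</td></tr>
   <tr><td>v480035b</td><td>PERSONAL ATTRIBUTE TRMN 2</td></tr>
   <tr><td>v480036a</td><td>PERSONAL ATTRIBUTE DEWEY 1</td></tr>
   <tr><td>v480036b</td><td>PERSONAL ATTRIBUTE DEWEY 2</td></tr>
   <tr><td>v480037</td><td>CMPN INCIDENTS MENTIONED</td></tr>
   <tr><td>v480038</td><td>41-PRESLELCTN PLAN TO VT</td></tr>
   <tr><td>v480039</td><td>41-PLAN TO VT REP/DEM</td></tr>
   <tr><td>v480040</td><td>41-USA'S CNCRN W OTHERS</td></tr>
   <tr><td>v480041</td><td>41-SATISD USA TWRD RUSS</td></tr>
   <tr><td>v480042</td><td>41-INFORMATION LEVEL</td></tr>
   <tr><td>v480043</td><td>41-USA GV IN,AGRT RUSS</td></tr>
   <tr><td>v480044</td><td>41-USA-RUSS AGRT VIA U.N</td></tr>
   <tr><td>v480045</td><td>SEX OF RESPONDENT</td></tr>
   <tr><td>v480046</td><td>RACE OF RESPONDENT</td></tr>
   <tr><td>v480047</td><td>AGE OF RESPONDENT</td></tr>
   <tr><td>v480048</td><td>EDUCATION OF RESPONDENT</td></tr>
   <tr><td>v480049</td><td>TOTAL 1948 INCOME</td></tr>
   <tr><td>v480050</td><td>RELIGIOUS PREFERENCE</td></tr>
   </table>
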
or even a codebook using

.. code-block:: r

   codebook(nes1948)

(this is not shown here because the output would take more than thirty pages). We can also get a codebook of the first few variables instead, with

.. code-block:: r

   codebook(nes1948[1:5])

.. raw:: html

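   <table>
   <tr><th colspan="2">vversion 'NES VERSION NUMBER'</th></tr>
   <tr><td>Storage mode:</td><td>double</td></tr>
   <tr><td>Measurement:</td><td>interval</td></tr>
   <tr><td>Min:</td><td>1.000</td></tr>
   <tr><td>Max:</td><td>1.000</td></tr>
   <tr><td>Mean:</td><td>1.000</td></tr>
   <tr><td>Std.Dev.:</td><td>0.000</td></tr>
   <tr><td>Skewness:</td><td>NaN</td></tr>
   <tr><td>Kurtosis:</td><td>NaN</td></tr>
   </table>

   <table>
   <tr><th colspan="4">vdsetno 'NES DATASET NUMBER'</th></tr>
   <tr><td colspan="4">Storage mode: character</td></tr>
   <tr><td colspan="4">Measurement: nominal</td></tr>
   <tr><th>Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>(unlab.vld.)</td><td>662</td><td>100.0</td><td>100.0</td></tr>
   </table>

   <table>
   <tr><th colspan="2">v480001 'ICPSR ARCHIVE NUMBER'</th></tr>
   <tr><td>Storage mode:</td><td>double</td></tr>
   <tr><td>Measurement:</td><td>interval</td></tr>
   <tr><td>Min:</td><td>7218.000</td></tr>
   <tr><td>Max:</td><td>7218.000</td></tr>
   <tr><td>Mean:</td><td>7218.000</td></tr>
   <tr><td>Std.Dev.:</td><td>0.000</td></tr>
   <tr><td>Skewness:</td><td>NaN</td></tr>
   <tr><td>Kurtosis:</td><td>NaN</td></tr>
   </table>

   <table>
   <tr><th colspan="2">v480002 'INTERVIEW NUMBER'</th></tr>
   <tr><td>Storage mode:</td><td>double</td></tr>
   <tr><td>Measurement:</td><td>interval</td></tr>
   <tr><td>Min:</td><td>1001.000</td></tr>
   <tr><td>Max:</td><td>1662.000</td></tr>
   <tr><td>Mean:</td><td>1331.500</td></tr>
   <tr><td>Std.Dev.:</td><td>191.103</td></tr>
   <tr><td>Skewness:</td><td>0.000</td></tr>
   <tr><td>Kurtosis:</td><td>-1.200</td></tr>
   </table>

   <table>
   <tr><th colspan="5">v480003 'POP CLASSIFICATION'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'METROPOLITAN AREA'</td><td>182</td><td>27.5</td><td>27.5</td></tr>
   <tr><td>2</td><td>'TOWN OR CITY'</td><td>354</td><td>53.5</td><td>53.5</td></tr>
   <tr><td>3</td><td>'OPEN COUNTRY'</td><td>126</td><td>19.0</td><td>19.0</td></tr>
   </table>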

Reading In a Subset of the Data
-------------------------------

After we have decided which variables to use, we can read in a subset of the data:

.. code-block:: r

   vote.48 <- subset(nes1948,
                 select=c(
                     v480018,
                     v480029,
                     v480030,
                     v480045,
                     v480046,
                     v480047,
                     v480048,
                     v480049,
                     v480050
                     ))

The subset of the ANES 1948 data that we read in is now contained in the variable ``vote.48``, which holds an object of class ``data.set``. A ``data.set`` is an "embellished" version of a ``data.frame``, a data structure intended to contain ``labelled`` vectors. ``labelled`` vectors carry all the special information attached to the variables in the original data set, such as variable labels, value labels, and general missing values. A short summary of this special information shows up after a call to ``str``.

.. code-block:: r

   str(vote.48)

::

   Data set with 662 obs. of 9 variables:
    $ v480018: Nmnl. item w/ 7 labels for 1,2,3,... + ms.v.  num  1 2 1 2 1 2 2 1 2 1 ...
    $ v480029: Nmnl. item w/ 12 labels for 10,20,30,... + ms.v.  num  70 30 40 10 10 20 80 80 40 40 ...
    $ v480030: Nmnl. item w/ 4 labels for 1,2,8,... + ms.v.  num  1 2 2 2 2 2 2 2 1 1 ...
    $ v480045: Nmnl. item w/ 3 labels for 1,2,9 + ms.v.  num  1 2 2 2 1 2 1 2 1 1 ...
    $ v480046: Nmnl. item w/ 4 labels for 1,2,3,... + ms.v.  num  1 1 1 1 1 1 1 1 1 1 ...
    $ v480047: Nmnl. item w/ 7 labels for 1,2,3,... + ms.v.  num  3 3 2 3 2 3 4 5 2 2 ...
    $ v480048: Nmnl. item w/ 4 labels for 1,2,3,... + ms.v.  num  1 2 2 3 3 2 1 1 2 2 ...
    $ v480049: Nmnl. item w/ 8 labels for 1,2,3,... + ms.v.  num  4 7 5 7 5 7 5 2 5 6 ...
    $ v480050: Nmnl. item w/ 6 labels for 1,2,3,... + ms.v.  num  1 1 2 1 2 1 1 1 1 2 ...

This output shows, for example, that variable ``v480018``, which has the description (variable label) "DID R VOTE/FOR WHOM", is considered as having a nominal level of measurement, has seven value labels, and has defined missing values.

Since the variable names in the ANES data set are not very mnemonic, we rename the variables:

.. code-block:: r

   vote.48 <- rename(vote.48,
                     v480018 = "vote",
                     v480029 = "occupation.hh",
                     v480030 = "unionized.hh",
                     v480045 = "gender",
                     v480046 = "race",
                     v480047 = "age",
                     v480048 = "education",
                     v480049 = "total.income",
                     v480050 = "religious.pref"
           )

Since many data sets available from public repositories have non-mnemonic variable names like those in this example, it may be convenient to do the data loading and renaming in one step. This is indeed possible:

.. code-block:: r

   vote.48 <- subset(nes1948,
                 select=c(
                     vote           = v480018,
                     occupation.hh  = v480029,
                     unionized.hh   = v480030,
                     gender         = v480045,
                     race           = v480046,
                     age            = v480047,
                     education      = v480048,
                     total.income   = v480049,
                     religious.pref = v480050
                     ))

Before we start with the analyses, we take a closer look at the data.

.. code-block:: r

   codebook(vote.48)

.. raw:: html

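   <table>
   <tr><th colspan="5">vote 'DID R VOTE/FOR WHOM'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'VOTED - FOR TRUMAN'</td><td>212</td><td>32.1</td><td>32.0</td></tr>
   <tr><td>2</td><td>'VOTED - FOR DEWEY'</td><td>178</td><td>27.0</td><td>26.9</td></tr>
   <tr><td>3</td><td>'VOTED - FOR WALLACE'</td><td>1</td><td>0.2</td><td>0.2</td></tr>
   <tr><td>4</td><td>'VOTED - FOR OTHER'</td><td>11</td><td>1.7</td><td>1.7</td></tr>
   <tr><td>5</td><td>'VOTED - NA FOR WHOM'</td><td>20</td><td>3.0</td><td>3.0</td></tr>
   <tr><td>6</td><td>'DID NOT VOTE'</td><td>238</td><td>36.1</td><td>36.0</td></tr>
   <tr><td>9 M</td><td>'NA WHETHER VOTED'</td><td>2</td><td></td><td>0.3</td></tr>
   </table>

   <table>
   <tr><th colspan="5">occupation.hh 'OCCUPATION OF HEAD'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 99</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>10</td><td>'PROFESSIONAL, SEMI-PROFESSIONAL'</td><td>44</td><td>6.9</td><td>6.6</td></tr>
   <tr><td>20</td><td>'SELF-EMPLOYED, MANAGERIAL, SUPERVISORY'</td><td>73</td><td>11.5</td><td>11.0</td></tr>
   <tr><td>30</td><td>'OTHER WHITE-COLLAR (CLERICAL, SALES, ET'</td><td>79</td><td>12.5</td><td>11.9</td></tr>
   <tr><td>40</td><td>'SKILLED AND SEMI-SKILLED'</td><td>164</td><td>25.9</td><td>24.8</td></tr>
   <tr><td>60</td><td>'PROTECTIVE SERVICE'</td><td>6</td><td>0.9</td><td>0.9</td></tr>
   <tr><td>70</td><td>'UNSKILLED, INCLUDING FARM AND SERVICE W'</td><td>85</td><td>13.4</td><td>12.8</td></tr>
   <tr><td>80</td><td>'FARM OPERATORS AND MANAGERS'</td><td>105</td><td>16.6</td><td>15.9</td></tr>
   <tr><td>92</td><td>'STUDENT'</td><td>7</td><td>1.1</td><td>1.1</td></tr>
   <tr><td>94</td><td>'UNEMPLOYED'</td><td>5</td><td>0.8</td><td>0.8</td></tr>
   <tr><td>95</td><td>'RETIRED, TOO OLD OR UNABLE TO WORK'</td><td>38</td><td>6.0</td><td>5.7</td></tr>
   <tr><td>96</td><td>'HOUSEWIFE'</td><td>28</td><td>4.4</td><td>4.2</td></tr>
   <tr><td>99 M</td><td>'NA'</td><td>28</td><td></td><td>4.2</td></tr>
   </table>

   <table>
   <tr><th colspan="5">unionized.hh 'HEAD BELONG TO LBR UN'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 8-Inf</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'YES'</td><td>150</td><td>23.3</td><td>22.7</td></tr>
   <tr><td>2</td><td>'NO'</td><td>493</td><td>76.7</td><td>74.5</td></tr>
   <tr><td>8 M</td><td>'DK'</td><td>5</td><td></td><td>0.8</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>14</td><td></td><td>2.1</td></tr>
   </table>

   <table>
   <tr><th colspan="5">gender 'SEX OF RESPONDENT'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'MALE'</td><td>302</td><td>45.8</td><td>45.6</td></tr>
   <tr><td>2</td><td>'FEMALE'</td><td>357</td><td>54.2</td><td>53.9</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>3</td><td></td><td>0.5</td></tr>
   </table>

   <table>
   <tr><th colspan="5">race 'RACE OF RESPONDENT'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'WHITE'</td><td>585</td><td>90.7</td><td>88.4</td></tr>
   <tr><td>2</td><td>'NEGRO'</td><td>60</td><td>9.3</td><td>9.1</td></tr>
   <tr><td>3</td><td>'OTHER'</td><td>0</td><td>0.0</td><td>0.0</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>17</td><td></td><td>2.6</td></tr>
   </table>

   <table>
   <tr><th colspan="5">age 'AGE OF RESPONDENT'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'18-24'</td><td>57</td><td>8.7</td><td>8.6</td></tr>
   <tr><td>2</td><td>'25-34'</td><td>142</td><td>21.7</td><td>21.5</td></tr>
   <tr><td>3</td><td>'35-44'</td><td>174</td><td>26.6</td><td>26.3</td></tr>
   <tr><td>4</td><td>'45-54'</td><td>125</td><td>19.1</td><td>18.9</td></tr>
   <tr><td>5</td><td>'55-64'</td><td>86</td><td>13.1</td><td>13.0</td></tr>
   <tr><td>6</td><td>'65 AND OVER'</td><td>70</td><td>10.7</td><td>10.6</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>8</td><td></td><td>1.2</td></tr>
   </table>

   <table>
   <tr><th colspan="5">education 'EDUCATION OF RESPONDENT'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'GRADE SCHOOL'</td><td>292</td><td>44.4</td><td>44.1</td></tr>
   <tr><td>2</td><td>'HIGH SCHOOL'</td><td>266</td><td>40.4</td><td>40.2</td></tr>
   <tr><td>3</td><td>'COLLEGE'</td><td>100</td><td>15.2</td><td>15.1</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>4</td><td></td><td>0.6</td></tr>
   </table>

   <table>
   <tr><th colspan="5">total.income 'TOTAL 1948 INCOME'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'UNDER $500'</td><td>25</td><td>3.8</td><td>3.8</td></tr>
   <tr><td>2</td><td>'$500-$999'</td><td>43</td><td>6.6</td><td>6.5</td></tr>
   <tr><td>3</td><td>'$1000-1999'</td><td>110</td><td>16.8</td><td>16.6</td></tr>
   <tr><td>4</td><td>'$2000-2999'</td><td>185</td><td>28.2</td><td>27.9</td></tr>
   <tr><td>5</td><td>'$3000-3999'</td><td>142</td><td>21.7</td><td>21.5</td></tr>
   <tr><td>6</td><td>'$4000-4999'</td><td>66</td><td>10.1</td><td>10.0</td></tr>
   <tr><td>7</td><td>'$5000 AND OVER'</td><td>84</td><td>12.8</td><td>12.7</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>7</td><td></td><td>1.1</td></tr>
   </table>

   <table>
   <tr><th colspan="5">religious.pref 'RELIGIOUS PREFERENCE'</th></tr>
   <tr><td colspan="5">Storage mode: double</td></tr>
   <tr><td colspan="5">Measurement: nominal</td></tr>
   <tr><td colspan="5">Missing values: 9</td></tr>
   <tr><th colspan="2">Values and labels</th><th>N</th><th colspan="2">Percent</th></tr>
   <tr><td>1</td><td>'PROTESTANT'</td><td>460</td><td>70.0</td><td>69.5</td></tr>
   <tr><td>2</td><td>'CATHOLIC'</td><td>140</td><td>21.3</td><td>21.1</td></tr>
   <tr><td>3</td><td>'JEWISH'</td><td>25</td><td>3.8</td><td>3.8</td></tr>
   <tr><td>4</td><td>'OTHER'</td><td>14</td><td>2.1</td><td>2.1</td></tr>
   <tr><td>5</td><td>'NONE'</td><td>18</td><td>2.7</td><td>2.7</td></tr>
   <tr><td>9 M</td><td>'NA'</td><td>5</td><td></td><td>0.8</td></tr>
   </table>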
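
In the codebook output, the two percentage columns appear to be shares of the valid (non-missing) cases and of all cases, respectively; this reading is inferred from the numbers, not taken from the *memisc* documentation. A minimal check in base R, with counts copied from the ``vote`` entry and purely illustrative variable names:

.. code-block:: r

   # Counts copied from the codebook entry for `vote`
   n.category <- 212   # 'VOTED - FOR TRUMAN'
   n.missing  <- 2     # 'NA WHETHER VOTED', declared missing
   n.total    <- 662   # all cases in the data set

   # Share among valid (non-missing) cases
   pct.valid <- 100 * n.category / (n.total - n.missing)
   # Share among all cases, missing values included
   pct.total <- 100 * n.category / n.total

   round(c(valid = pct.valid, total = pct.total), 1)

Rows flagged with ``M`` only have an entry in the second column, which is consistent with this reading.
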

We have now obtained a *codebook*, which contains information about the class and type of the variables in the data set, the value labels and defined missing values, and counts of the distinct values of the variables.

Analysis
========

Some Descriptive Analyses
-------------------------

We start our analyses with a contingency table, but first we make some preparations: we recode the variables of interest into a smaller number of categories in order to get results that are easier to read and interpret.

.. code-block:: r

   vote.48 <- within(vote.48,{
     vote3 <- recode(vote,
       1 -> "Truman",
       2 -> "Dewey",
       3:4 -> "Other"
       )
     occup4 <- recode(occupation.hh,
       10:20 -> "Upper white collar",
       30 -> "Other white collar",
       40:70 -> "Blue collar",
       80 -> "Farmer"
       )
     relig3 <- recode(religious.pref,
       1 -> "Protestant",
       2 -> "Catholic",
       3:5 -> "Other,none"
       )
     race2 <- recode(race,
       1 -> "White",
       2 -> "Black"
       )
     })

Having constructed the unordered factors ``vote3``, ``occup4``, ``relig3``, and ``race2``, we can proceed to examine the association between the vote and occupational class, religious denomination, and race. First, we look at a simple contingency table.

.. code-block:: r

   ftable(xtabs(~vote3+occup4,data=vote.48))

.. raw:: html

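   <table>
   <tr><th></th><th colspan="4">occup4</th></tr>
   <tr><th>vote3</th><th>Upper white collar</th><th>Other white collar</th><th>Blue collar</th><th>Farmer</th></tr>
   <tr><td>Truman</td><td>17</td><td>30</td><td>114</td><td>26</td></tr>
   <tr><td>Dewey</td><td>67</td><td>31</td><td>36</td><td>14</td></tr>
   <tr><td>Other</td><td>2</td><td>0</td><td>4</td><td>3</td></tr>
   </table>
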
Tables of percentages may seem more informative about the impact of various factors on the vote. So we use the function ``genTable`` to obtain such tables of percentages:

.. code-block:: r

   gt1 <- genTable(percent(vote3)~occup4,data=vote.48)
   ## For knitr-ing, we use ```{r, ftable.digits=c(2,2,2,0)} here.
   ftable(gt1,row.vars=2)

.. raw:: html

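   <table>
   <tr><th>occup4</th><th>Truman</th><th>Dewey</th><th>Other</th><th>N</th></tr>
   <tr><td>Upper white collar</td><td>19.77</td><td>77.91</td><td>2.33</td><td>86</td></tr>
   <tr><td>Other white collar</td><td>49.18</td><td>50.82</td><td>0.00</td><td>61</td></tr>
   <tr><td>Blue collar</td><td>74.03</td><td>23.38</td><td>2.60</td><td>154</td></tr>
   <tr><td>Farmer</td><td>60.47</td><td>32.56</td><td>6.98</td><td>43</td></tr>
   </table>
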
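
As a cross-check that does not rely on *memisc*, the same percentages can be recomputed in base R from the counts in the contingency table. This is only a sketch; the object names ``counts``, ``pct``, and ``result`` are illustrative and not part of the vignette's workflow.

.. code-block:: r

   # Counts copied from the contingency table of vote3 by occup4
   counts <- matrix(
     c(17, 30, 114, 26,
       67, 31,  36, 14,
        2,  0,   4,  3),
     nrow = 3, byrow = TRUE,
     dimnames = list(
       vote3  = c("Truman", "Dewey", "Other"),
       occup4 = c("Upper white collar", "Other white collar",
                  "Blue collar", "Farmer")))

   # Percentages within each occupational class (each column sums to 100)
   pct <- 100 * prop.table(counts, margin = 2)

   # One row per occupational class, with the class size appended as N
   result <- cbind(round(t(pct), 2), N = colSums(counts))
   result
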
Obviously, voters from farmer and blue-collar worker households were especially supportive of President Truman, while voters of upper white-collar background largely supported the Republican candidate Dewey.

.. code-block:: r

   gt2 <- genTable(percent(vote3)~relig3,data=vote.48)
   ftable(gt2,row.vars=2)

.. raw:: html

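   <table>
   <tr><th>relig3</th><th>Truman</th><th>Dewey</th><th>Other</th><th>N</th></tr>
   <tr><td>Protestant</td><td>44.71</td><td>50.98</td><td>4.31</td><td>255</td></tr>
   <tr><td>Catholic</td><td>66.02</td><td>33.98</td><td>0.00</td><td>103</td></tr>
   <tr><td>Other,none</td><td>68.18</td><td>29.55</td><td>2.27</td><td>44</td></tr>
   </table>
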
This table shows that Catholics and adherents of other denominations were more supportive of Truman than of Dewey.

.. code-block:: r

   gt3 <- genTable(percent(vote3)~race2,data=vote.48)
   ftable(gt3,row.vars=2)

.. raw:: html

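   <table>
   <tr><th>race2</th><th>Truman</th><th>Dewey</th><th>Other</th><th>N</th></tr>
   <tr><td>White</td><td>51.33</td><td>45.48</td><td>3.19</td><td>376</td></tr>
   <tr><td>Black</td><td>64.71</td><td>35.29</td><td>0.00</td><td>17</td></tr>
   </table>
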
African Americans apparently supported Truman by a large majority. The number of members of this group in the sample is very small, however, so such an inference would be very shaky.

.. code-block:: r

   gt4 <- genTable(percent(vote3)~total.income,data=vote.48)
   ftable(gt4,row.vars=2)

.. raw:: html

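   <table>
   <tr><th>total.income</th><th>Truman</th><th>Dewey</th><th>Other</th><th>N</th></tr>
   <tr><td>UNDER $500</td><td>50.00</td><td>50.00</td><td>0.00</td><td>8</td></tr>
   <tr><td>$500-$999</td><td>61.54</td><td>38.46</td><td>0.00</td><td>13</td></tr>
   <tr><td>$1000-1999</td><td>64.41</td><td>32.20</td><td>3.39</td><td>59</td></tr>
   <tr><td>$2000-2999</td><td>66.99</td><td>30.10</td><td>2.91</td><td>103</td></tr>
   <tr><td>$3000-3999</td><td>47.52</td><td>48.51</td><td>3.96</td><td>101</td></tr>
   <tr><td>$4000-4999</td><td>45.83</td><td>50.00</td><td>4.17</td><td>48</td></tr>
   <tr><td>$5000 AND OVER</td><td>31.82</td><td>68.18</td><td>0.00</td><td>66</td></tr>
   </table>
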
The table of vote percentages by income suggests that income had considerable influence on the choice between Truman and Dewey, but the unequal distribution of the income categories warrants a more refined analysis that takes into account the uncertainty about the vote percentages. Therefore, the percentages of support for each candidate, broken down by income, are shown with confidence intervals:

.. code-block:: r

   ## For knitr-ing, we use ```{r, ftable.digits=c(2,2,2)} here.
   inc.tab <- genTable(percent(vote3,ci=TRUE)~total.income,data=vote.48)
   ftable(inc.tab,row.vars=c(3,2))

.. raw:: html

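   <table>
   <tr><th>total.income</th><th></th><th>Percentage</th><th>lower</th><th>upper</th></tr>
   <tr><td>UNDER $500</td><td>Truman</td><td>50.00</td><td>15.70</td><td>84.30</td></tr>
   <tr><td></td><td>Dewey</td><td>50.00</td><td>15.70</td><td>84.30</td></tr>
   <tr><td></td><td>Other</td><td>0.00</td><td>0.00</td><td>36.94</td></tr>
   <tr><td>$500-$999</td><td>Truman</td><td>61.54</td><td>31.58</td><td>86.14</td></tr>
   <tr><td></td><td>Dewey</td><td>38.46</td><td>13.86</td><td>68.42</td></tr>
   <tr><td></td><td>Other</td><td>0.00</td><td>0.00</td><td>24.71</td></tr>
   <tr><td>$1000-1999</td><td>Truman</td><td>64.41</td><td>50.87</td><td>76.45</td></tr>
   <tr><td></td><td>Dewey</td><td>32.20</td><td>20.62</td><td>45.64</td></tr>
   <tr><td></td><td>Other</td><td>3.39</td><td>0.41</td><td>11.71</td></tr>
   <tr><td>$2000-2999</td><td>Truman</td><td>66.99</td><td>57.03</td><td>75.94</td></tr>
   <tr><td></td><td>Dewey</td><td>30.10</td><td>21.45</td><td>39.92</td></tr>
   <tr><td></td><td>Other</td><td>2.91</td><td>0.60</td><td>8.28</td></tr>
   <tr><td>$3000-3999</td><td>Truman</td><td>47.52</td><td>37.49</td><td>57.70</td></tr>
   <tr><td></td><td>Dewey</td><td>48.51</td><td>38.45</td><td>58.67</td></tr>
   <tr><td></td><td>Other</td><td>3.96</td><td>1.09</td><td>9.83</td></tr>
   <tr><td>$4000-4999</td><td>Truman</td><td>45.83</td><td>31.37</td><td>60.83</td></tr>
   <tr><td></td><td>Dewey</td><td>50.00</td><td>35.23</td><td>64.77</td></tr>
   <tr><td></td><td>Other</td><td>4.17</td><td>0.51</td><td>14.25</td></tr>
   <tr><td>$5000 AND OVER</td><td>Truman</td><td>31.82</td><td>20.89</td><td>44.44</td></tr>
   <tr><td></td><td>Dewey</td><td>68.18</td><td>55.56</td><td>79.11</td></tr>
   <tr><td></td><td>Other</td><td>0.00</td><td>0.00</td><td>5.44</td></tr>
   </table>
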
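
To get a feeling for where such bounds come from, the interval for a single cell can be recomputed with base R's ``binom.test``, which returns an exact (Clopper-Pearson) confidence interval. The sketch below uses the "UNDER $500" group, where 50 percent of 8 respondents corresponds to 4 Truman voters and none chose "Other". That these numbers agree with the table is an observation about this example, not a statement about how ``percent(...,ci=TRUE)`` is implemented.

.. code-block:: r

   # "UNDER $500" group: 4 of 8 respondents voted for Truman, 0 of 8 for "Other"
   ci.truman <- binom.test(x = 4, n = 8)$conf.int  # exact 95% interval
   ci.other  <- binom.test(x = 0, n = 8)$conf.int

   # Express the bounds in percent, as in the table
   round(100 * ci.truman, 2)
   round(100 * ci.other, 2)

With only eight respondents the interval for Truman spans most of the range from 0 to 100 percent, which illustrates why the smallest income groups say little about candidate support.
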
Occupational class is more evenly distributed in the sample, so it may be possible to obtain more precise estimates of the percentages of support for Truman for the occupational classes:

.. code-block:: r

   occup.tab <- genTable(percent(vote3,ci=TRUE)~occup4,data=vote.48)
   ftable(occup.tab,row.vars=c(3,2))

.. raw:: html

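   <table>
   <tr><th>occup4</th><th></th><th>Percentage</th><th>lower</th><th>upper</th></tr>
   <tr><td>Upper white collar</td><td>Truman</td><td>19.77</td><td>11.96</td><td>29.75</td></tr>
   <tr><td></td><td>Dewey</td><td>77.91</td><td>67.67</td><td>86.14</td></tr>
   <tr><td></td><td>Other</td><td>2.33</td><td>0.28</td><td>8.15</td></tr>
   <tr><td>Other white collar</td><td>Truman</td><td>49.18</td><td>36.14</td><td>62.30</td></tr>
   <tr><td></td><td>Dewey</td><td>50.82</td><td>37.70</td><td>63.86</td></tr>
   <tr><td></td><td>Other</td><td>0.00</td><td>0.00</td><td>5.87</td></tr>
   <tr><td>Blue collar</td><td>Truman</td><td>74.03</td><td>66.35</td><td>80.75</td></tr>
   <tr><td></td><td>Dewey</td><td>23.38</td><td>16.94</td><td>30.86</td></tr>
   <tr><td></td><td>Other</td><td>2.60</td><td>0.71</td><td>6.52</td></tr>
   <tr><td>Farmer</td><td>Truman</td><td>60.47</td><td>44.41</td><td>75.02</td></tr>
   <tr><td></td><td>Dewey</td><td>32.56</td><td>19.08</td><td>48.54</td></tr>
   <tr><td></td><td>Other</td><td>6.98</td><td>1.46</td><td>19.06</td></tr>
   </table>
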
The upper and lower white-collar and blue-collar classes are quite distinct with regard to the percentages of support for Truman. The point estimates of the percentages are outside the confidence intervals of the respective other occupational classes; the confidence intervals do not even overlap. However, it is not clear whether farmers are distinct from the blue-collar and lower white-collar classes.

Logit Modelling of Candidate Choice
-----------------------------------

In the following we conduct a logit analysis of the vote for Truman. First, we assign non-standard contrasts to the categorical predictors. Here, the function ``contr`` is used to assign treatment (dummy) contrasts to ``occup4`` and ``total.income`` with baseline category 3 and 4, respectively.

.. code-block:: r

   vote.48 <- within(vote.48,{
     contrasts(occup4) <- contr("treatment",base = 3)
     contrasts(total.income) <- contr("treatment",base = 4)
     })

We now fit some logistic regression models of the impact of occupational class, income, and religious denomination on the choice to vote for Truman. The contrasts of the occupational class and income factors are such that they compare the choices of the members of the blue-collar class with all other classes and the middle income group ($2000-2999) with the other income groups. The religious denomination factor compares Protestants with Catholics and with those of other or no denomination.

.. code-block:: r

   model1 <- glm((vote3=="Truman")~occup4,data=vote.48,
                 family="binomial")
   model2 <- glm((vote3=="Truman")~total.income,data=vote.48,
                 family="binomial")
   model3 <- glm((vote3=="Truman")~occup4+total.income,data=vote.48,
                 family="binomial")
   model4 <- glm((vote3=="Truman")~relig3,data=vote.48,
                 family="binomial")
   model5 <- glm((vote3=="Truman")~occup4+relig3,data=vote.48,
                 family="binomial")

First, we use ``mtable`` to construct a comparative table of the estimates of ``model1``, ``model2``, and ``model3``. We can thus compare the impact of occupational class and income on the choice of candidate Truman.

.. code-block:: r

   mtable(model1,model2,model3,summary.stats=c("Nagelkerke R-sq.","Deviance","AIC","N"))

.. raw:: html

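   <table>
   <tr><th></th><th>model1</th><th>model2</th><th>model3</th></tr>
   <tr><td>(Intercept)</td><td>1.047***</td><td>0.708***</td><td>1.316***</td></tr>
   <tr><td></td><td>(0.184)</td><td>(0.210)</td><td>(0.268)</td></tr>
   <tr><td>occup4: Upper white collar/Blue collar</td><td>−2.448***</td><td></td><td>−2.328***</td></tr>
   <tr><td></td><td>(0.327)</td><td></td><td>(0.357)</td></tr>
   <tr><td>occup4: Other white collar/Blue collar</td><td>−1.080***</td><td></td><td>−1.015**</td></tr>
   <tr><td></td><td>(0.315)</td><td></td><td>(0.323)</td></tr>
   <tr><td>occup4: Farmer/Blue collar</td><td>−0.622</td><td></td><td>−0.792*</td></tr>
   <tr><td></td><td>(0.362)</td><td></td><td>(0.383)</td></tr>
   <tr><td>total.income: UNDER $500/$2000-2999</td><td></td><td>−0.708</td><td>−0.662</td></tr>
   <tr><td></td><td></td><td>(0.737)</td><td>(1.056)</td></tr>
   <tr><td>total.income: $500-$999/$2000-2999</td><td></td><td>−0.238</td><td>0.912</td></tr>
   <tr><td></td><td></td><td>(0.607)</td><td>(1.143)</td></tr>
   <tr><td>total.income: $1000-1999/$2000-2999</td><td></td><td>−0.115</td><td>0.144</td></tr>
   <tr><td></td><td></td><td>(0.343)</td><td>(0.440)</td></tr>
   <tr><td>total.income: $3000-3999/$2000-2999</td><td></td><td>−0.807**</td><td>−0.527</td></tr>
   <tr><td></td><td></td><td>(0.289)</td><td>(0.338)</td></tr>
   <tr><td>total.income: $4000-4999/$2000-2999</td><td></td><td>−0.875*</td><td>−0.509</td></tr>
   <tr><td></td><td></td><td>(0.358)</td><td>(0.411)</td></tr>
   <tr><td>total.income: $5000 AND OVER/$2000-2999</td><td></td><td>−1.470***</td><td>−0.535</td></tr>
   <tr><td></td><td></td><td>(0.337)</td><td>(0.405)</td></tr>
   <tr><td>Nagelkerke R-sq.</td><td>0.2</td><td>0.1</td><td>0.3</td></tr>
   <tr><td>Deviance</td><td>404.2</td><td>524.4</td><td>390.6</td></tr>
   <tr><td>AIC</td><td>412.2</td><td>538.4</td><td>410.6</td></tr>
   <tr><td>N</td><td>344</td><td>398</td><td>340</td></tr>
   </table>
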
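
Row labels such as "Upper white collar/Blue collar" reflect the treatment contrasts with "Blue collar" (level 3) as the baseline. The coding behind such a comparison can be inspected with base R's ``contr.treatment``; this is a sketch of the general idea of treatment coding, and the matrix that *memisc*'s ``contr`` actually attaches to the factor is not shown here.

.. code-block:: r

   # Levels of occup4 in the order created by the recode() call
   occup.levels <- c("Upper white collar", "Other white collar",
                     "Blue collar", "Farmer")

   # Treatment (dummy) coding with the third level as baseline
   cm <- contr.treatment(occup.levels, base = 3)
   cm

The baseline row consists of zeros only, so each coefficient compares one of the remaining classes with the blue-collar class.
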
``mtable`` returns an object of class ``"mtable"``. When formatted, it comes close to the requirements of typical social science publications. At a minimum, though, we want to replace the technical variable names with non-technical ones, for which we can use ``relabel``:

.. code-block:: r

   relabel(mtable(
               "Model 1"=model1,
               "Model 2"=model2,
               "Model 3"=model3,
               summary.stats=c("Nagelkerke R-sq.","Deviance","AIC","N")),
             UNDER="under",
             "AND OVER"="and over",
             occup4="Occup. class",
             total.income="Income",
             gsub=TRUE
             )

.. raw:: html

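   <table>
   <tr><th></th><th>Model 1</th><th>Model 2</th><th>Model 3</th></tr>
   <tr><td>(Intercept)</td><td>1.047***</td><td>0.708***</td><td>1.316***</td></tr>
   <tr><td></td><td>(0.184)</td><td>(0.210)</td><td>(0.268)</td></tr>
   <tr><td>Occup. class: Upper white collar/Blue collar</td><td>−2.448***</td><td></td><td>−2.328***</td></tr>
   <tr><td></td><td>(0.327)</td><td></td><td>(0.357)</td></tr>
   <tr><td>Occup. class: Other white collar/Blue collar</td><td>−1.080***</td><td></td><td>−1.015**</td></tr>
   <tr><td></td><td>(0.315)</td><td></td><td>(0.323)</td></tr>
   <tr><td>Occup. class: Farmer/Blue collar</td><td>−0.622</td><td></td><td>−0.792*</td></tr>
   <tr><td></td><td>(0.362)</td><td></td><td>(0.383)</td></tr>
   <tr><td>Income: under $500/$2000-2999</td><td></td><td>−0.708</td><td>−0.662</td></tr>
   <tr><td></td><td></td><td>(0.737)</td><td>(1.056)</td></tr>
   <tr><td>Income: $500-$999/$2000-2999</td><td></td><td>−0.238</td><td>0.912</td></tr>
   <tr><td></td><td></td><td>(0.607)</td><td>(1.143)</td></tr>
   <tr><td>Income: $1000-1999/$2000-2999</td><td></td><td>−0.115</td><td>0.144</td></tr>
   <tr><td></td><td></td><td>(0.343)</td><td>(0.440)</td></tr>
   <tr><td>Income: $3000-3999/$2000-2999</td><td></td><td>−0.807**</td><td>−0.527</td></tr>
   <tr><td></td><td></td><td>(0.289)</td><td>(0.338)</td></tr>
   <tr><td>Income: $4000-4999/$2000-2999</td><td></td><td>−0.875*</td><td>−0.509</td></tr>
   <tr><td></td><td></td><td>(0.358)</td><td>(0.411)</td></tr>
   <tr><td>Income: $5000 and over/$2000-2999</td><td></td><td>−1.470***</td><td>−0.535</td></tr>
   <tr><td></td><td></td><td>(0.337)</td><td>(0.405)</td></tr>
   <tr><td>Nagelkerke R-sq.</td><td>0.2</td><td>0.1</td><td>0.3</td></tr>
   <tr><td>Deviance</td><td>404.2</td><td>524.4</td><td>390.6</td></tr>
   <tr><td>AIC</td><td>412.2</td><td>538.4</td><td>410.6</td></tr>
   <tr><td>N</td><td>344</td><td>398</td><td>340</td></tr>
   </table>
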
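
Logit coefficients are easier to interpret once they are translated back into probabilities. As a small worked example in base R (coefficients copied from the "Model 1" column; the variable names are illustrative), the inverse logit ``plogis`` recovers the shares of Truman voters among blue-collar and upper white-collar households:

.. code-block:: r

   # Coefficients copied from the "Model 1" column
   b.intercept   <- 1.047    # baseline category: blue collar
   b.upper.white <- -2.448   # upper white collar compared with blue collar

   # The inverse logit maps the linear predictor to a probability
   p.blue        <- plogis(b.intercept)
   p.upper.white <- plogis(b.intercept + b.upper.white)

   round(100 * c(blue.collar = p.blue,
                 upper.white.collar = p.upper.white), 1)

   # Odds ratio of a Truman vote, upper white collar relative to blue collar
   exp(b.upper.white)

Because Model 1 contains only the occupational class factor, these fitted probabilities should agree, up to rounding of the coefficients, with the percentages of 74.03 and 19.77 in the earlier table of vote by occupational class.
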
The comparison of the pseudo-R-squared values of models 1 and 2 suggests that occupational class has a stronger influence on a preference for Truman than household income. Indeed, if occupational class is taken into account, the effect of income is no longer statistically significant, as the column corresponding to model 3 indicates.

Second, we compare the effect of occupational class and religious denomination on the preference for Truman along the same lines as above. We use ``mtable`` to collect the estimates of ``model1``, ``model4``, and ``model5`` into a common table.

.. code-block:: r

   relabel(mtable(
               "Model 1"=model1,
               "Model 4"=model4,
               "Model 5"=model5,
               summary.stats=c("Nagelkerke R-sq.","Deviance","AIC","N")),
             occup4="Occup. class",
             relig3="Religion",
             gsub=TRUE
             )

.. raw:: html

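   <table>
   <tr><th></th><th>Model 1</th><th>Model 4</th><th>Model 5</th></tr>
   <tr><td>(Intercept)</td><td>1.047***</td><td>−0.213</td><td>0.698**</td></tr>
   <tr><td></td><td>(0.184)</td><td>(0.126)</td><td>(0.216)</td></tr>
   <tr><td>Occup. class: Upper white collar/Blue collar</td><td>−2.448***</td><td></td><td>−2.385***</td></tr>
   <tr><td></td><td>(0.327)</td><td></td><td>(0.337)</td></tr>
   <tr><td>Occup. class: Other white collar/Blue collar</td><td>−1.080***</td><td></td><td>−1.098***</td></tr>
   <tr><td></td><td>(0.315)</td><td></td><td>(0.326)</td></tr>
   <tr><td>Occup. class: Farmer/Blue collar</td><td>−0.622</td><td></td><td>−0.346</td></tr>
   <tr><td></td><td>(0.362)</td><td></td><td>(0.374)</td></tr>
   <tr><td>Religion: Catholic/Protestant</td><td></td><td>0.877***</td><td>0.685*</td></tr>
   <tr><td></td><td></td><td>(0.243)</td><td>(0.292)</td></tr>
   <tr><td>Religion: Other,none/Protestant</td><td></td><td>0.975**</td><td>1.191**</td></tr>
   <tr><td></td><td></td><td>(0.347)</td><td>(0.441)</td></tr>
   <tr><td>Nagelkerke R-sq.</td><td>0.2</td><td>0.1</td><td>0.3</td></tr>
   <tr><td>Deviance</td><td>404.2</td><td>537.7</td><td>393.1</td></tr>
   <tr><td>AIC</td><td>412.2</td><td>543.7</td><td>405.1</td></tr>
   <tr><td>N</td><td>344</td><td>402</td><td>344</td></tr>
   </table>
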
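
The summary rows of the two model tables can be related to each other with a little arithmetic. For a logit model with an individual 0/1 response, the deviance equals minus twice the log-likelihood, so the AIC is the deviance plus twice the number of estimated coefficients. The sketch below checks this against the reported values; the vectors are copied from the tables and the names ``dev``, ``aic``, and ``k`` are illustrative.

.. code-block:: r

   # Deviance and AIC values copied from the two model tables
   dev <- c(model1 = 404.2, model2 = 524.4, model3 = 390.6,
            model4 = 537.7, model5 = 393.1)
   aic <- c(model1 = 412.2, model2 = 538.4, model3 = 410.6,
            model4 = 543.7, model5 = 405.1)

   # Number of estimated coefficients per model, intercept included
   k <- c(model1 = 4, model2 = 7, model3 = 10,
          model4 = 3, model5 = 6)

   # AIC = deviance + 2 * k for ungrouped binary responses
   aic.check <- dev + 2 * k
   all.equal(aic.check, aic)

Note that AIC values are only comparable between models fitted to the same cases, which holds for models 1 and 5 (N = 344) but not for all pairs shown here.
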
A comparison of the pseudo-R-squared values suggests that the effect of religious denomination is also weaker than that of occupational class. However, as the third column in the above table indicates, the effect of religious denomination remains statistically significant.

.. [1] National Election Studies, 1948: *Post-Election Study [dataset].* Ann Arbor, MI: University of Michigan, Center for Political Studies [producer and distributor], 1999. ANES Dataset ID: 1948.T; ICPSR Study Number: 7218.

   These materials are based on work supported by the National Science Foundation under Grant Nos.: SBR-9707741, SBR-9317631, SES-9209410, SES-9009379, SES-8808361, SES-8341310, SES-8207580, and SOC77-08885. Any opinions, findings and conclusions or recommendations expressed in these materials are those of the author(s) and do not necessarily reflect those of the National Science Foundation.