Software
memisc: Management of Survey Data and Presentation of Analysis Results
The R package memisc, which is available on CRAN, provides tools for the management of survey data, graphics, statistics, and simulation.
One of the aims of this package is to make life easier for useRs who deal with survey data sets. It provides an infrastructure for the management of survey data, including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) SPSS and Stata files. Further, it provides functionality to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates. It also provides some convenience tools for graphics, programming, and simulation.
Development occurs on GitHub, where both releases and the development tree can be found.
mclogit: Mixed conditional logit models in R
The package ‘mclogit’ implements the estimation of mixed conditional logit models via the penalized quasi-likelihood (PQL) method, as used in my article published in Electoral Studies. It is published on CRAN. Development occurs on GitHub, where both releases and the development tree can be found.
The probability that individual \(i\) chooses alternative \(j\) from choice set \(\mathcal{S}_i\) is \[ \pi_{ij} = \frac{\exp(\eta_{ij})}{\sum_{k\in\mathcal{S}_i}\exp(\eta_{ik})} \] with \[ \eta_{ij}=\beta_1x_{1ij}+\cdots+\beta_qx_{qij}+U_{ij} \] where \(x_{hij}\) are values of independent variables, \(\beta_h\) are parameters (coefficients), and \(U_{ij}\) are random effects with a normal distribution.
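The formula above can be illustrated with a small numerical sketch. This is not code from the package (which is written in R); it is a plain-Python illustration in which the function name `choice_probabilities` and all input values are made up for the example, and the random effects \(U_{ij}\) are simply passed in as given numbers rather than drawn or estimated.

```python
import math

def choice_probabilities(x, beta, u):
    """Return the probabilities pi_ij of one individual i for every alternative j.

    x    : list of covariate vectors, one per alternative (x[j][h] plays the role of x_hij)
    beta : list of coefficients beta_h
    u    : list of random-effect values U_ij, one per alternative (assumed given)
    """
    # Linear predictor eta_ij = beta_1 * x_1ij + ... + beta_q * x_qij + U_ij
    eta = [sum(b * xh for b, xh in zip(beta, xj)) + uj for xj, uj in zip(x, u)]
    # Subtracting the maximum leaves the ratios unchanged but avoids overflow in exp()
    m = max(eta)
    weights = [math.exp(e - m) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three alternatives, two independent variables
x_i = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
beta = [0.8, -0.3]
u_i = [0.0, 0.2, -0.1]

pi_i = choice_probabilities(x_i, beta, u_i)
print(pi_i)
```

The probabilities of one individual sum to one over the choice set, and the alternative with the largest \(\eta_{ij}\) receives the largest probability.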
The package allows one to specify that random effects are equal for all individuals within a cluster \(\mathcal{C}_c\), that is, \(U_{i_1j}=U_{i_2j}\) for \(i_1\in\mathcal{C}_c\) and \(i_2\in\mathcal{C}_c\). Such clusters may also be nested in a “multi-level” manner.
The “dependent variable” \(y_{ij}\) may be a “dummy variable” that is equal to 1 if individual \(i\) has chosen alternative \(j\) and 0 if s/he has chosen another alternative. For example, if all individuals \(i\) face the same set of five alternatives, then five values of the dependent variable correspond to each individual, exactly one of them being equal to one and the other four being equal to zero. (Data arranged this way are sometimes said to be in “stacked” format.)
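As a rough illustration of the “stacked” layout just described, the following sketch expands one chosen alternative per individual into one row per individual and alternative. It is written in plain Python for illustration only; the function name `to_stacked`, the field names, and the alternatives "A" to "E" are assumptions made up for this example and are not part of the package.

```python
def to_stacked(choices, alternatives):
    """Expand one choice per individual into one row per (individual, alternative).

    choices      : list with the chosen alternative of each individual
    alternatives : the common choice set faced by all individuals
    """
    rows = []
    for i, chosen in enumerate(choices, start=1):
        for j in alternatives:
            # y_ij is 1 for the chosen alternative and 0 for all others
            rows.append({"individual": i, "alternative": j, "y": 1 if j == chosen else 0})
    return rows

# Hypothetical data: two individuals facing the same five alternatives
alternatives = ["A", "B", "C", "D", "E"]
choices = ["C", "A"]

stacked = to_stacked(choices, alternatives)
for row in stacked:
    print(row)
```

Each individual contributes five rows, and the values of `y` within an individual sum to one.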
Alternatively, if individuals who share the same values of the independent variables and belong to the same cluster are grouped into “covariate classes”, and \(i\) indexes such a covariate class, then \(y_{ij}\) may be the number of individuals in covariate class \(i\) who have chosen alternative \(j\).
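The count form of the dependent variable can be sketched in the same way. The example below tallies, for each covariate class, how many individuals chose each alternative. It is a plain-Python illustration; the function name `to_counts`, the class keys (a cluster label paired with a single covariate value), and the data are hypothetical.

```python
from collections import Counter

def to_counts(records, alternatives):
    """Aggregate individual choices into counts per covariate class.

    records      : list of (class_key, chosen_alternative) pairs, where class_key
                   identifies a combination of cluster and covariate values
    alternatives : the common choice set
    """
    tally = Counter(records)
    classes = sorted({key for key, _ in records})
    # y_ij = number of individuals in covariate class i who chose alternative j
    return {key: [tally[(key, j)] for j in alternatives] for key in classes}

# Hypothetical data: class key = (cluster, value of one independent variable)
records = [
    (("district1", 0), "A"),
    (("district1", 0), "A"),
    (("district1", 0), "B"),
    (("district1", 1), "C"),
    (("district2", 0), "B"),
]
alternatives = ["A", "B", "C"]

counts = to_counts(records, alternatives)
for key, y in counts.items():
    print(key, y)
```

The counts within a covariate class sum to the number of individuals in that class, so no individual is lost in the aggregation.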
munfold: Metric multidimensional unfolding in R
Multi-dimensional unfolding is a procedure to recover the positions of two sets of points from a matrix of distances between the points of one set and the points of the other. For some of my research work, I implemented the algorithm for metric multi-dimensional unfolding that Peter Schönemann published in Psychometrika in 1970. This is not the most advanced algorithm; however, it is quite robust and quick when a large number of points is involved. Note that it is an algorithm for metric unfolding. That is, the input data need to be interpretable as real distances.
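To make the shape of the input concrete, the sketch below builds such a rectangular distance matrix from two sets of known positions. It only constructs the kind of data that unfolding starts from; it does not implement Schönemann's algorithm, which goes in the opposite direction and recovers positions from the distances. The function name `cross_distances` and the two point sets (labelled `voters` and `parties` purely for illustration) are assumptions of this example.

```python
import math

def cross_distances(set_a, set_b):
    """Return the matrix of Euclidean distances from each point in set_a to each point in set_b."""
    return [[math.dist(a, b) for b in set_b] for a in set_a]

# Hypothetical positions of two sets of points in a two-dimensional space
voters = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
parties = [(0.0, 0.0), (3.0, 0.0)]

d = cross_distances(voters, parties)
for row in d:
    print(row)
```

The matrix has one row per point of the first set and one column per point of the second set. Distances within a set do not appear in it, which is what distinguishes unfolding from ordinary multidimensional scaling.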
The package that contains my implementation of Schönemann’s algorithm is called munfold. It is now published on CRAN. Development occurs on GitHub, where both releases and the development tree can be found.