==========================================================
Multilevel analysis when the number of clusters is small
==========================================================

.. post:: 2019-02-01
   :category: research

.. previewimage:: multilevel-ci-coverage.png
.. bibsource:: ../publications.bib
.. bibsource:: ../multilevel.bib

A small number of top-level units in multilevel analysis is a problem that has
worried comparativists for quite some time. In a paper that is forthcoming in
the *British Journal of Political Science*
:cite:`elff.et.al:multilevel.improving`, my co-authors and I show that this
problem can be satisfactorily addressed by only moderate modifications of common
techniques of statistical inference: using *restricted* maximum likelihood (REML)
instead of maximum likelihood (ML) estimators :cite:`patterson.thompson:1971`,
and replacing the assumption of asymptotic normality for the sampling
distribution of coefficient estimates with a *t*-distribution
(:citenop:`satterthwaite.1941`; :citenop:`kenward.roger.1997`).
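
The effect of these two modifications can be illustrated with a deliberately
simplified sketch. It is not the simulation design from the paper: it only
estimates the mean of ``J`` independent cluster-level values and compares the
coverage of a normal-based interval using the ML-type variance (divisor ``J``)
with a *t*-based interval using the REML-type variance (divisor ``J - 1``). The
function name ``coverage``, the hard-coded table of *t* critical values, and
all parameter choices are assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist

# Two-sided 95% critical values of Student's t, hard-coded for a few
# degrees of freedom so that the sketch needs only the standard library.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}
Z_CRIT_95 = NormalDist().inv_cdf(0.975)  # normal critical value, about 1.96


def coverage(n_clusters, n_reps=20000, seed=1):
    """Return (normal_ml, t_reml) coverage rates for a true mean of 0."""
    rng = random.Random(seed)
    t_crit = T_CRIT_95[n_clusters - 1]
    hits_normal = 0
    hits_t = 0
    for _ in range(n_reps):
        # One value per cluster, e.g. a country-level mean (hypothetical data).
        y = [rng.gauss(0.0, 1.0) for _ in range(n_clusters)]
        est = statistics.fmean(y)
        # The ML-type variance divides by J, the REML-type variance by J - 1.
        se_ml = statistics.pstdev(y) / n_clusters ** 0.5
        se_reml = statistics.stdev(y) / n_clusters ** 0.5
        # The interval covers the true value 0 if |est| is within the margin.
        hits_normal += abs(est) <= Z_CRIT_95 * se_ml
        hits_t += abs(est) <= t_crit * se_reml
    return hits_normal / n_reps, hits_t / n_reps


if __name__ == "__main__":
    for j in (5, 10, 20, 30):
        cov_normal, cov_t = coverage(j)
        print(f"J={j:2d}  normal+ML: {cov_normal:.3f}  t+REML: {cov_t:.3f}")
```

In this toy setting the *t*-based interval should cover the true value in
close to 95 per cent of replications, while the normal-based interval falls
short, most visibly when ``J`` is small. Multilevel models with covariates
and unbalanced data require approximate degrees of freedom instead of
``J - 1``, which this sketch does not attempt.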

.. figure:: multilevel-ci-coverage.png
   :width: 100%

   Performance of Likelihood-based Confidence Intervals of Upper-Level Covariate
   Effect in Multilevel Linear and Probit Models

Scholars who work with cross-national comparative surveys, such as the
*Eurobarometer*, *European Social Survey*, or *Comparative Study of Electoral
Systems*, often worry whether multilevel analysis of these data is possible at
all. The countries covered by such surveys form the top-level units in such
analyses, but their number is usually much smaller than the number of units in a
typical survey sample, apparently too small to conduct statistical analysis with
confidence. Indeed, some of the methodological literature suggests that
inferences drawn from multilevel analysis may suffer from serious problems:
Reported standard errors of model coefficients tend to be too small, confidence
intervals too short, and statistical hypothesis tests may lead to false
discoveries. In a widely cited article that appeared in the *American Journal of
Political Science*, Daniel Stegmüller argues that while frequentist inference
suffers from such shortcomings, Bayesian inference does not, making it possible
to obtain valid results even if the number of top-level units is 10 or fewer
:cite:`stegmueller.2013`. His simulation studies suggest that estimates of
coefficients of contextual variables may even be severely biased. In our article
we demonstrate that improvements in inference can also be achieved using
non-Bayesian or "frequentist" methods. Further, we reproduce an already known proof
that coefficient estimates are unbiased in the linear-normal case if they exist
:cite:`kackar_unbiasedness_1981`. We attribute the bias in coefficient estimates
that Stegmüller finds in his simulation study to a flaw in its design.

.. rubric:: References

.. bibliography::
   :citations:
   :link-details: