Baseline-Category Logit Models for Categorical and Multinomial Responses
Description
The function mblogit fits baseline-category logit models for categorical and
multinomial count responses with fixed alternatives.
Usage
mblogit(
  formula,
  data = parent.frame(),
  random = NULL,
  subset,
  weights = NULL,
  na.action = getOption("na.action"),
  model = TRUE,
  x = FALSE,
  y = TRUE,
  contrasts = NULL,
  method = NULL,
  estimator = c("ML", "REML"),
  dispersion = FALSE,
  from.table = FALSE,
  groups = NULL,
  control = if (length(random)) mmclogit.control(...) else mclogit.control(...),
  ...
)
Arguments
formula
-
the model formula. The response must be a factor or a matrix of counts.
data
-
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which mblogit is called.
random
-
an optional formula that specifies the random-effects structure, or NULL.
subset
-
an optional vector specifying a subset of observations to be used in the fitting process.
weights
-
an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector.
na.action
-
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
model
-
a logical value indicating whether the model frame should be included as a component of the returned value.
x, y
-
logical values indicating whether the model matrix (x) and the response (y) used in the fitting process should be returned as components of the returned value.
contrasts
-
an optional list. See the contrasts.arg of model.matrix.default.
method
-
NULL or a character string, either “PQL” or “MQL”, that specifies the type of quasi-likelihood approximation to be used if a random-effects model is to be estimated.
estimator
-
a character string, either “ML” or “REML”, that specifies which estimator is to be used or approximated.
dispersion
-
a logical value or a character string; whether and how a dispersion parameter should be estimated. For details see dispersion.
from.table
-
a logical value; do the data represent a contingency table, e.g. were they created by applying as.data.frame() to the result of table() or xtabs()? This is relevant only if the response is a factor. This argument should be set to TRUE if the data do come from a contingency table. Correctly setting from.table=TRUE in this case will lead to efficiency gains in computing, but more importantly, overdispersion will be computed correctly if present.
groups
-
an optional formula that specifies groups of observations relevant for the specification of overdispersed response counts.
control
-
a list of parameters for the fitting process. See mclogit.control.
...
-
arguments to be passed to mclogit.control or mmclogit.control.
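The effect of from.table and dispersion is easiest to see on data that really do come from a contingency table. The following is a minimal sketch, assuming the packages MASS and mclogit are installed; the object names fit_plain and fit_table are made up for illustration, and method = "Afroz" is just one of the estimators that dispersion() accepts.

```r
library(MASS)     # provides the 'housing' data
library(mclogit)  # provides mblogit() and dispersion()

# 'housing' has one row per cell of a contingency table,
# with the cell counts stored in the column Freq.

# Fit without telling mblogit() that the rows are table cells ...
fit_plain <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
                     data = housing)

# ... and with from.table = TRUE, also requesting a dispersion estimate.
fit_table <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
                     data = housing, from.table = TRUE, dispersion = TRUE)

# Compare the dispersion estimates obtained from the two fits.
dispersion(fit_plain, method = "Afroz")
dispersion(fit_table, method = "Afroz")
```

If the two estimates differ noticeably, the second one is the one to trust for data of this kind, in line with the description of from.table above.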
Value
mblogit returns an object of class “mblogit”, which has almost the same structure as
an object of class “glm”. The differences are the components coefficients,
residuals, fitted.values, linear.predictors, and y, which are matrices
with the number of columns equal to the number of response categories minus one.
Details
The function mblogit internally rearranges the data into a ‘long’ format and uses
mclogit.fit to compute estimates. Nevertheless, the ‘user data’ is unaffected.
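To make the idea of a ‘long’ format concrete, here is a small base-R sketch that expands a toy data set so that each observation gets one row per response category. It only illustrates the general layout; it is not the code that mblogit runs internally, and the names d, long, alt and chosen are hypothetical.

```r
# Toy data: three observations, a three-category response and one covariate
d <- data.frame(id = 1:3,
                y  = factor(c("Low", "High", "Medium"),
                            levels = c("Low", "Medium", "High")),
                x  = c(0.5, 1.2, -0.3))

# Cross every observation with every response category
# (merge() without common columns yields the Cartesian product)
long <- merge(d[c("id", "x")], data.frame(alt = levels(d$y)))
long$alt <- factor(long$alt, levels = levels(d$y))

# Indicator: 1 for the category that was actually observed, 0 otherwise
observed <- as.character(d$y[match(long$id, d$id)])
long$chosen <- as.integer(as.character(long$alt) == observed)

# Sort so that the rows of one observation are adjacent
long <- long[order(long$id, long$alt), ]
long
```

The original data frame d is left untouched, which mirrors the remark above that the ‘user data’ is unaffected.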
See also
The function multinom in package nnet also fits multinomial baseline-category
logit models, but has a slightly less convenient output and does not support
overdispersion or random effects. However, it provides some other options.
Baseline-category logit models are also supported by the package VGAM, as well as
some reduced-rank and (semi-parametric) additive generalisations. The package mnlogit
estimates logit models in a way optimized for large numbers of alternatives.
Examples
library(MASS) # For 'housing' data
library(nnet) # For multinom()
library(memisc) # For mtable()
library(mclogit) # For mblogit()
(house.mult <- multinom(Sat ~ Infl + Type + Cont, weights = Freq,
                        data = housing))
# weights: 24 (14 variable)
initial value 1846.767257
iter 10 value 1747.045232
final value 1735.041933
converged
Call:
multinom(formula = Sat ~ Infl + Type + Cont, data = housing,
weights = Freq)
Coefficients:
       (Intercept) InflMedium  InflHigh TypeApartment TypeAtrium TypeTerrace  ContHigh
Medium  -0.4192316  0.4464003 0.6649367    -0.4356851  0.1313663  -0.6665728 0.3608513
High    -0.1387453  0.7348626 1.6126294    -0.7356261 -0.4079808  -1.4123333 0.4818236
Residual Deviance: 3470.084
AIC: 3498.084
(house.mblogit <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
data = housing))
Iteration 1 - Deviance = 3493.764
Iteration 2 - Deviance = 3470.111
Iteration 3 - Deviance = 3470.084
Iteration 4 - Deviance = 3470.084
converged
Call: mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights =
Freq)
Coefficients:
                    Predictors
Response categories (Intercept) InflMedium InflHigh TypeApartment TypeAtrium TypeTerrace
Medium/Low              -0.4192     0.4464   0.6649       -0.4357     0.1314     -0.6666
High/Low                -0.1387     0.7349   1.6126       -0.7356    -0.4080     -1.4123
                    Predictors
Response categories ContHigh
Medium/Low            0.3609
High/Low              0.4818
Null Deviance: 3694
Residual Deviance: 3470
summary(house.mult)
Call:
multinom(formula = Sat ~ Infl + Type + Cont, data = housing,
weights = Freq)
Coefficients:
       (Intercept) InflMedium  InflHigh TypeApartment TypeAtrium TypeTerrace  ContHigh
Medium  -0.4192316  0.4464003 0.6649367    -0.4356851  0.1313663  -0.6665728 0.3608513
High    -0.1387453  0.7348626 1.6126294    -0.7356261 -0.4079808  -1.4123333 0.4818236
Std. Errors:
       (Intercept) InflMedium  InflHigh TypeApartment TypeAtrium TypeTerrace  ContHigh
Medium   0.1729344  0.1415572 0.1863374     0.1725327  0.2231065   0.2062532 0.1323975
High     0.1592295  0.1369380 0.1671316     0.1552714  0.2114965   0.2001496 0.1241371
Residual Deviance: 3470.084
AIC: 3498.084
summary(house.mblogit)
Call:
mblogit(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
Equation for Medium vs Low:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.4192     0.1729  -2.424 0.015342 *
InflMedium      0.4464     0.1416   3.153 0.001613 **
InflHigh        0.6649     0.1863   3.568 0.000359 ***
TypeApartment  -0.4357     0.1725  -2.525 0.011562 *
TypeAtrium      0.1314     0.2231   0.589 0.555980
TypeTerrace    -0.6666     0.2063  -3.232 0.001230 **
ContHigh        0.3609     0.1324   2.726 0.006420 **
Equation for High vs Low:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.1387     0.1592  -0.871 0.383570
InflMedium      0.7349     0.1369   5.366 8.03e-08 ***
InflHigh        1.6126     0.1671   9.649  < 2e-16 ***
TypeApartment  -0.7356     0.1553  -4.738 2.16e-06 ***
TypeAtrium     -0.4080     0.2115  -1.929 0.053730 .
TypeTerrace    -1.4123     0.2001  -7.056 1.71e-12 ***
ContHigh        0.4818     0.1241   3.881 0.000104 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 3694
Residual Deviance: 3470
Number of Fisher Scoring iterations: 4
Number of observations: 1681
mtable(house.mblogit)
Calls:
house.mblogit: mblogit(formula = Sat ~ Infl + Type + Cont, data = housing,
weights = Freq)
=================================================
                         Medium/Low  High/Low
-------------------------------------------------
(Intercept)              -0.419*     -0.139
                         (0.173)     (0.159)
Infl: Medium/Low          0.446**     0.735***
                         (0.142)     (0.137)
Infl: High/Low            0.665***    1.613***
                         (0.186)     (0.167)
Type: Apartment/Tower    -0.436*     -0.736***
                         (0.173)     (0.155)
Type: Atrium/Tower        0.131      -0.408
                         (0.223)     (0.211)
Type: Terrace/Tower      -0.667**    -1.412***
                         (0.206)     (0.200)
Cont: High/Low            0.361**     0.482***
                         (0.132)     (0.124)
-------------------------------------------------
Log-likelihood           -1735.0
N                         1681
=================================================
Significance: *** = p < 0.001;
** = p < 0.01; * = p < 0.05
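The examples above use a factor response with frequency weights. Since the formula argument also accepts a matrix of counts as the response, the same model can be sketched with one row per covariate combination and one count column per response category. This is an illustrative sketch, assuming MASS and mclogit are installed; the names housing_wide, fit_counts and fit_factor are hypothetical, and the column names Freq.Low, Freq.Medium and Freq.High are those produced by reshape() here.

```r
library(MASS)     # provides the 'housing' data
library(mclogit)  # provides mblogit()

# Reshape to wide format: one row per combination of Infl, Type and Cont,
# with the counts for each level of Sat in separate columns.
housing_wide <- reshape(housing,
                        idvar = c("Infl", "Type", "Cont"),
                        timevar = "Sat",
                        direction = "wide")
names(housing_wide)

# Response given as a matrix of counts
fit_counts <- mblogit(cbind(Freq.Low, Freq.Medium, Freq.High) ~
                        Infl + Type + Cont,
                      data = housing_wide)

# Response given as a factor with frequency weights, as in the examples above
fit_factor <- mblogit(Sat ~ Infl + Type + Cont, weights = Freq,
                      data = housing)

# Put the coefficient estimates of both fits side by side
cbind(counts = as.vector(coef(fit_counts)),
      factor = as.vector(coef(fit_factor)))
```

Both specifications describe the same likelihood, so the two columns should agree up to the convergence tolerance of the fitting algorithm; the labels of the response categories differ because they are taken from the column names of the count matrix.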