codeplan memisc 0.99.25.4

Describe structure of Data Sets and Importers

Description

The function codeplan() creates a data frame that describes the structure of an item list (a data.set object or an importer object), so that this structure can be stored and recovered. The resulting data frame has a dedicated print method that limits the output to one line per variable.

setCodeplan() applies an item list structure (as returned by codeplan()) to a data frame or data set; methods also exist for items and atomic vectors. An assignment of the form codeplan(x) <- value can be used to a similar effect.
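
The workflow described above can be sketched as follows. This is a minimal illustration, not part of the package's own examples: the toy data, the object names (ds, cp, raw1, raw2), and the use of saveRDS()/readRDS() as the storage mechanism are assumptions made for this sketch.

```r
library(memisc)

# A small data set with one documented item (made-up values)
ds <- data.set(vote = c(1, 2, 1, 9))
ds <- within(ds, {
  description(vote) <- "Vote intention"
  labels(vote) <- c(Conservatives = 1, Labour = 2, "Answer refused" = 9)
  missing.values(vote) <- 9
})

# Extract the codeplan, store it on disk, and read it back
# (saveRDS/readRDS is one possible storage choice, assumed here)
cp <- codeplan(ds)
cp_file <- tempfile(fileext = ".rds")
saveRDS(cp, cp_file)
cp_restored <- readRDS(cp_file)

# Apply the recovered codeplan to plain data frames,
# once with the function form and once with the assignment form
raw1 <- data.frame(vote = c(2, 1, 9, 2))
raw2 <- raw1
raw1 <- setCodeplan(raw1, cp_restored)
codeplan(raw2) <- cp_restored

# Both objects should now carry the same variable description
description(raw1$vote)
description(raw2$vote)
```

Storing the codeplan separately from the data makes it possible to document a freshly imported, undocumented copy of the same variables without repeating the labelling code.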

Usage

codeplan(x)
## S4 method for signature 'item.list'
codeplan(x)
## S4 method for signature 'item'
codeplan(x)
setCodeplan(x,value)
## S4 method for signature 'data.frame,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.set,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.set,NULL'
setCodeplan(x,value)
## S4 method for signature 'item,codeplan'
setCodeplan(x,value)
## S4 method for signature 'item,NULL'
setCodeplan(x,value)
## S4 method for signature 'atomic,codeplan'
setCodeplan(x,value)
## S4 method for signature 'atomic,NULL'
setCodeplan(x,value)
codeplan(x) <- value

Arguments

x

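For codeplan(x): an object that inherits from class "item.list" (that is, a "data.set" object or an "importer" object) or an object that inherits from class "item". For setCodeplan(x,value) and codeplan(x) <- value: a data frame, a "data.set" object, an "item" object, or an atomic vector.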

value

An object as returned by codeplan(x), or NULL.

Value

If applicable, codeplan() returns a data frame with the additional S3 class attribute "codeplan"; for arguments for which the relevant information does not exist, it returns NULL. The returned data frame has the following variables:

name

The name of the item/variable in the item list or data set.

description

The description/variable label string of the item/variable.

annotation

Code to recreate the annotation attribute.

labels

Code to recreate the value labels.

value.filter

Code to recreate the value filter attribute (a declaration of missing values, a range of valid values, or an enumeration of valid values).

mode

A character string that describes the storage mode, such as "character", "integer", or "numeric".

measurement

A character string with the measurement level: "nominal", "ordinal", "interval", or "ratio".
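
As a minimal sketch of how the returned object can be inspected, the following creates a one-variable data set and looks at the resulting codeplan with ordinary data frame tools. The toy data and the object names (ds, cp) are made up for illustration; the column names are those listed above.

```r
library(memisc)

# One documented item with made-up values
ds <- data.set(region = c(1, 2, 3, 99))
ds <- within(ds, {
  description(region) <- "Region of residence"
  labels(region) <- c(England = 1, Scotland = 2, Wales = 3,
                      "Not asked in survey" = 99)
  missing.values(region) <- 99
  measurement(region) <- "nominal"
})

cp <- codeplan(ds)

class(cp)        # should include "codeplan" in addition to "data.frame"
names(cp)        # the variables described above
cp$description   # one entry per variable in the data set
cp$measurement
```

Because a codeplan is a data frame with one row per variable, it can be subset or compared with standard tools before it is applied elsewhere.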

Examples

library(memisc)

Data1 <- data.set(
         vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
         region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
         income = exp(rnorm(300,sd=.7))*2000
         )
Data1 <- within(Data1,{
 description(vote) <- "Vote intention"
 description(region) <- "Region of residence"
 description(income) <- "Household income"
 foreach(x=c(vote,region),{
   measurement(x) <- "nominal"
   })
 measurement(income) <- "ratio"
 labels(vote) <- c(
                   Conservatives         =  1,
                   Labour                =  2,
                   "Liberal Democrats"   =  3,
                   "Don't know"          =  8,
                   "Answer refused"      =  9,
                   "Not applicable"      = 97,
                   "Not asked in survey" = 99)
 labels(region) <- c(
                   England               =  1,
                   Scotland              =  2,
                   Wales                 =  3,
                   "Not applicable"      = 97,
                   "Not asked in survey" = 99)
 foreach(x=c(vote,region,income),{
   annotation(x)["Remark"] <- "This is not a real survey item, of course ..."
   })
 missing.values(vote) <- c(8,9,97,99)
 missing.values(region) <- c(97,99)
})
cpData1 <- codeplan(Data1)
Data2 <- data.frame(
         vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
         region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
         income = exp(rnorm(300,sd=.7))*2000
         )
codeplan(Data2) <- cpData1
codebook(Data2)
#> ====================================================================================================
#>
#>    vote 'Vote intention'
#>
#> ----------------------------------------------------------------------------------------------------
#>
#>    Storage mode: double
#>    Measurement: nominal
#>    Missing values: 8, 9, 97, 99
#>
#>    Values and labels              N Valid Total
#>
#>     1   'Conservatives'          38  31.7  12.7
#>     2   'Labour'                 42  35.0  14.0
#>     3   'Liberal Democrats'      40  33.3  13.3
#>     8 M 'Don't know'             43        14.3
#>     9 M 'Answer refused'         50        16.7
#>    97 M 'Not applicable'         38        12.7
#>    99 M 'Not asked in survey'    49        16.3
#>
#>    Remark:
#>        This is not a real survey item, of course ...
#>
#> ====================================================================================================
#>
#>    region 'Region of residence'
#>
#> ----------------------------------------------------------------------------------------------------
#>
#>    Storage mode: double
#>    Measurement: nominal
#>    Missing values: 97, 99
#>
#>    Values and labels              N Valid Total
#>
#>     1   'England'               123  45.7  41.0
#>     2   'Scotland'               98  36.4  32.7
#>     3   'Wales'                  48  17.8  16.0
#>    99 M 'Not asked in survey'    31        10.3
#>
#>    Remark:
#>        This is not a real survey item, of course ...
#>
#> ====================================================================================================
#>
#>    income 'Household income'
#>
#> ----------------------------------------------------------------------------------------------------
#>
#>    Storage mode: double
#>    Measurement: ratio
#>
#>         Min:   173.589
#>         Max: 12285.794
#>        Mean:  2476.932
#>    Std.Dev.:  1847.742
#>
#>    Remark:
#>        This is not a real survey item, of course ...
# Note the difference between converting with 'as.data.frame'
# and setting the codeplan to NULL:
Data2df <- as.data.frame(Data2)
codeplan(Data2) <- NULL
str(Data2)
#> 'data.frame':    300 obs. of  3 variables:
#>  $ vote  : num  2 97 99 3 99 1 97 2 97 9 ...
#>  $ region: num  1 99 2 1 1 2 3 2 2 1 ...
#>  $ income: num  1030 881 2079 1131 748 ...
str(Data2df)
#> 'data.frame':    300 obs. of  3 variables:
#>  $ vote  : Factor w/ 3 levels "Conservatives",..: 2 NA NA 3 NA 1 NA 2 NA NA ...
#>   ..- attr(*, "label")= chr "Vote intention"
#>  $ region: Factor w/ 3 levels "England","Scotland",..: 1 NA 2 1 1 2 3 2 2 1 ...
#>   ..- attr(*, "label")= chr "Region of residence"
#>  $ income: num  1030 881 2079 1131 748 ...
# Codeplans of individual survey items can also be queried and manipulated:
vote <- Data1$vote
str(vote)
#> Nmnl. item w/ 7 labels for 1,2,3,... + ms.v.  num [1:300] 1 3 8 99 8 3 9 9 99
#>   99 ...
cp.vote <- codeplan(vote)
codeplan(vote) <- NULL
str(vote)
#> num [1:300] 1 3 8 99 8 3 9 9 99 99 ...
codeplan(vote) <- cp.vote
vote
#> Item 'Vote intention' (measurement: nominal, type: double, length = 300)
#>
#> [1:300] Conservatives Liberal Democrats *Don't know *Not asked in survey *Don't
#>   know ...