===========================================
Recode Items, Factors and Numeric Vectors
===========================================

.. r-package:: memisc
.. r-pkgversion:: 0.99.22
.. r-name:: recode recode recode,vector-method recode,item-method recode,factor-method

Description
===========

``recode`` substitutes old values of a factor or a numeric vector by new ones,
just like the recoding facilities in some commercial statistical packages.

Usage
=====

.. code-block:: r

   recode(x,...,
          copy=getOption("recode_copy",identical(otherwise,"copy")),
          otherwise=NA)

   ## S4 method for signature 'vector'
   recode(x,...,
          copy=getOption("recode_copy",identical(otherwise,"copy")),
          otherwise=NA)

   ## S4 method for signature 'factor'
   recode(x,...,
          copy=getOption("recode_copy",identical(otherwise,"copy")),
          otherwise=NA)

   ## S4 method for signature 'item'
   recode(x,...,
          copy=getOption("recode_copy",identical(otherwise,"copy")),
          otherwise=NA)

Arguments
=========

``x``
    An object

``...``
    One or more assignment expressions, each of the form
    ``new.value <- old.values``. ``new.value`` should be a scalar numeric
    value or character string. If any ``new.value`` is a character string,
    the return value of ``recode`` will be a factor and each ``new.value``
    will be coerced to a character string that labels a level of the factor.

    Each ``old.value`` in an assignment expression may be a (numeric or
    character) vector. If ``x`` is numeric, such an assignment expression may
    have the form ``new.value <- range(lower,upper)``. In that case, values
    between ``lower`` and ``upper`` are replaced by ``new.value``. If one of
    the arguments to ``range`` is ``min``, it is substituted by the minimum of
    ``x``. If one of the arguments to ``range`` is ``max``, it is substituted
    by the maximum of ``x``.

    In case of the method for ``labelled`` vectors, the *tags* of arguments of
    the form ``tag = new.value <- old.values`` will define the labels of the
    new codes.

    If the ``old.values`` of different assignment expressions overlap, an
    error is raised because the recoding is ambiguous.

``copy``
    logical; should those values of ``x`` that are not given an explicit new
    code be copied into the resulting vector?

``otherwise``
    a character string or some other value that the result may take. If equal
    to ``NA`` or ``"NA"``, original codes not given an explicit new code are
    recoded into ``NA``. If equal to ``"copy"``, original codes not given an
    explicit new code are copied.

Value
=====

A numerical vector, factor or an ``item`` object.

Details
=======

``recode`` relies on the lazy evaluation mechanism of *R*: arguments are not
evaluated until required by the function they are given to. ``recode`` does
not cause arguments that appear in ``...`` to be evaluated. Instead,
``recode`` parses the ``...`` arguments. Therefore, although an expression
like ``1 <- 1:4`` would raise an error if evaluated anywhere else in *R*, it
does not raise an error when given to ``recode`` as an argument. However, a
call of the form ``recode(x,1=1:4)`` would be a syntax error.

If John Fox's package "car" is installed, ``recode`` will also be callable
with the syntax of the ``recode`` function of that package.

See also
========

``recode`` of package "car".

Examples
========

.. code-block:: r

   x <- as.item(sample(1:6,20,replace=TRUE),
                labels=c( a=1, b=2, c=3, d=4, e=5, f=6))
   print(x)

::

     [1] b e c d a f a b b b b d e e b b b e f d

.. code-block:: r

   # A recoded version of x is returned
   # containing the values 1, 2, 3, which are
   # labelled as "A", "B", "C".
   recode(x,
          A = 1 <- range(min,2),
          B = 2 <- 3:4,
          C = 3 <- range(5,max), # this last comma is ignored
          )

::

    Item (measurement: nominal, type: integer, length = 20)

     [1:20] A C B B A C A A A A A B C C A A A C C B

.. code-block:: r

   # This causes an error action: the sets
   # of original values overlap.
   try(recode(x,
              A = 1 <- range(min,2),
              B = 2 <- 2:4,
              C = 3 <- range(5,max)
              ))

::

    Error in recode(x, A = 1 <- range(min, 2), B = 2 <- 2:4, C = 3 <- range(5,  :
      recoding request is ambiguous

.. code-block:: r

   recode(x,
          A = 1 <- range(min,2),
          B = 2 <- 3:4,
          C = 3 <- range(5,6),
          D = 4 <- 7
          )

::

    Warning in recode(x, A = 1 <- range(min, 2), B = 2 <- 3:4, C = 3 <- range(5,  :
      recoding 4 <- 7 has no consequences

    Item (measurement: nominal, type: integer, length = 20)

     [1:20] A C B B A C A A A A A B C C A A A C C B

.. code-block:: r

   # This results in an all-missing vector:
   recode(x,
          D = 4 <- 7,
          E = 5 <- 8
          )

::

    Warning in recode(x, D = 4 <- 7, E = 5 <- 8) :
      recodings 4 <- 7, 5 <- 8 have no consequences

    Item (measurement: nominal, type: integer, length = 20)

     [1:20] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

.. code-block:: r

   f <- as.factor(x)
   x <- as.integer(x)

   recode(x,
          1 <- range(min,2),
          2 <- 3:4,
          3 <- range(5,max)
          )

::

     [1] 1 3 2 2 1 3 1 1 1 1 1 2 3 3 1 1 1 3 3 2

.. code-block:: r

   # This causes another error action:
   # the third argument is an invalid
   # expression for a recoding.
   try(recode(x,
              1 <- range(min,2),
              3:4,
              3 <- range(5,max)
              ))

::

    Error in recode(x, 1 <- range(min, 2), 3:4, 3 <- range(5, max)) :
      invalid recoding request

.. code-block:: r

   # The new values are character strings,
   # therefore a factor is returned.
   recode(x,
          "a" <- range(min,2),
          "b" <- 3:4,
          "c" <- range(5,6)
          )

::

     [1] a c b b a c a a a a a b c c a a a c c b
    Levels: a b c

.. code-block:: r

   recode(x,
          1 <- 1:3,
          2 <- 4:6
          )

::

     [1] 1 2 1 2 1 2 1 1 1 1 1 2 2 2 1 1 1 2 2 2

.. code-block:: r

   recode(x,
          4 <- 7,
          5 <- 8,
          otherwise = "copy"
          )

::

    Warning in recode(x, 4 <- 7, 5 <- 8, otherwise = "copy") :
      recodings 4 <- 7, 5 <- 8 have no consequences

     [1] 2 5 3 4 1 6 1 2 2 2 2 4 5 5 2 2 2 5 6 4

.. code-block:: r

   recode(f,
          "A" <- c("a","b"),
          "B" <- c("c","d"),
          otherwise="copy"
          )

::

     [1] A e B B A f A A A A A B e e A A A e f B
    Levels: A B e f

.. code-block:: r

   recode(f,
          "A" <- c("a","b"),
          "B" <- c("c","d"),
          otherwise="C"
          )

::

     [1] A C B B A C A A A A A B C C A A A C C B
    Levels: A B C

.. code-block:: r

   recode(f,
          "A" <- c("a","b"),
          "B" <- c("c","d")
          )

::

     [1] A    <NA> B    B    A    <NA> A    A    A    A    A    B    <NA> <NA> A    A    A    <NA> <NA>
    [20] B
    Levels: A B

.. code-block:: r

   DS <- data.set(x=as.item(sample(1:6,20,replace=TRUE),
                            labels=c( a=1, b=2, c=3, d=4, e=5, f=6)))
   print(DS)

::

        x
     1  b
     2  c
     3  b
     4  d
     5  d
     6  c
     7  f
     8  c
     9  d
    10  a
    11  f
    12  e
    13  d
    14  f
    15  b
    16  a
    17  f
    18  d
    19  e
    20  f

.. code-block:: r

   DS <- within(DS,{
       xf <- recode(x,
                    "a" <- range(min,2),
                    "b" <- 3:4,
                    "c" <- range(5,6)
                    )
       xn <- x@.Data
       xc <- recode(xn,
                    "a" <- range(min,2),
                    "b" <- 3:4,
                    "c" <- range(5,6)
                    )
       xc <- as.character(x)
       xcc <- recode(xc,
                     1 <- letters[1:2],
                     2 <- letters[3:4],
                     3 <- letters[5:6]
                     )
   })
   DS

::

    Data set with 20 observations and 5 variables

        x xf xn xc xcc
     1  b  a  2  b   1
     2  c  b  3  c   2
     3  b  a  2  b   1
     4  d  b  4  d   2
     5  d  b  4  d   2
     6  c  b  3  c   2
     7  f  c  6  f   3
     8  c  b  3  c   2
     9  d  b  4  d   2
    10  a  a  1  a   1
    11  f  c  6  f   3
    12  e  c  5  e   3
    13  d  b  4  d   2
    14  f  c  6  f   3
    15  b  a  2  b   1
    16  a  a  1  a   1
    17  f  c  6  f   3
    18  d  b  4  d   2
    19  e  c  5  e   3
    20  f  c  6  f   3

.. code-block:: r

   DS <- within(DS,{
       xf <- recode(x,
                    "a" <- range(min,2),
                    "b" <- 3:4,
                    "c" <- range(5,6)
                    )
       x1 <- recode(x,
                    1 <- range(1,2),
                    2 <- range(3,4),
                    copy=TRUE
                    )
       xf1 <- recode(x,
                     "A" <- range(1,2),
                     "B" <- range(3,4),
                     copy=TRUE
                     )
   })
   DS

::

    Data set with 20 observations and 7 variables

        x xf xn xc xcc x1 xf1
     1  b  a  2  b   1  1   A
     2  c  b  3  c   2  2   B
     3  b  a  2  b   1  1   A
     4  d  b  4  d   2  2   B
     5  d  b  4  d   2  2   B
     6  c  b  3  c   2  2   B
     7  f  c  6  f   3  6   6
     8  c  b  3  c   2  2   B
     9  d  b  4  d   2  2   B
    10  a  a  1  a   1  1   A
    11  f  c  6  f   3  6   6
    12  e  c  5  e   3  5   5
    13  d  b  4  d   2  2   B
    14  f  c  6  f   3  6   6
    15  b  a  2  b   1  1   A
    16  a  a  1  a   1  1   A
    17  f  c  6  f   3  6   6
    18  d  b  4  d   2  2   B
    19  e  c  5  e   3  5   5
    20  f  c  6  f   3  6   6

.. code-block:: r

   codebook(DS)

::

    ====================================================================================================

       x

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: nominal

       Values and labels    N Percent

       1 'a'                2    10.0
       2 'b'                3    15.0
       3 'c'                3    15.0
       4 'd'                5    25.0
       5 'e'                2    10.0
       6 'f'                5    25.0

    ====================================================================================================

       xf

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: nominal

       Values and labels    N Percent

       1 'a'                5    25.0
       2 'b'                8    40.0
       3 'c'                7    35.0

    ====================================================================================================

       xn

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: interval

            Min:  1.000
            Max:  6.000
           Mean:  3.850
       Std.Dev.:  1.652
       Skewness: -0.160
       Kurtosis: -1.125

    ====================================================================================================

       xc

    ----------------------------------------------------------------------------------------------------

       Storage mode: character
       Measurement: nominal

       Min: a
       Max: f

    ====================================================================================================

       xcc

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: nominal

       Values and labels    N Percent

       1 '1'                5    25.0
       2 '2'                8    40.0
       3 '3'                7    35.0

    ====================================================================================================

       x1

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: nominal

       Values               N Percent

       (unlab.val.)        20   100.0

    ====================================================================================================

       xf1

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Measurement: nominal

       Values and labels    N Percent

       1 'A'                5    25.0
       2 'B'                8    40.0
       (unlab.val.)         7    35.0

.. code-block:: r

   DF <- data.frame(x=rep(1:6,4,replace=TRUE))
   DF <- within(DF,{
       xf <- recode(x,
                    "a" <- range(min,2),
                    "b" <- 3:4,
                    "c" <- range(5,6)
                    )
       x1 <- recode(x,
                    1 <- range(1,2),
                    2 <- range(3,4),
                    copy=TRUE
                    )
       xf1 <- recode(x,
                     "A" <- range(1,2),
                     "B" <- range(3,4),
                     copy=TRUE
                     )
       xf2 <- recode(x,
                     "B" <- range(3,4),
                     "A" <- range(1,2),
                     copy=TRUE
                     )
   })
   DF

::

       x xf2 xf1 x1 xf
    1  1   A   A  1  a
    2  2   A   A  1  a
    3  3   B   B  2  b
    4  4   B   B  2  b
    5  5   5   5  5  c
    6  6   6   6  6  c
    7  1   A   A  1  a
    8  2   A   A  1  a
    9  3   B   B  2  b
    10 4   B   B  2  b
    11 5   5   5  5  c
    12 6   6   6  6  c
    13 1   A   A  1  a
    14 2   A   A  1  a
    15 3   B   B  2  b
    16 4   B   B  2  b
    17 5   5   5  5  c
    18 6   6   6  6  c
    19 1   A   A  1  a
    20 2   A   A  1  a
    21 3   B   B  2  b
    22 4   B   B  2  b
    23 5   5   5  5  c
    24 6   6   6  6  c

.. code-block:: r

   codebook(DF)

::

    ====================================================================================================

       x

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer

          Min.: 1.000
       1st Qu.: 2.000
        Median: 3.500
          Mean: 3.500
       3rd Qu.: 5.000
          Max.: 6.000

    ====================================================================================================

       xf2

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Factor with 4 levels

       Levels and labels    N Percent

       1 'B'                8    33.3
       2 'A'                8    33.3
       3 '5'                4    16.7
       4 '6'                4    16.7

    ====================================================================================================

       xf1

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Factor with 4 levels

       Levels and labels    N Percent

       1 'A'                8    33.3
       2 'B'                8    33.3
       3 '5'                4    16.7
       4 '6'                4    16.7

    ====================================================================================================

       x1

    ----------------------------------------------------------------------------------------------------

       Storage mode: double

          Min.: 1.000
       1st Qu.: 1.000
        Median: 2.000
          Mean: 2.833
       3rd Qu.: 5.000
          Max.: 6.000

    ====================================================================================================

       xf

    ----------------------------------------------------------------------------------------------------

       Storage mode: integer
       Factor with 3 levels

       Levels and labels    N Percent

       1 'a'                8    33.3
       2 'b'                8    33.3
       3 'c'                8    33.3
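
The Details section states that ``recode`` leaves its ``...`` arguments unevaluated and parses them instead. The following base-*R* sketch illustrates that general technique only; it is not the memisc implementation, and ``parse_requests`` is a hypothetical helper introduced here for illustration. It assumes that each request is a call to ``<-`` whose two operands can be evaluated separately.

```r
# Hypothetical helper (not part of memisc): capture the `...` arguments
# as unevaluated expressions and split each `new <- old` request.
parse_requests <- function(...) {
  env <- parent.frame()
  # substitute() returns the call list(...) without evaluating the arguments;
  # dropping the first element removes the function name `list`.
  requests <- as.list(substitute(list(...)))[-1]
  lapply(requests, function(req) {
    # Anything that is not a call to `<-` is rejected.
    if (!is.call(req) || !identical(req[[1]], as.name("<-")))
      stop("invalid recoding request")
    # Only the two operands are evaluated, never the assignment itself.
    list(new = eval(req[[2]], env), old = eval(req[[3]], env))
  })
}

# `1 <- 1:4` would fail if evaluated, but here it is only inspected.
res <- parse_requests(1 <- 1:4, "b" <- c(5, 6))
str(res)
```

Because the assignment call itself is never evaluated, an expression such as ``1 <- 1:4`` is accepted here, whereas an argument like ``3:4`` is rejected as an invalid request, much as in the error example above.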