Recode Items, Factors and Numeric Vectors
Description
recode substitutes old values of a factor or a numeric vector by new ones, just like the recoding facilities in some commercial statistical packages.
Usage
recode(x,...,
copy=getOption("recode_copy",identical(otherwise,"copy")),
otherwise=NA)
## S4 method for signature 'vector'
recode(x,...,
copy=getOption("recode_copy",identical(otherwise,"copy")),
otherwise=NA)
## S4 method for signature 'factor'
recode(x,...,
copy=getOption("recode_copy",identical(otherwise,"copy")),
otherwise=NA)
## S4 method for signature 'item'
recode(x,...,
copy=getOption("recode_copy",identical(otherwise,"copy")),
otherwise=NA)
Arguments
x
-
An object
...
-
One or more assignment expressions, each of the form new.value <- old.values. new.value should be a scalar numeric value or character string. If one of the new.values is a character string, the return value of recode will be a factor and each new.value will be coerced to a character string that labels a level of the factor. Each old.value in an assignment expression may be a (numeric or character) vector. If x is numeric, such an assignment expression may have the form new.value <- range(lower,upper). In that case, values between lower and upper are replaced by new.value. If one of the arguments to range is min, it is substituted by the minimum of x. If one of the arguments to range is max, it is substituted by the maximum of x. In case of the method for labelled vectors, the tags of arguments of the form tag = new.value <- old.values will define the labels of the new codes. If the old.values of different assignment expressions overlap, an error will be raised because the recoding is ambiguous.
copy
-
logical; should those values of x not given an explicit new code be copied into the resulting vector?
otherwise
-
a character string or some other value that the result may obtain. If equal to NA or "NA", original codes not given an explicit new code are recoded into NA. If equal to "copy", original codes not given an explicit new code are copied.
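The rules described above (explicit old values, inclusive ranges, the ambiguity check and the copy/otherwise fallback) can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by recode: the helper recode_sketch and its rules list format are hypothetical, range endpoints are assumed to be inclusive as the examples suggest, and only overlaps that actually occur in x are detected.

```r
# Hypothetical helper, not part of memisc: recode a numeric vector from a
# list of rules. Each rule holds a scalar `new` plus either a vector `old`
# or an interval given by `lower` and `upper`.
recode_sketch <- function(x, rules, otherwise = NA,
                          copy = identical(otherwise, "copy")) {
  # One logical column per rule: TRUE where that rule applies to x
  hits <- vapply(rules, function(r) {
    if (!is.null(r$old)) {
      x %in% r$old
    } else {
      # Assumption: both endpoints are inclusive, as the examples suggest
      !is.na(x) & x >= r$lower & x <= r$upper
    }
  }, logical(length(x)))
  hits <- matrix(hits, nrow = length(x))  # keep a matrix even if length(x) is 1

  # Simplification: only overlaps that actually occur in x are detected
  if (any(rowSums(hits) > 1)) stop("recoding request is ambiguous")

  # Values without an explicit new code are copied or set to `otherwise`
  result <- if (copy) as.numeric(x) else rep(as.numeric(otherwise), length(x))
  for (j in seq_along(rules)) result[hits[, j]] <- rules[[j]]$new
  result
}

x <- c(1, 6, 5, 2, 3, 4)
rules <- list(
  list(new = 1, lower = min(x), upper = 2),  # like 1 <- range(min,2)
  list(new = 2, old = 3:4),                  # like 2 <- 3:4
  list(new = 3, lower = 5, upper = max(x))   # like 3 <- range(5,max)
)
recoded <- recode_sketch(x, rules)
print(recoded)
```

As in the Usage section, the default of copy is derived from otherwise, so passing otherwise = "copy" leaves unmatched values untouched.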
Value
A numerical vector, factor or an item object.
Details
recode relies on the lazy evaluation mechanism of R: arguments are not evaluated until required by the function they are given to. recode does not cause arguments that appear in ... to be evaluated. Instead, recode parses the ... arguments. Therefore, although expressions like 1 <- 1:4 would cause an error if evaluated anywhere else in R, they do not cause an error when given to recode as an argument. However, a call of the form recode(x,1=1:4) would be a syntax error.
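How a function can accept an expression such as 1 <- 1:4 without evaluating it can be sketched in base R with substitute. The helper parse_requests below is hypothetical and is not how recode is actually implemented. It only shows the general idea: the ... arguments are captured as unevaluated calls, each call is checked to be an assignment, and only its right-hand side is evaluated.

```r
# Hypothetical helper, not part of memisc: capture the `...` arguments
# without evaluating them and split each `new <- old` request in two.
parse_requests <- function(...) {
  env <- parent.frame()                        # where the right-hand sides live
  exprs <- as.list(substitute(list(...)))[-1]  # drop the leading `list` symbol
  lapply(exprs, function(e) {
    # A valid request is an unevaluated call to `<-`
    if (!is.call(e) || !identical(e[[1]], as.name("<-")))
      stop("invalid recoding request")
    # Only the right-hand side is evaluated; the left-hand side stays as written
    list(new = e[[2]], old = eval(e[[3]], env))
  })
}

requests <- parse_requests(A = 1 <- 1:4, "b" <- c(5, 6))
str(requests)
```

A tag such as A = survives as the name of the corresponding list element, which is one way the labels of new codes could be picked up. An argument that is not an assignment, such as a bare 3:4, is rejected by the check.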
If John Fox's package “car” is installed, recode will also be callable with the syntax of the recode function of that package.
See also
recode of package “car”.
Examples
x <- as.item(sample(1:6,20,replace=TRUE),
labels=c( a=1,
b=2,
c=3,
d=4,
e=5,
f=6))
print(x)
[1] a f e b c d a b b f a e b d c c d f c a
# A recoded version of x is returned
# containing the values 1, 2, 3, which are
# labelled as "A", "B", "C".
recode(x,
A = 1 <- range(min,2),
B = 2 <- 3:4,
C = 3 <- range(5,max), # this last comma is ignored
)
Item (measurement: nominal, type: integer, length = 20)
[1:20] A C C A B B A A A C A C A B B B B C B A
# This causes an error action: the sets
# of original values overlap.
try(recode(x,
A = 1 <- range(min,2),
B = 2 <- 2:4,
C = 3 <- range(5,max)
))
Error in recode(x, A = 1 <- range(min, 2), B = 2 <- 2:4, C = 3 <- range(5, :
recoding request is ambiguous
recode(x,
A = 1 <- range(min,2),
B = 2 <- 3:4,
C = 3 <- range(5,6),
D = 4 <- 7
)
Warning in recode(x, A = 1 <- range(min, 2), B = 2 <- 3:4, C = 3 <- range(5, :
recoding 4 <- 7 has no consequences
Item (measurement: nominal, type: integer, length = 20)
[1:20] A C C A B B A A A C A C A B B B B C B A
# This results in an all-missing vector:
recode(x,
D = 4 <- 7,
E = 5 <- 8
)
Warning in recode(x, D = 4 <- 7, E = 5 <- 8) :
recodings 4 <- 7, 5 <- 8 have no consequences
Item (measurement: nominal, type: integer, length = 20)
[1:20] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
f <- as.factor(x)
x <- as.integer(x)
recode(x,
1 <- range(min,2),
2 <- 3:4,
3 <- range(5,max)
)
[1] 1 3 3 1 2 2 1 1 1 3 1 3 1 2 2 2 2 3 2 1
# This causes another error action:
# the third argument is an invalid
# expression for a recoding.
try(recode(x,
1 <- range(min,2),
3:4,
3 <- range(5,max)
))
Error in recode(x, 1 <- range(min, 2), 3:4, 3 <- range(5, max)) :
invalid recoding request
# The new values are character strings,
# therefore a factor is returned.
recode(x,
"a" <- range(min,2),
"b" <- 3:4,
"c" <- range(5,6)
)
[1] a c c a b b a a a c a c a b b b b c b a
Levels: a b c
recode(x,
1 <- 1:3,
2 <- 4:6
)
[1] 1 2 2 1 1 2 1 1 1 2 1 2 1 2 1 1 2 2 1 1
recode(x,
4 <- 7,
5 <- 8,
otherwise = "copy"
)
Warning in recode(x, 4 <- 7, 5 <- 8, otherwise = "copy") :
recodings 4 <- 7, 5 <- 8 have no consequences
[1] 1 6 5 2 3 4 1 2 2 6 1 5 2 4 3 3 4 6 3 1
recode(f,
"A" <- c("a","b"),
"B" <- c("c","d"),
otherwise="copy"
)
[1] A f e A B B A A A f A e A B B B B f B A
Levels: A B e f
recode(f,
"A" <- c("a","b"),
"B" <- c("c","d"),
otherwise="C"
)
[1] A C C A B B A A A C A C A B B B B C B A
Levels: A B C
recode(f,
"A" <- c("a","b"),
"B" <- c("c","d")
)
[1] A <NA> <NA> A B B A A A <NA> A <NA> A B B B B <NA> B
[20] A
Levels: A B
DS <- data.set(x=as.item(sample(1:6,20,replace=TRUE),
labels=c( a=1,
b=2,
c=3,
d=4,
e=5,
f=6)))
print(DS)
x
1 a
2 d
3 a
4 e
5 f
6 c
7 d
8 a
9 e
10 a
11 d
12 f
13 c
14 d
15 c
16 d
17 c
18 c
19 c
20 b
DS <- within(DS,{
xf <- recode(x,
"a" <- range(min,2),
"b" <- 3:4,
"c" <- range(5,6)
)
xn <- x@.Data
xc <- recode(xn,
"a" <- range(min,2),
"b" <- 3:4,
"c" <- range(5,6)
)
xc <- as.character(x)
xcc <- recode(xc,
1 <- letters[1:2],
2 <- letters[3:4],
3 <- letters[5:6]
)
})
DS
Data set with 20 observations and 5 variables
x xf xn xc xcc
1 a a 1 a 1
2 d b 4 d 2
3 a a 1 a 1
4 e c 5 e 3
5 f c 6 f 3
6 c b 3 c 2
7 d b 4 d 2
8 a a 1 a 1
9 e c 5 e 3
10 a a 1 a 1
11 d b 4 d 2
12 f c 6 f 3
13 c b 3 c 2
14 d b 4 d 2
15 c b 3 c 2
16 d b 4 d 2
17 c b 3 c 2
18 c b 3 c 2
19 c b 3 c 2
20 b a 2 b 1
DS <- within(DS,{
xf <- recode(x,
"a" <- range(min,2),
"b" <- 3:4,
"c" <- range(5,6)
)
x1 <- recode(x,
1 <- range(1,2),
2 <- range(3,4),
copy=TRUE
)
xf1 <- recode(x,
"A" <- range(1,2),
"B" <- range(3,4),
copy=TRUE
)
})
DS
Data set with 20 observations and 7 variables
x xf xn xc xcc x1 xf1
1 a a 1 a 1 1 A
2 d b 4 d 2 2 B
3 a a 1 a 1 1 A
4 e c 5 e 3 5 5
5 f c 6 f 3 6 6
6 c b 3 c 2 2 B
7 d b 4 d 2 2 B
8 a a 1 a 1 1 A
9 e c 5 e 3 5 5
10 a a 1 a 1 1 A
11 d b 4 d 2 2 B
12 f c 6 f 3 6 6
13 c b 3 c 2 2 B
14 d b 4 d 2 2 B
15 c b 3 c 2 2 B
16 d b 4 d 2 2 B
17 c b 3 c 2 2 B
18 c b 3 c 2 2 B
19 c b 3 c 2 2 B
20 b a 2 b 1 1 A
codebook(DS)
====================================================================================================
x
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: nominal
Values and labels N Percent
1 'a' 4 20.0
2 'b' 1 5.0
3 'c' 6 30.0
4 'd' 5 25.0
5 'e' 2 10.0
6 'f' 2 10.0
====================================================================================================
xf
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: nominal
Values and labels N Percent
1 'a' 5 25.0
2 'b' 11 55.0
3 'c' 4 20.0
====================================================================================================
xn
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: interval
Min: 1.000
Max: 6.000
Mean: 3.300
Std.Dev.: 1.520
====================================================================================================
xc
----------------------------------------------------------------------------------------------------
Storage mode: character
Measurement: nominal
Min: "a"
Max: "f"
====================================================================================================
xcc
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: nominal
Values and labels N Percent
1 '1' 5 25.0
2 '2' 11 55.0
3 '3' 4 20.0
====================================================================================================
x1
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: nominal
Values N Percent
(unlab.val.) 20 100.0
====================================================================================================
xf1
----------------------------------------------------------------------------------------------------
Storage mode: integer
Measurement: nominal
Values and labels N Percent
1 'A' 5 25.0
2 'B' 11 55.0
(unlab.val.) 4 20.0
DF <- data.frame(x=rep(1:6,4))
DF <- within(DF,{
xf <- recode(x,
"a" <- range(min,2),
"b" <- 3:4,
"c" <- range(5,6)
)
x1 <- recode(x,
1 <- range(1,2),
2 <- range(3,4),
copy=TRUE
)
xf1 <- recode(x,
"A" <- range(1,2),
"B" <- range(3,4),
copy=TRUE
)
xf2 <- recode(x,
"B" <- range(3,4),
"A" <- range(1,2),
copy=TRUE
)
})
DF
x xf2 xf1 x1 xf
1 1 A A 1 a
2 2 A A 1 a
3 3 B B 2 b
4 4 B B 2 b
5 5 5 5 5 c
6 6 6 6 6 c
7 1 A A 1 a
8 2 A A 1 a
9 3 B B 2 b
10 4 B B 2 b
11 5 5 5 5 c
12 6 6 6 6 c
13 1 A A 1 a
14 2 A A 1 a
15 3 B B 2 b
16 4 B B 2 b
17 5 5 5 5 c
18 6 6 6 6 c
19 1 A A 1 a
20 2 A A 1 a
21 3 B B 2 b
22 4 B B 2 b
23 5 5 5 5 c
24 6 6 6 6 c
codebook(DF)
====================================================================================================
x
----------------------------------------------------------------------------------------------------
Storage mode: integer
Min: 1.000
Max: 6.000
Mean: 3.500
Std.Dev.: 1.708
Skewness: 0.000
Kurtosis: -1.269
====================================================================================================
xf2
----------------------------------------------------------------------------------------------------
Storage mode: integer
Factor with 4 levels
Levels and labels N Valid
1 'B' 8 33.3
2 'A' 8 33.3
3 '5' 4 16.7
4 '6' 4 16.7
====================================================================================================
xf1
----------------------------------------------------------------------------------------------------
Storage mode: integer
Factor with 4 levels
Levels and labels N Valid
1 'A' 8 33.3
2 'B' 8 33.3
3 '5' 4 16.7
4 '6' 4 16.7
====================================================================================================
x1
----------------------------------------------------------------------------------------------------
Storage mode: double
Min: 1.000
Max: 6.000
Mean: 2.833
Std.Dev.: 1.951
Skewness: 0.639
Kurtosis: -1.318
====================================================================================================
xf
----------------------------------------------------------------------------------------------------
Storage mode: integer
Factor with 3 levels
Levels and labels N Valid
1 'a' 8 33.3
2 'b' 8 33.3
3 'c' 8 33.3