Thursday, July 26, 2012

Changing function scope in GNU R example

In my last post I discussed how to work around GNU R scoping rules using the environment function. This time let us look at a practical example using the recode function from the car package.
First let us look at how recode works:

library(car)
x <- rep(1:3, 2)
recode(x, "1='a'")

transforms 1 2 3 1 2 3 into "a" "2" "3" "a" "2" "3". In the code below we will replicate this result using several different approaches, keeping the same definition of variable x.
Interestingly, the recodes string is split and evaluated inside the recode function, so we can use the following code to get the same result:

a <- 1
b <- "a"
recode(x, "a[1]=b[1]")

Now we can change variable b to get different recoding results without changing the recode call. Note that this use of recode does not follow its help page, as the documentation does not support using variables inside the recodes string.
Let us now try writing a simple wrapper around recode that does the same thing. Unfortunately:

wrong.recode.one <- function(v, from, to) {
    recode(v, "from[1]=to[1]")
}
wrong.recode.one(x, 1, "a")

does not work and produces an error. Due to the lexical scoping used in GNU R, the from and to variables are not within the scope of the recode function.
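The failure can be reproduced in base R without car. The sketch below is only an illustration: eval.string is a hypothetical stand-in for recode (it evaluates a string in its own frame), and wrapper and from.global are names made up for this example.

```r
# Hypothetical stand-in for recode: evaluates a string of code inside its own frame.
eval.string <- function(code) {
    eval(parse(text = code))
}

# A variable in the enclosing (here: top-level) environment is visible.
from.global <- 1
eval.string("from.global[1]")

# A variable that lives only in the caller's frame is not visible,
# because eval.string looks in its own frame and then in the
# environment where it was defined, never in the caller.
wrapper <- function(from) {
    eval.string("from[1]")
}

# Capture the error message instead of stopping the script.
result <- tryCatch(wrapper(1), error = function(e) conditionMessage(e))
print(result)
```

The lookup follows where eval.string was defined, not where it was called from, which is why the wrapper's arguments are out of reach.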
Here are two ways to work around it:

recode.one <- function(v, from, to) {
    environment(recode) <- environment()
    squeezeBlanks <- car:::squeezeBlanks
    recode(v, "from[1]=to[1]")
}
recode.one(x, 1, "a")

recode.one2 <- function(v, from, to) {
    environment(recode) <- environment()
    recode(v, "from[1]=to[1]")
}
environment(recode.one2) <- environment(recode)
recode.one2(x, 1, "a")

The first function, recode.one, moves recode into its own lexical scope. Unfortunately, it also has to bring the squeezeBlanks function into its scope to work properly, as recode calls it. To avoid this, recode.one2 is put into the environment(recode) environment, so squeezeBlanks is in its lexical scope.
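The same mechanics can be sketched in base R. Everything here is an assumption made for illustration: pkg is a plain environment playing the role of the car namespace, helper plays the role of squeezeBlanks, and evaluator plays the role of recode.

```r
# A plain environment acting as a stand-in for a package namespace.
pkg <- new.env()
local({
    # Hidden helper, like squeezeBlanks: strips spaces from a string.
    helper <- function(s) gsub(" ", "", s, fixed = TRUE)
    # "Exported" function, like recode: evaluates a cleaned-up string.
    evaluator <- function(code) eval(parse(text = helper(code)))
}, envir = pkg)
evaluator <- pkg$evaluator

# First approach: make a local copy of evaluator whose enclosure is this
# frame, and copy the hidden helper in as well.
call.one <- function(value) {
    environment(evaluator) <- environment()
    helper <- pkg$helper
    evaluator("value[1] + 1")
}

# Second approach: the same local copy, but the wrapper itself is moved
# into pkg, so helper is found through the wrapper's enclosure.
call.two <- function(value) {
    environment(evaluator) <- environment()
    evaluator("value[1] + 1")
}
environment(call.two) <- pkg

print(call.one(1))
print(call.two(1))

# The assignment inside the wrappers only changes a local copy;
# the original function still belongs to pkg.
print(identical(environment(pkg$evaluator), pkg))
```

Note that in both wrappers the assignment to environment(evaluator) creates a modified copy in the wrapper's frame, so the original function is left untouched.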

Those examples helped me better understand how scoping works in GNU R, but they are dangerous. To see this one can look at the following code:

a <- 1
is.fac <- "a"
recode(x,"a[1]=is.fac[1]")

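This call produces the incorrect result 0 2 3 0 2 3. The reason is that recode defines its own local variable named is.fac, so is.fac[1] in the recodes string is evaluated inside the recode body and resolves to that local variable instead of ours.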
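A minimal base R sketch of this name clash is given here. The function eval.with.local is hypothetical and only mimics the relevant behaviour, assuming the internal variable holds a logical value such as FALSE.

```r
# Hypothetical function that, like recode, has a local variable called
# is.fac and evaluates a user-supplied string inside its own body.
eval.with.local <- function(code) {
    is.fac <- FALSE
    eval(parse(text = code))
}

# Our own variable with the same name, defined outside the function.
is.fac <- "a"

# The local is.fac shadows ours, so FALSE comes back instead of "a".
shadowed <- eval.with.local("is.fac[1]")
print(shadowed)

# Coerced to a number, FALSE becomes 0.
print(as.numeric(shadowed))
```

Any variable name used internally by the evaluating function can be shadowed this way, and the caller gets no warning about it.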

This is another lesson that one has to be careful when hacking in GNU R.
