Thursday, May 17, 2012

Emulating local static variables in R

Recently I was writing a code allowing to plot multiple ggplot2 plots on one page. I wanted to replicate standard behavior of  plot  function that plots graphs in sequence according to  mfrow/ mfcol option in par. The solution lead me to think of emulating C-like local static variables in R.
There are several solutions to this problem but I think that a nice one is by adding attributes to a function. Here is a simple example:

f <- function(x) {
    y <- attr(f, "sum")
    if (is.null(y)) {
        y <- 0
    }
    y <- x + y
    attr(f, "sum") <<- y
    return(y)
}

It can be applied as follows:

> for (in 1:5) cat(i, ": ", f(i)"\n", sep="")
1: 1
2: 3
3: 6
4: 10
5: 15

As it can be seen attribute "sum" is static but it can be thought of as local because it is not stored directly as a variable in global environment.

And here is the application of the concept to the problem of plotting several qplots in a sequence:

library(ggplot2)
library(grid)

# setup the ploting grid and plotting sequence
mplot.setup <- function(nrow, ncol, by.row = TRUE) {
    attributes(mplot.seq) <<- list(nrow = nrow, ncol = ncol,
      pos = 0, by.row = by.row)
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(nrow, ncol)))
}  

# plot at given grid location
mplot <- function(graph, row, col) {
     print(graph, vp = viewport(layout.pos.row = row,
                                layout.pos.col = col))
}

# plot the at the next position in the sequence
mplot.seq <- function(graph) {
    pos <- attr(mplot.seq, "pos")
    nrow <- attr(mplot.seq, "nrow")
    ncol <- attr(mplot.seq, "ncol")

    if (attr(mplot.seq, "by.row")) {
        col <- 1 + (pos %% ncol)
        row <- 1 + ((pos %/% ncol) %% nrow)
    } else {
        row <- 1 + (pos %% nrow)
        col <- 1 + ((pos %/% nrow) %% ncol)
    }
    attr(mplot.seq, "pos") <<- pos + 1
    mplot(graph, row, col)
}

# application example
mplot.setup(2,4, FALSE)
for (i in 1:4) {
  mplot.seq(qplot(iris[,i], xlab = names(iris)[i]))
  mplot.seq(qplot(iris[,5], iris[,i], geom = "boxplot",
    xlab = "Species", ylab = names(iris)[i]) + coord_flip())
}

The following plot is produced by the above code:


1 comment:

  1. I was looking for something like a singleton pattern in R to keep objects that are expensive to load and this is the best approach I found so far. Thanks for that!
    Previously I was playing with globals but this is messy and lintr was complaining about it.

    This is how I used it to load the biomaRt object.

    load_biomart <- function() {
    ensembl <- attr(load_biomart, "cached_ensembl")
    if (is.null(ensembl)) {
    ## Loads ensembl biomart for Homo sapiens ensembl = useMart('ensembl',dataset='hsapiens_gene_ensembl')
    futile.logger::flog.info("Loading biomart for the first time")
    ensembl_mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")
    dataset <- "hsapiens_gene_ensembl"
    ensembl <- biomaRt::useDataset(dataset, mart = ensembl_mart)
    attr(load_biomart, "cached_ensembl") <<- ensembl
    } else {
    futile.logger::flog.info("Returning cached biomart")
    }
    ensembl
    }

    ReplyDelete

Note: Only a member of this blog may post a comment.