Hijacking R Functions: Changing Default Arguments

I am working on a package that gathers common regular expressions into a canned collection that users can apply without having to know regexes. The package, qdapRegex, has a bunch of functions of the form rm_xxx. The only difference between these functions is one default parameter: the regular expression pattern. I already had a default template function, so all I really needed was to copy that template many times and change one parameter. But cutting and pasting the body of the template function over and over seemed wasteful of code and electronic space… I needed to hijack the template.

Come on, admit it: you’ve all wished you could hijack a function before. Who hasn’t wished the default for data.frame was stringsAsFactors = FALSE? Or that sum had na.rm = TRUE (OK, maybe the latter is just me)? So for the task of efficiently hijacking a function and changing its defaults in a manageable, modular way, my mind immediately went to Hadley’s pryr package (Wickham (2014)). I remembered him hijacking functions with the partial function in his Advanced R book.

It worked, except that I couldn’t then change the newly set defaults back. For package writing this was not a good thing (maybe there was a way and I missed it).


A Function Worth Hijacking

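Here’s an example where we attempt to hijack data.frame. (The output below comes from a version of R before 4.0.0, where stringsAsFactors effectively defaulted to TRUE; since R 4.0.0 the default is FALSE.)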

dat <- data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yuck a string as a factor
## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: Factor w/ 3 levels "a","b","c": 1 2 3

Typically we’d do something like:

.data.frame <- function(..., row.names = NULL, check.rows = FALSE, check.names = TRUE,
    stringsAsFactors = FALSE) {

    data.frame(..., row.names = row.names, check.rows = check.rows,
        check.names = check.names, stringsAsFactors = stringsAsFactors)

}

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay!  strings are character
## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

But for my qdapRegex needs, this approach required a ton of cut and paste, which means lots of extra code in the .R files.


The First Attempt to Hijack a Function

pryr to the rescue

library(pryr)

## The hijack
.data.frame <- pryr::partial(data.frame, stringsAsFactors = FALSE)

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay! strings are character
## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

But I can’t change the default back…

.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)
## Error: formal argument "stringsAsFactors" matched by multiple actual
## arguments
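
My understanding of why this fails: a partially applied function bakes the fixed value into the call itself rather than storing it as a default, so supplying the argument again means it is passed twice. Below is a minimal sketch of that behavior using a hand-written stand-in (the name partial_df is hypothetical, and this is an assumption about the shape of what partial builds, not pryr’s actual source):

```r
# Hypothetical stand-in for a partially applied data.frame:
# stringsAsFactors is hard-wired into the call, not kept as a default.
partial_df <- function(...) data.frame(stringsAsFactors = FALSE, ...)

# Works as long as the fixed argument is not supplied again
ok <- partial_df(x1 = 1:3, x2 = c("a", "b", "c"))

# Supplying it again passes stringsAsFactors twice to data.frame,
# so R refuses to match the formal argument; capture the message
res <- tryCatch(
    partial_df(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE),
    error = function(e) conditionMessage(e)
)
res
```

Because the value lives inside the call and not in the function’s formals, there is nothing for the caller to override.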

Hijacking In Style (formals)

Doomed…

After tinkering with many not-so-reasonable solutions, I asked on stackoverflow.com. In a short time MrFlick responded most helpfully (as he often does) with an answer that used formals to change the formal arguments of a function. I should have thought of it myself, as I’d seen it used in Advanced R as well.

Here I use the answer to make a hijack function. It does exactly what I want: it takes a function and resets its formal arguments as desired.

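hijack <- function (FUN, ...) {
    .FUN <- FUN        # work on a copy so the original function is untouched
    args <- list(...)  # the new defaults, supplied as name = value pairs
    ## overwrite each named formal argument of the copy with its new default
    invisible(lapply(seq_along(args), function(i) {
        formals(.FUN)[[names(args)[i]]] <<- args[[i]]
    }))
    .FUN
}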

Let’s see it in action: it changes the defaults but still allows the user to set these arguments…

.data.frame <- hijack(data.frame, stringsAsFactors = FALSE)

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay! strings are character
## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"
.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)
##   x1 x2
## 1  1  a
## 2  2  b
## 3  3  c
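
To connect this back to the package use case, here is a small sketch of stamping out a child function from a template. The names rm_template and rm_vowel and both patterns are made up for illustration; they are not qdapRegex’s actual code. The hijack function is repeated so the snippet runs on its own.

```r
hijack <- function (FUN, ...) {
    .FUN <- FUN
    args <- list(...)
    invisible(lapply(seq_along(args), function(i) {
        formals(.FUN)[[names(args)[i]]] <<- args[[i]]
    }))
    .FUN
}

# Hypothetical template: remove whatever matches `pattern`
rm_template <- function(text, pattern = "[0-9]+", replacement = "") {
    gsub(pattern, replacement, text, perl = TRUE)
}

# A child function that differs from the template only in one default
rm_vowel <- hijack(rm_template, pattern = "[aeiou]")

formals(rm_template)$pattern                     # "[0-9]+" (template unchanged)
formals(rm_vowel)$pattern                        # "[aeiou]"
identical(body(rm_vowel), body(rm_template))     # TRUE: same body, new default

rm_vowel("hijack")                               # "hjck"
rm_vowel("hijack 123", pattern = "[0-9]+")       # "hijack " (default overridden)
```

The child keeps the template’s full argument list, so the caller can still override the hijacked default.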

Note that Dason suggested an alternative solution that suits some purposes. It is similar to the first approach I describe above but requires less copying, as it uses ... (the ellipsis) to cover the parameters that we don’t want to change. This approach would look something like this:

.data.frame <- function(..., stringsAsFactors = FALSE) {

    data.frame(..., stringsAsFactors = stringsAsFactors)

}

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay!  strings are character
## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"
.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)
##   x1 x2
## 1  1  a
## 2  2  b
## 3  3  c

This is less verbose than the first approach I had. It was not the best solution for me, though, because I wanted to document all of the arguments to the function for the package. I believe this approach would limit me to the arguments ... and stringsAsFactors in the documentation (though I didn’t try it with CRAN checks). Depending on the situation, this approach may be ideal.
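
The difference in what each approach exposes can be checked by comparing formal argument names. This is only a rough sketch: the names wrapper and hijacked are illustrative, and the hijacked copy is built directly with formals rather than through the hijack function so the snippet stands alone.

```r
# Wrapper approach: only `...` and the one changed argument are formals
wrapper <- function(..., stringsAsFactors = FALSE) {
    data.frame(..., stringsAsFactors = stringsAsFactors)
}

# formals approach: copy data.frame and change a single default
hijacked <- data.frame
formals(hijacked)$stringsAsFactors <- FALSE

names(formals(wrapper))   # "..." "stringsAsFactors"

# The hijacked copy keeps every argument name of the original
identical(names(formals(hijacked)), names(formals(data.frame)))   # TRUE
```

Whether this difference matters for the documentation depends on how the package documents its functions, which I have not tested here.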

References


*Created using the reports package

About tylerrinker

I am a Literacy PhD student with a bent for the quantitative and a passion for R.

7 Responses to Hijacking R Functions: Changing Default Arguments

  1. G. Grothendieck says:

    Check out the Defaults package on CRAN.

  2. tylerrinker says:

    I saw another poster, @alexis_laz, mentioned this as well in the SO question.

  3. MadScone says:

    I much prefer Dason’s method myself. I’ve seen this kind of trickery before, and I find it makes things too complicated when they don’t really need to be. E.g. see the source code for write.csv(). Hadley Wickham talks about it here: http://adv-r.had.co.nz/Expressions.html. The result is a function body of ~13 lines of confusing code that I’d say nobody understands at first glance and that could be written as one line that everyone will comprehend. But that might just be my personal preference! I’m not exactly sure what you’re trying to do in your own package.

    I came up with another solution for the fun of it. It just merges a list of defaults with the arguments passed to the function. Any duplicates must be removed from the defaults list in case someone wants to overwrite a default.

    hijack <- function (f, ...) {
        def <- list(...)
        function (...) {
            args <- c(def[!names(def) %in% names(list(...))], list(...))
            do.call(f, args)
        }
    }

    • tylerrinker says:

      @MadScone Thanks for your alternative. It is fun to play with the thinking. For me Dason’s approach is sometimes preferable, but as with many things, it depends on the specific task at hand. For the package I was writing, the `hijack` approach was ideal as it’s for multiple functions. PS funny note on the source for write.csv, I had never looked before. With the way I did it in hijack, the source of the functions that are output will be identical to the template they started from, with the exception of a single changed parameter. It makes it easy for the user to understand the source and for me to maintain it. This way if I update the template all the child functions will also be updated.

  4. What’s wrong with strings as factors?
