Sunday, March 31, 2013

R and the last comma

In R, every comma matters. When creating a vector, c(1, 2, 5) will do the right thing, but add one unfortunate comma and c(1, 2, 5,) will greet you with a deadly Error in c(1, 2, 5, ) : argument 4 is empty.

Other languages like Perl are less strict when defining basic data structures: having a comma after the last item is allowed. This can be particularly useful when items are specified on multiple lines as in this example:

my @cities = (
  "New York",
  "Washington",
  "Atlanta",
);

Because the last item is syntactically no different from any other, I can comment, uncomment, add, remove, or swap items anywhere within the definition without having to worry about which lines should have a comma (they all should).

Can R's default behavior be overridden? Yes, using the following functional:

The functional acts as a wrapper around any function: it returns a new function that behaves exactly like the original, except that the last argument, if it is empty, is dropped before the call.
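A minimal sketch of how such a wrapper could be written is shown here. It is an illustration under assumptions, not necessarily the original listing: the inner variable names (`caller`, `args`, `n`) are made up, and the approach of inspecting the unevaluated call with `sys.call()` is one option among several. It assumes the wrapped function is called directly, with the arguments written out by the caller.

```r
ok.comma <- function(FUN) {
  function(...) {
    # Frame of whoever called the wrapper; the arguments are evaluated there.
    caller <- parent.frame()
    # The arguments exactly as written by the caller, still unevaluated.
    args <- as.list(sys.call())[-1L]
    n <- length(args)
    # A trailing comma leaves an empty symbol in the last position: drop it.
    if (n > 0L && identical(args[[n]], quote(expr = ))) {
      args <- args[-n]
    }
    do.call(FUN, args, envir = caller)
  }
}

cities <- ok.comma(c)(
  "New York",
  "Washington",
  "Atlanta",
)
```

Because `...` is never evaluated inside the wrapper, the empty trailing argument does not raise an error; only the cleaned-up argument list is handed to `FUN`.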

This way, I can call functions like c, list, data.frame indirectly and with an optional extra comma at the end:

cities <- ok.comma(c)(
  "New York",
  "Washington",
  "Atlanta",
)

I can even put the definition of ok.comma into my .Rprofile file and redefine functions

c          <- ok.comma(base::c)
list       <- ok.comma(base::list)
data.frame <- ok.comma(base::data.frame)

so I can seamlessly do:

cities <- c(
  "New York",
  "Washington",
  "Atlanta",
)

I hope you find this useful.

Sunday, January 6, 2013

Search and replace: Are you tired of nested `ifelse`?

It happens all the time: you have a vector of fruits and you want to replace all bananas with apples, all oranges with pineapples, and leave all the other fruits as-is, or maybe change them all to figs. The usual solution? A big old nested `ifelse`:
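The original listing is not reproduced here. As an illustration, a nested `ifelse` of this kind might look as follows; the `fruits` vector and the result names are made up for the example.

```r
fruits <- c("banana", "orange", "kiwi", "banana", "mango")

# Leave all the other fruits as-is
as_is <- ifelse(fruits == "banana", "apple",
         ifelse(fruits == "orange", "pineapple",
                                    fruits))
# "apple" "pineapple" "kiwi" "apple" "mango"

# Or change all the other fruits to figs
all_figs <- ifelse(fruits == "banana", "apple",
            ifelse(fruits == "orange", "pineapple",
                                       "fig"))
# "apple" "pineapple" "fig" "apple" "fig"
```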

Ok, that didn't look too bad, especially with the code and fruits nicely aligned. But what if I had a lot of fruits to change and little patience? Wouldn't it be nice if R had a built-in function for doing multiple search and replace? Someone please tell me if there is already such a function. If not, here is one I wrote that builds a nested `ifelse` function by recursion:

Note that I named the function after the `decode` SQL function. Here are a couple of examples:
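The function and its examples are not reproduced here, so what follows is only a sketch of the idea. The signature `decode(x, search, replace, default = NULL)` is an assumption, and this version recurses directly on the remaining search/replace pairs (which unrolls into the same nested `ifelse` calls) rather than building a function object first. It assumes `x` is a plain character vector, not a factor.

```r
decode <- function(x, search, replace, default = NULL) {
  stopifnot(length(search) == length(replace))
  if (length(search) == 0L) {
    # Base case: nothing left to match, so keep x or fall back to the default.
    if (is.null(default)) return(x)
    return(rep(default, length(x)))
  }
  # Handle the first search/replace pair, recurse on the remaining pairs.
  ifelse(x == search[1L],
         replace[1L],
         decode(x, search[-1L], replace[-1L], default))
}

fruits <- c("banana", "orange", "kiwi", "banana", "mango")

# Leave unmatched fruits as-is
kept <- decode(fruits, search  = c("banana", "orange"),
                       replace = c("apple",  "pineapple"))
# "apple" "pineapple" "kiwi" "apple" "mango"

# Change unmatched fruits to figs
figs <- decode(fruits, search  = c("banana", "orange"),
                       replace = c("apple",  "pineapple"),
                       default = "fig")
# "apple" "pineapple" "fig" "apple" "fig"
```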

Feel free to use it with your favorite fruits or vegetables! Cheers!

P.S.: I wrote this function as an answer to a Stack Overflow question. Thank you to Matthew Lundberg for sharing ideas.

Photo source: http://www.istockphoto.com/stock-photo-19534475-mixed-fruit.php