Saturday, January 26, 2013

A bug caused by R dropping dimensions

I was writing an R function to merge two data frames.  It creates an empty data.frame with the same column names as another one.  It works like this:
> a <- data.frame(a=1,b=2)
#create a named empty data.frame
> b <- a[0,]

> b
[1] a b
<0 rows> (or 0-length row.names)


After the code above, I have an empty data.frame called "b" with the same column names as "a".  Everything works fine until I hit the following situation:
> a <- data.frame(a=1)
> a[0,]

numeric(0)

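What?? It turns out that when a data.frame has only one column and I select zero rows as above, R drops the dimension and gives me a numeric(0)!  When indexing a data.frame, the default is to drop if only one column is left, so the result is reduced to a plain vector.  To make it work, just tell R not to drop the dimension: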

> a[0,,drop=FALSE]
[1] a
<0 rows> (or 0-length row.names)


This "feature" is rather annoying, and it has trapped me before when I was doing matrix calculations.
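
A minimal sketch of the matrix version of the same trap. The matrix "m" and the variable names are made up for illustration; the point is only that selecting a single row silently turns a matrix into a plain vector unless drop=FALSE is given:

```r
# Illustrative 2 x 3 matrix (not from the original function)
m <- matrix(1:6, nrow = 2)

# Selecting one row drops the dimension: the result is a plain integer vector
row_dropped <- m[1, ]

# With drop = FALSE the result is still a matrix, now 1 x 3
row_kept <- m[1, , drop = FALSE]

print(dim(row_dropped))   # NULL, because a vector has no dim attribute
print(dim(row_kept))      # 1 3
print(nrow(row_dropped))  # NULL, so code that relies on nrow() misbehaves
```

Any later code that calls nrow(), ncol() or dim() on the result behaves differently depending on how many rows happened to be selected, which is why this kind of bug tends to show up only on edge cases.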

I googled around and found other people's opinions:

http://radfordneal.wordpress.com/2008/08/20/design-flaws-in-r-2-%E2%80%94-dropped-dimensions/

Unfortunately, there is no option to make "drop=FALSE" the default, but someone has a clever solution:
http://stackoverflow.com/questions/12196724/generally-disable-dimension-dropping-for-matrices

Of course, overwriting a default function is not standard practice, and the original should be restored afterwards.
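
A rough sketch of what such an override and its restoration could look like, shown on a matrix only. It masks "[" with a wrapper that always passes drop=FALSE and then removes the mask again; the matrix "m" and the variable names are illustrative. Treat it as an assumption-laden demo rather than a recommendation: list-style indexing of a data.frame, for instance, may warn that the drop argument is ignored.

```r
m <- matrix(1:6, nrow = 2)

# Mask `[` in the current environment so every subsetting call
# written here forwards drop = FALSE to the real base function
`[` <- function(...) base::`[`(..., drop = FALSE)

kept <- m[1, ]        # stays a 1 x 3 matrix while the mask is active

# Restore the default behaviour by removing the mask
rm(list = "[")

dropped <- m[1, ]     # back to a plain vector

print(dim(kept))
print(dim(dropped))
```

The mask only affects code evaluated where it is defined; functions inside packages keep using the original "[", so the override does not change how library code behaves.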

Love it or hate it, I have to live with it.
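
One way to live with it, sketched under the assumption that the empty data.frame is built in a single place: wrap the subsetting in a small helper so drop=FALSE cannot be forgotten. The name "empty_like" is hypothetical and not part of my original function.

```r
# Hypothetical helper: return a zero-row data.frame with the same
# columns as `df`, regardless of how many columns it has
empty_like <- function(df) {
  stopifnot(is.data.frame(df))
  df[0, , drop = FALSE]
}

one_col <- empty_like(data.frame(a = 1))
two_col <- empty_like(data.frame(a = 1, b = 2))

print(names(one_col))   # "a"
print(names(two_col))   # "a" "b"
print(class(one_col))   # "data.frame", even with a single column
```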
