[R] Unexpected behavior of extract (`[`) or sapply functions

andrewH ahoerner at rprogress.org
Fri Oct 7 09:03:55 CEST 2011


Dear folks--
The function below is a snippet of a larger function that is not doing what
it is supposed to do, and I do not understand its behavior.  The larger
function is supposed to produce an array containing the results of a
user-specified function applied to groups of data defined by the
intersection of one or more factors, and return them in an array with a
dimension for each factor and a dimension level for each factor level.  This
snippet is supposed to take a data frame, a vector of column numbers
containing factors, and a column number for the data, and return (in the
test function below, just print) a list of character vectors of the level
names (one vector per dimension) and the length of those vectors.
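
For reference, the goal of the larger function can be sketched with base R's tapply, which returns an array with one dimension per grouping factor and dimnames taken from the factor levels. This is only a sketch under assumptions: groupArray is a hypothetical name (not the actual larger function), sum stands in for the user-specified function, and stringsAsFactors = TRUE is spelled out because it is no longer the default in R >= 4.0.0.

```r
# Test data matching the example further down in this post.
test.df <- data.frame(AA = rep(LETTERS[1:2], c(6, 6)),
                      BB = rep(LETTERS[3:5], c(4, 4, 4)),
                      CC = rep(LETTERS[6:9], c(3, 3, 3, 3)),
                      DD = 1:12,
                      stringsAsFactors = TRUE)

# groupArray is a hypothetical helper, not the function from the post.
# data.df[factor.cols] uses list-style indexing, so it stays a data frame
# (a list of factors) even when factor.cols has length 1.
groupArray <- function(data.df, factor.cols, data.col, FUN, ...) {
  tapply(data.df[[data.col]], data.df[factor.cols], FUN, ...)
}

res2 <- groupArray(test.df, c(1, 2), 4, sum)  # 2 x 3 array, dims AA and BB
res1 <- groupArray(test.df, 1, 4, sum)        # 1-d array, dim AA

print(res2)
print(dimnames(res1))
```

Combinations of levels that never occur together (for example AA = "A" with BB = "E") come back as NA in this sketch.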

It works fine so long as I give it more than one factor column, but if I
give it a vector of factor columns of length 1, it behaves differently, and
when I try to assign the names from test.levels to the dimnames of the
array, I end up with an error message:

Error in dimnames(data) <- dimnames : 
  length of 'dimnames' [1] not equal to array extent

The example below shows the function output for a test data frame
(“test.df”) when run first on a vector of two factor column numbers and
then on just one. You can see how the structure of the output shifts.  I
cannot understand what is happening. What I want it to do when given just
factor.cols = c(1) is to give me back exactly what it gives me back for
factor column 1 in factor.cols = c(1,2).

Any help or suggestions would be greatly appreciated.

Sincerely, 
   andrewH

# Test Data
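# stringsAsFactors=TRUE makes the factor conversion explicit
# (it is not the default in R >= 4.0.0)
test.df <- data.frame(AA=rep(LETTERS[1:2], c(6,6)),
                      BB=rep(LETTERS[3:5], c(4,4,4)),
                      CC=rep(LETTERS[6:9], c(3,3,3,3)),
                      DD=c(1:12), stringsAsFactors=TRUE)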

# The function
getLevels <- function(data.df, factor.cols, data.col){
  # Level names of each factor column (uses the data.df argument, not the global test.df)
  test.levels <- sapply(data.df[, factor.cols, drop=FALSE], levels)
  cat("test.levels:\n"); print(test.levels)
  # Number of levels per factor column
  no.levels <- sapply(sapply(data.df[, factor.cols, drop=FALSE], levels), length)
  cat("no.levels:\n"); print(no.levels)
}

# Run it with two factors and again with one; output below
cat("\nTest 2 factors:\n")
getLevels(test.df, c(1,2), 4)
cat("\nTest 1 factor:\n")
getLevels(test.df, c(1), 4)

Test 2 factors:
> getLevels(test.df, c(1,2), 4)
test.levels=
$AA
[1] "A" "B"

$BB
[1] "C" "D" "E"

no.levels=AA BB 
 2  3 
> cat("\nTest 1 factor:\n")

Test 1 factor:
> getLevels(test.df, c(1), 4)
test.levels=     AA 
[1,] "A"
[2,] "B"
no.levels=A B 
1 1 
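
One possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: sapply simplifies its result by default (simplify = TRUE). With two factors that have 2 and 3 levels, the results differ in length, so a list is returned. With a single factor, every result has the same length, so sapply returns a 2 x 1 character matrix instead, and the second sapply then runs length over each matrix element ("A" and "B"), giving 1 and 1. Assuming that is the cause, swapping in lapply (which never simplifies) keeps the structure the same in both cases. The name getLevels2 below is hypothetical, and stringsAsFactors = TRUE is spelled out because it is no longer the default in R >= 4.0.0.

```r
# Same test data as in the post.
test.df <- data.frame(AA = rep(LETTERS[1:2], c(6, 6)),
                      BB = rep(LETTERS[3:5], c(4, 4, 4)),
                      CC = rep(LETTERS[6:9], c(3, 3, 3, 3)),
                      DD = 1:12,
                      stringsAsFactors = TRUE)

# With one factor column, sapply simplifies the result to a matrix.
simplified <- sapply(test.df[, 1, drop = FALSE], levels)
print(is.matrix(simplified))

# getLevels2 is a hypothetical variant of getLevels, not the original code.
getLevels2 <- function(data.df, factor.cols) {
  # lapply always returns a list: one character vector per factor column
  lev <- lapply(data.df[, factor.cols, drop = FALSE], levels)
  # vapply returns a named integer vector with one count per factor
  n <- vapply(lev, length, integer(1))
  list(levels = lev, no.levels = n)
}

two <- getLevels2(test.df, c(1, 2))
one <- getLevels2(test.df, 1)
str(two)
str(one)
```

In this sketch one$levels is a list of length 1 holding c("A", "B"), so it has the shape that dimnames<- expects for a one-dimensional array. Calling sapply(..., simplify = FALSE) would be another way to avoid the simplification.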




