[R] Cannot grasp how to apply "by" here...

Jonas Malmros jonas.malmros at gmail.com
Mon Dec 17 19:47:46 CET 2007


I have a data frame named "database" with panel data, a little piece
of which looks like this:

  Symbol  Name  Trial  Factor1  Factor2  External
1 548140  A     1        -3.87    -0.32      0.01
2 547400  B     1        12.11    -0.68      0.40
3 547173  C     1         4.50     0.71     -1.36
4 546832  D     1         2.59     0.00      0.09
5 548140  A     2         2.41     0.50     -1.04
6 547400  B     2         1.87     0.32      0.39

What I want to do is calculate the correlation between each factor and
External for each Symbol, and record the correlation estimate, the
p-value, the name and the number of observations in a vector named
"vector", then rbind these vectors together in "results". When there
are fewer than 5 observations for a particular Symbol I want to put NAs
in each column of "vector".
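
One way the target described above might be computed, as a minimal sketch rather than a definitive answer. The helper summarise_symbol, the data frame panel and all of its values are hypothetical and made up for illustration; only the column names and the "Sig" labels follow the post.

```r
# Made-up panel data for illustration; only the column names come from the post.
panel <- data.frame(
  Symbol   = c(rep(548140, 6), rep(547400, 2)),
  Name     = c(rep("A", 6), rep("B", 2)),
  Factor1  = c(-3.87, 2.41, 1.10, 0.35, -0.90, 4.20, 12.11, 1.87),
  Factor2  = c(-0.32, 0.50, 0.12, -0.75, 0.93, 0.27, -0.68, 0.32),
  External = c(0.01, -1.04, 0.66, 0.25, -0.48, 1.30, 0.40, 0.39)
)

factor.names  <- c("Factor1", "Factor2")
factor.pvalue <- c("SigF1", "SigF2")

# Hypothetical helper: returns one named numeric vector per Symbol subset.
# All entries stay NA when the subset has fewer than min.obs rows.
summarise_symbol <- function(x, min.obs = 5) {
  out <- rep(NA_real_, 1 + 2 * length(factor.names))
  names(out) <- c("No.obs", factor.names, factor.pvalue)
  if (nrow(x) >= min.obs) {
    out["No.obs"] <- nrow(x)
    for (i in seq_along(factor.names)) {
      ct <- cor.test(x$External, x[[factor.names[i]]], method = "kendall")
      out[factor.names[i]]  <- ct$estimate
      out[factor.pvalue[i]] <- ct$p.value
    }
  }
  out
}

# by() applies the helper to each Symbol subset; rbind stacks the vectors.
results <- do.call(rbind, by(panel, panel$Symbol, summarise_symbol))

# Label each row with the Name that belongs to its Symbol.
rownames(results) <- as.character(panel$Name[match(rownames(results), panel$Symbol)])
print(results)
```

Returning the vector from the function and binding afterwards avoids modifying "results" from inside the function, where an assignment with <- would only change a local copy.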

I tried the following code, assuming that by() splits database into
smaller data frames, one for each Symbol (that's the "x"):
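
A small check of that assumption, sketched on made-up data: the data frame toy and its values are hypothetical, and only the column names come from the post. It suggests that by() does hand the function one data-frame subset per Symbol, and that x$Name then holds one entry per row of the subset rather than a single name.

```r
# Hypothetical toy data; only the column names follow the post.
toy <- data.frame(
  Symbol   = c(548140, 547400, 548140, 547400, 548140),
  Name     = c("A", "B", "A", "B", "A"),
  Factor1  = c(-3.87, 12.11, 2.41, 1.87, 0.55),
  External = c(0.01, 0.40, -1.04, 0.39, 0.20)
)

# by() calls the function once per Symbol; x is the subset of rows for that Symbol.
is.df   <- by(toy, toy$Symbol, function(x) is.data.frame(x))
n.names <- by(toy, toy$Symbol, function(x) length(x$Name))

print(is.df)
print(n.names)   # one count per Symbol, equal to the number of rows in each subset
```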

factor.names <- c("Factor1", "Factor2")
factor.pvalue <- c("SigF1", "SigF2")
results <- numeric()
vector <- matrix(0, ncol=(length(factor.names)*2+2), nrow=1)
colnames(vector) <- c("No.obs", factor.names, factor.pvalue)

application <- function(x){

    rownames(vector) <- x$Name

    for(i in 1:length(factor.names)){

        if(dim(x)[1]>=5){
            vector[1] <- dim(x)[1]
            vector[i+1] <- cor.test(x$External, x[,factor.names[i]],
                                    method="kendall")$estimate
            vector[i+3] <- cor.test(x$External, x[,factor.names[i]],
                                    method="kendall")$p.value
        } else {
            vector <- rep(NA, length(vector))
        }
    }
    results <- rbind(results, vector)
}

by(database, database$Symbol, application)

This did not work. I get:
"Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent"

I used browser() and I see that the Name is not assigned to the row
name of vector and then dim(x)[1] does not work.
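
A likely reading of the error, offered as a sketch rather than a confirmed diagnosis: "vector" has a single row, while x$Name has one entry per row of the subset, so the row-name assignment receives more names than there are rows. The snippet below reproduces that situation in isolation; the three-element names.in.subset is a hypothetical stand-in for x$Name. Separately, with two factors the matrix in the post has 6 columns but is given 5 column names, which raises the same kind of error for dimension [2].

```r
# One-row matrix, as in the post (5 columns here so that 5 column names would fit).
vector <- matrix(0, ncol = 5, nrow = 1)

# Hypothetical stand-in for x$Name when a Symbol has three observations.
names.in.subset <- c("A", "A", "A")

# Three names for one row: the assignment fails with a dimnames length error.
attempt <- try(rownames(vector) <- names.in.subset, silent = TRUE)
failed  <- inherits(attempt, "try-error")
print(failed)

# A single name matches the single row, so this assignment succeeds.
rownames(vector) <- names.in.subset[1]
print(rownames(vector))
```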

What am I doing wrong? I do not understand. :-(

Thank you in advance for your help.

Regards,
JM

-- 
Jonas Malmros
Stockholm University
Stockholm, Sweden


