[R] tagging results of "apply"

Bernzweig, Bruce (Consultant) bbernzwe at bear.com
Mon Jul 23 14:38:01 CEST 2007


Thanks for the clarification and help!

-----Original Message-----
From: Stephen Tucker [mailto:brown_emu at yahoo.com] 
Sent: Sunday, July 22, 2007 6:08 AM
To: Bernzweig, Bruce (Consultant); r-help
Subject: Re: [R] tagging results of "apply"

Actually if you want to tag both column and row, this might also help:

## Give dimension labels to both matrices
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("mat1row",1:5,sep=""),
                 paste("mat1col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames=list(paste("mat2row",1:5,sep=""),
                 paste("mat2col",1:5,sep="")))

> cor(mat1[1,],mat2)
        mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
[1,] -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

The column labels are there but are lost when returned from apply(), as it
says in ?apply:

"In all cases the result is coerced by as.vector to one of the basic vector
types before the dimensions are set"

> as.vector(cor(mat1[1,],mat2))
[1] -0.063135353 -0.467992672 -0.514708392 -0.797748010 -0.001457972
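
A small self-contained check of the same thing, as a sketch only: the seed
and the names m1, m2 and r are illustrative assumptions, not from the session
above, so the numbers will differ from those shown.

```r
## Illustrative only: the seed and object names are assumptions
set.seed(1)
m1 <- matrix(sample(1:500, 25), ncol = 5,
             dimnames = list(paste("mat1row", 1:5, sep = ""),
                             paste("mat1col", 1:5, sep = "")))
m2 <- matrix(sample(501:1000, 25), ncol = 5,
             dimnames = list(paste("mat2row", 1:5, sep = ""),
                             paste("mat2col", 1:5, sep = "")))

r <- cor(m1[1, ], m2)      # 1 x 5 matrix; columns labelled from m2
colnames(r)                # the mat2col1 ... mat2col5 labels
names(as.vector(r))        # NULL: as.vector() drops the dimnames
```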

You lose the dimension labels in this case, so one option is to guard
against this in the following way:

> as.vector(as.data.frame(cor(mat1[1,],mat2)))
     mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
1 -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

Unfortunately, if you use 'as.data.frame()' in 'function(x)', apply will
return a list - but you can bind the rows of the output:

> f <- function(x,y) as.data.frame(cor(x,y))
> do.call(rbind, apply(mat1,1,f,y=mat2))
            mat2col1   mat2col2    mat2col3   mat2col4     mat2col5
mat1row1 -0.06313535 -0.4679927 -0.51470839 -0.7977480 -0.001457972
mat1row2 -0.28750363  0.1681777  0.14671484  0.8139768  0.039982028
mat1row3 -0.62017387 -0.6932731 -0.72263865 -0.7929604  0.427366680
mat1row4  0.06441894  0.1707946 -0.11444747 -0.8213577  0.526239013
mat1row5 -0.09849051  0.7024540 -0.01997228  0.3712480  0.439037838

The result is a data frame, not a matrix, and note that the columns/rows are
transposed in relation to the matrix output of
  apply(mat1,1,function(x,y) cor(x,y),y=mat2)
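
To check both points (list output, and the transposition), something along
these lines should work. It is a sketch: the seed and the names m1, m2, f_df,
f_vec, out_list, bound and plain are made up for illustration.

```r
## Illustrative only: the seed and object names are assumptions
set.seed(1)
m1 <- matrix(sample(1:500, 25), ncol = 5,
             dimnames = list(paste("mat1row", 1:5, sep = ""),
                             paste("mat1col", 1:5, sep = "")))
m2 <- matrix(sample(501:1000, 25), ncol = 5,
             dimnames = list(paste("mat2row", 1:5, sep = ""),
                             paste("mat2col", 1:5, sep = "")))

f_df  <- function(x, y) as.data.frame(cor(x, y))  # keeps the column labels
f_vec <- function(x, y) cor(x, y)                 # plain numeric result

out_list <- apply(m1, 1, f_df, y = m2)   # a list: one data frame per row of m1
bound    <- do.call(rbind, out_list)     # 5 x 5 data frame, rows named after m1
plain    <- apply(m1, 1, f_vec, y = m2)  # 5 x 5 matrix, one column per row of m1

is.data.frame(bound)
## Same numbers, but rows and columns are swapped between the two results
all.equal(as.matrix(bound), t(plain), check.attributes = FALSE)
```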

An alternative is to convert each row of mat1 into a list element [by
transposing it with t() and then feeding it to as.data.frame()] and then use
sapply():

> sapply(as.data.frame(t(mat1)),f,y=mat2)
         mat1row1     mat1row2   mat1row3   mat1row4   mat1row5   
mat2col1 -0.06313535  -0.2875036 -0.6201739 0.06441894 -0.0984905 
mat2col2 -0.4679927   0.1681777  -0.6932731 0.1707946  0.702454   
mat2col3 -0.5147084   0.1467148  -0.7226387 -0.1144475 -0.01997228
mat2col4 -0.797748    0.8139768  -0.7929604 -0.8213577 0.371248   
mat2col5 -0.001457972 0.03998203 0.4273667  0.526239   0.4390378
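
If a plain numeric matrix with both sets of labels is all that is needed, a
possible shortcut (a different technique from the ones above, and only a
sketch) is to skip apply() and hand cor() the transposed matrix, since cor()
correlates the columns of its two matrix arguments and keeps their column
names. The seed and the names m1, m2, full and via_apply are assumptions for
illustration.

```r
## Illustrative only: the seed and object names are assumptions
set.seed(1)
m1 <- matrix(sample(1:500, 25), ncol = 5,
             dimnames = list(paste("mat1row", 1:5, sep = ""),
                             paste("mat1col", 1:5, sep = "")))
m2 <- matrix(sample(501:1000, 25), ncol = 5,
             dimnames = list(paste("mat2row", 1:5, sep = ""),
                             paste("mat2col", 1:5, sep = "")))

## Rows of m1 (columns of t(m1)) against columns of m2, labelled both ways
full <- cor(t(m1), m2)
dimnames(full)

## Same numbers via apply(): transpose, then restore the lost column labels
via_apply <- t(apply(m1, 1, function(x, y) cor(x, y), y = m2))
colnames(via_apply) <- colnames(m2)
all.equal(full, via_apply)
```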



--- Stephen Tucker <brown_emu at yahoo.com> wrote:

> Dear Bruce,
> In your functions, you need to use your bound variable, 'x' [not mat1] in
> your anonymous function [function(x)] as the argument to cor().
> 
> For instance, you wrote:
> apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> apply(mat1, 1, function(x) cor(mat1, mat2))
> 
> They should be
> apply(mat1, 1, function(x) cor(x, mat2[1,]))
> apply(mat1, 1, function(x) cor(x, mat2))
> 
> or
> f <- function(x,y) cor(x, y)
> apply(mat1, 1, f, y=mat2[1,])
> apply(mat1, 1, f, y=mat2)
> 
> Then from the ?apply documentation - under section, 'Value' - the
> following statement will help you predict its behavior in this case:
> "If each call to FUN returns a vector of length n, then apply returns an
> array of dimension c(n, dim(X)[MARGIN]) if n > 1."
> 
> [each column of your output is the output from cor(mat1[i,],mat2) in
> Scenario 2]. As for tagging, you can try adding dimension labels [to the
> object which is passed as the 'X' argument to apply()]:
> 
> mat1 <- matrix(sample(1:500, 25), ncol = 5,
>                dimnames=list(paste("row",1:5,sep=""),
>                  paste("col",1:5,sep="")))
> mat2 <- matrix(sample(501:1000, 25), ncol = 5)
> 
> > apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
>             row1       row2       row3        row4        row5
> [1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
> [2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
> [3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
> [4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
> [5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
> 
> Hope this helps,
> 
> Stephen
> 
> --- "Bernzweig, Bruce (Consultant)" <bbernzwe at bear.com> wrote:
> 
> > In trying to get a better understanding of vectorization I wrote the
> > following code:
> > 
> > My objective is to take two sets of time series and calculate the
> > correlations for each combination of time series.
> > 
> > mat1 <- matrix(sample(1:500, 25), ncol = 5)
> > mat2 <- matrix(sample(501:1000, 25), ncol = 5)
> > 
> > Scenario 1:
> > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> > 
> > Scenario 2:
> > apply(mat1, 1, function(x) cor(mat1, mat2))
> > 
> > Using scenario 1, (output below) I can see that correlations are
> > calculated for just the first row of mat2 against each individual row of
> > mat1.
> > 
> > Using scenario 2, (output below) I can see that correlations are
> > calculated for each row of mat2 against each individual row of mat1.
> > 
> > Q1: The output of scenario 2 consists of 25 rows of data.  Are the first
> > five rows mat1 against mat2[1,], the next five rows mat1 against
> > mat2[2,], ... last five rows mat1 against mat2[5,]?
> > 
> > Q2: I assign the output of scenario 2 to a new matrix
> > 
> > 	matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
> > 
> >     However, I need a way to identify each row in matC as a pairing of
> > rows from mat1 and mat2.  Is there a parameter I can add to apply to do
> > this?
> > 
> > Scenario 1:
> > > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> >            [,1]       [,2]       [,3]       [,4]       [,5]
> > [1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
> > [2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
> > [3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
> > [4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
> > [5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
> > 
> > Scenario 2:
> > > apply(mat1, 1, function(x) cor(mat1, mat2))
> >              [,1]        [,2]        [,3]        [,4]        [,5]
> >  [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
> >  [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
> >  [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
> >  [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
> >  [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
> >  [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
> >  [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
> >  [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
> >  [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
> > [10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
> > [11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
> > [12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
> > [13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
> > [14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
> > [15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
> > [16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
> > [17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
> > [18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
> > [19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
> > [20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
> > [21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
> > [22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
> > [23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
> > [24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
> > [25,]  0.28340166  0.28340166  0.28340166  0.28340166  0.28340166
> > 
> > 
> > 
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >