[R] using 'apply' to apply princomp to an array of datasets

David Romano dromano at stanford.edu
Wed Dec 12 18:27:05 CET 2012


Sorry, I just realized I didn't send the message below in plain text.
-David Romano

On Wed, Dec 12, 2012 at 9:14 AM, David Romano <dromano at stanford.edu> wrote:
>
> Hi everyone,
>
> Suppose I have a 3D array of datasets, where say dimension 1 corresponds
> to cases, dimension 2 to datasets, and dimension 3 to observations within a
> dataset.  As an example, suppose I do the following:
>
> > x <- sample(1:20, 24, replace=TRUE)
> > datasets <- array(x, dim=c(4,3,2))
>
> Here, for each j=1,2,3, I'd like to think of datasets[,j,] as a single
> data matrix with four cases and two observations.  Now, I'd like to be able
> to do the following: apply PCA to each dataset, and create a matrix of the
> first principal component scores.
>
> In this example, I could do:
>
> > pcl <- apply(datasets, 2, princomp)
>
> which yields a list of princomp output, one for each dataset, so that the
> vector of first principal component scores for dataset 1 is obtained by
>
> > score1set1 <- pcl[[1]]$scores[,1]
>
> and, after extracting score1set2 and score1set3 in the same way, I could
> then obtain the desired matrix by
>
> > score1matrix <- cbind( score1set1, score1set2, score1set3)
>
>
> So my first question is: 1) how could I use *apply to do this?  I'm having
> trouble because pcl is a list of lists, so I can't use, say, do.call(cbind,
> ...) without first having a list of the first component score vectors, which
> I'm not sure how to produce.
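
One possible approach to question 1, offered as a minimal sketch rather than a definitive answer: since pcl is a plain list of princomp objects, vapply (or sapply) can pull the first column of $scores out of each element and bind the results column-wise, so no do.call(cbind, ...) step is needed. The seed and the 24-element sample below are assumptions made only to keep the example reproducible.

```r
set.seed(1)                                  # assumed seed, for reproducibility only
x <- sample(1:20, 24, replace = TRUE)        # 4 * 3 * 2 = 24 values
datasets <- array(x, dim = c(4, 3, 2))       # cases x datasets x observations

# One princomp fit per dataset; each call receives the 4 x 2 slice datasets[, j, ]
pcl <- apply(datasets, 2, princomp)

# Extract the first-component scores from every fit.
# vapply checks that each result is a numeric vector with one entry per case.
score1matrix <- vapply(pcl,
                       function(p) p$scores[, 1],
                       numeric(nrow(datasets)))

dim(score1matrix)                            # one row per case, one column per dataset
```

If the full princomp fits are not needed afterwards, apply(datasets, 2, function(m) princomp(m)$scores[, 1]) should give the same matrix in a single step.
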
>
> My second question is: 2) Having answered question 1), now suppose some
> datasets may contain NA values -- how could I select the subset of
> indices along dimension 2 corresponding to the datasets for which this is
> true (again using *apply)?
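
For question 2, a sketch under the same assumptions: applying any(is.na(.)) over dimension 2 gives one logical flag per dataset, which can then be used to index that dimension. The NA inserted below is hypothetical and is there only to give the example something to find.

```r
set.seed(1)                                  # assumed seed, for reproducibility only
x <- sample(1:20, 24, replace = TRUE)
datasets <- array(x, dim = c(4, 3, 2))
datasets[2, 3, 1] <- NA                      # hypothetical missing value in dataset 3

# One logical per dataset: TRUE if the slice datasets[, j, ] contains any NA
has_na <- apply(datasets, 2, function(m) any(is.na(m)))

na_sets <- which(has_na)                     # indices of datasets with missing values

# Keep only the complete datasets; drop = FALSE preserves the 3D shape
# even if a single dataset remains
complete <- datasets[, !has_na, , drop = FALSE]

# PCA can then be run on the complete datasets only
pcl <- apply(complete, 2, princomp)
```

Here na_sets indexes the original dimension 2, so it can be used to report or separately handle the datasets that were excluded.
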
>
> Thanks in advance for any light you might be able to shed on these
> questions!
>
> David Romano



