[R] Higher Dimensional Matrices

Bill.Venables at csiro.au Bill.Venables at csiro.au
Tue Dec 26 01:59:45 CET 2006


Lars from space [sic] asks:
> -----Original Message-----

> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of downunder
> Sent: Monday, 25 December 2006 12:31 PM To: r-help at stat.math.ethz.ch
> Subject: [R] Higher Dimensional Matrices
> 
> 
> Hi all.
> 
> I want to calculate partial correlations while controlling for one
> or more variables. That works already fine when I control for
> example just for x[,1] and x[,2] that gives me one single
> correlation matrix and i have to to it for x [,1]...x[,10]. That
> would give me out 10 matrices. Controlling for 2 Variables 100
> matrices. how can I run a loop to get f.e the 10 or 100 matrices at
> once? I appreciate for every hint. have nice holidays.

I don't quite understand this.  You have 10 variables and you want to
find the partial correlations controlling for two of them at a time.
If you take each possible set of two variables to condition upon at a
time, this would give you choose(10, 2) = 45 matrices, wouldn't it?
Where do you get '10 or 100' matrices from?

> 
> greetings lars
> 
> x <- read.table("C:/.....dat")
> dim(x) #200x10
> a <- matrix(0,200,10)
> for (i in 1:10)
> a[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> b <- matrix(0,200,10)
> for (i in 1:10)
> b[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2]))
> #a=round(a,5)
> #b=round(b,5)
> d <- cor(a,b)
> d

But a and b are the same, aren't they?  Why do you need to compute
them twice?  Why not just use cor(a, a) [which is the same as cor(a),
of course]?

There is a more serious problem here, though.  Your residuals are
after regression on x[, 1:2] so if you *select* x[, 1:2] as the
y-variables in your regression then the residuals are going to be
zero, but in practice round-off error.  so the first two rows and
columns of d will be correlations with round-off error,
i.e. misleading junk.  It doesn't make sense to include the
conditioning variables in the correlation matrix *conditioning on
them*.  Only the 8 x 8 matrix of the others among themselves is
defined, really.

So how do we do it?  Here are a few pointers.

To start, here is a function that uses a somewhat more compact way of
finding the partial correlations than your method.  Sorting out how it
works should be a useful exercise.

partialCorr <- function (cond, mat) 
	cor(qr.resid(qr(cbind(1, mat[, cond])), mat[, -cond]))

To find the matrix of partial correlations conditioning on x[, 1:2]
you would use

d <- partialCorr(c(1,2), x)

So how to do it for all possible conditioning pairs of variables.
Well you could do it in an obvious loop:

cmats <- array(0, c(8,8,45))
k <- 0
for(i in 1:9) for(j in (i+1):10) {
    k <- k+1
    cmats[, , k] <- partialCorr(c(i, j), x)
}

Now the results are in a 3-way array, but without any kind of labels.
Perhaps you should think about how to fix that one yourself...

With more than two conditioning variables you should probably use a
function to generate the subsets of the appropriate size rather than
trying to write ever more deeply nested loops.  There are plenty of
functions around to do this.

Bill Venables.



More information about the R-help mailing list