[R] Matrix/dataframe indexing

Marc Schwartz marc_schwartz at comcast.net
Tue Mar 6 18:02:40 CET 2007


On Mon, 2007-03-05 at 12:49 -0500, Guenther, Cameron wrote: 
> Hi all, 
> I am hoping someone can help me out with this:
> 
> If I have a data frame of years and ages where the first column and
> first row are filled with leading values:
> 
> Df <-   age1   age2   age3
>   Yr1   1      0.4    0.16
>   Yr2   1.5    0      0
>   Yr3   0.9    0      0
>   Yr4   1      0      0
>   Yr5   1.2    0      0
>   Yr6   1.4    0      0
>   Yr7   0.8    0      0
>   Yr8   0.6    0      0
>   Yr9   1.1    0      0
> 
> Now the rest of the cells need to be filled according to the previous
> year and age cell. So, for example, cell [2,2] should be the value in
> cell [1,1] * exp(0.3), and cell [2,3] should be the value in cell
> [1,2] * exp(0.3), etc.
> 
> How do I write the for loop so that it will calculate the missing cell
> values over both dimensions of the dataframe?
> 
> Thanks in advance	

Cameron,

I have not seen a reply to this, but one of the problems that you can
run into is that, depending upon the approach, you can in effect execute
the manipulation on the third column before the second column in the
actual source matrix has been updated, due to object subsetting and
copying.
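
To illustrate that pitfall, here is a minimal sketch. It rebuilds the
example data from the question (the data.frame() call and the names
'oneshot' and 'stepwise' are my own, for illustration only) and
compares a single combined assignment with the step-by-step version:

```r
# Rebuild the example data (values taken from the question).
DF <- data.frame(
  age1 = c(1, 1.5, 0.9, 1, 1.2, 1.4, 0.8, 0.6, 1.1),
  age2 = c(0.4, rep(0, 8)),
  age3 = c(0.16, rep(0, 8)),
  row.names = paste0("Yr", 1:9)
)

# One-shot attempt: the right-hand side is evaluated in full, using the
# still-empty age2 column, before anything is assigned.
oneshot <- DF
oneshot[-1, 2:3] <- DF[-9, 1:2] * exp(0.3)

# Step-by-step version: age3 is computed from the already-updated age2.
stepwise <- DF
stepwise[-1, 2] <- stepwise[-9, 1] * exp(0.3)
stepwise[-1, 3] <- stepwise[-9, 2] * exp(0.3)

oneshot$age3   # only Yr2 is filled; Yr3 to Yr9 remain 0
stepwise$age3  # every row is filled
```

In the one-shot form, age3 for Yr3 onwards is computed from the old
zeros in age2, which is presumably not what is wanted.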

So, my knee-jerk reaction here is to simply do this in two lines of
code, one to fill the second column (age2) from the first and then a
separate line to fill the third column (age3) from the updated second.
I think that this is what you want as an end result:

> DF
    age1 age2 age3
Yr1  1.0  0.4 0.16
Yr2  1.5  0.0 0.00
Yr3  0.9  0.0 0.00
Yr4  1.0  0.0 0.00
Yr5  1.2  0.0 0.00
Yr6  1.4  0.0 0.00
Yr7  0.8  0.0 0.00
Yr8  0.6  0.0 0.00
Yr9  1.1  0.0 0.00


DF[-1, 2] <- DF[-9, 1] * exp(0.3)

> DF
    age1      age2 age3
Yr1  1.0 0.4000000 0.16
Yr2  1.5 1.3498588 0.00
Yr3  0.9 2.0247882 0.00
Yr4  1.0 1.2148729 0.00
Yr5  1.2 1.3498588 0.00
Yr6  1.4 1.6198306 0.00
Yr7  0.8 1.8898023 0.00
Yr8  0.6 1.0798870 0.00
Yr9  1.1 0.8099153 0.00


DF[-1, 3] <- DF[-9, 2] * exp(0.3)

> DF
    age1      age2      age3
Yr1  1.0 0.4000000 0.1600000
Yr2  1.5 1.3498588 0.5399435
Yr3  0.9 2.0247882 1.8221188
Yr4  1.0 1.2148729 2.7331782
Yr5  1.2 1.3498588 1.6399069
Yr6  1.4 1.6198306 1.8221188
Yr7  0.8 1.8898023 2.1865426
Yr8  0.6 1.0798870 2.5509663
Yr9  1.1 0.8099153 1.4576950
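
Since the question asked specifically about a for loop, here is a
sketch of an explicit double loop that should give the same result. It
assumes the same starting values as above; the data.frame() call and
the name 'filled' are mine, not from the original post. The columns are
processed left to right, so cell [i - 1, j - 1] has always been filled
before cell [i, j] needs it:

```r
# Starting data: first row and first column hold the leading values.
DF <- data.frame(
  age1 = c(1, 1.5, 0.9, 1, 1.2, 1.4, 0.8, 0.6, 1.1),
  age2 = c(0.4, rep(0, 8)),
  age3 = c(0.16, rep(0, 8)),
  row.names = paste0("Yr", 1:9)
)

filled <- DF
for (j in 2:ncol(filled)) {     # columns, left to right
  for (i in 2:nrow(filled)) {   # rows, top to bottom
    # Each cell is the previous year's, previous age's value times exp(0.3).
    filled[i, j] <- filled[i - 1, j - 1] * exp(0.3)
  }
}

filled
```

For a table this small, the speed difference compared with the two
vectorized lines is unlikely to matter.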


I think that the risk inherent in R sometimes is that there can be a
tendency to 'overthink' a problem in either trying to vectorize a
function or in trying to create (or avoid) a loop, when individual code
statements can just "get the job done" quickly and simply, and in many
cases be more 'readable'.

If this were something that you were going to do repeatedly and you
needed to generalize the approach to matrices where the dimensions are
not known a priori, then it might be worthwhile to encapsulate the
above in a function where dims can be checked, etc.
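
As a rough sketch of what such a function might look like (the name
fill_diagonal_growth() and its 'rate' argument are hypothetical, and
the checks shown are only one reasonable choice):

```r
# Hypothetical helper: fills every cell outside the first row and first
# column with the value one row up and one column left, times exp(rate).
fill_diagonal_growth <- function(x, rate = 0.3) {
  stopifnot(
    is.data.frame(x) || is.matrix(x),
    nrow(x) >= 2, ncol(x) >= 2,
    all(vapply(as.data.frame(x), is.numeric, logical(1)))
  )
  n <- nrow(x)
  # One vectorized assignment per column, left to right, so that each
  # column is built from the already-updated column before it.
  for (j in seq_len(ncol(x))[-1]) {
    x[-1, j] <- x[-n, j - 1] * exp(rate)
  }
  x
}

# Small made-up example: a 3 x 3 matrix with leading values in the
# first row and first column.
m <- matrix(0, nrow = 3, ncol = 3)
m[, 1] <- c(1, 2, 3)
m[1, ] <- c(1, 0.5, 0.25)
res <- fill_diagonal_growth(m)
res
```

The loop runs over columns only, so each step is the same kind of
single-line assignment used above, just repeated for however many
columns there are.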

HTH,

Marc Schwartz


