[R] linear regression on groups of consecutive rows of a matrix

Jim Bouldin jrbouldin at ucdavis.edu
Tue Nov 24 22:06:59 CET 2009


> But I do feel compelled to ask: Do you really get meaningful  
> information from lm applied to 5 cases? Especially when the predictors  
> used may not be the same from subset to subset???

Thanks again for your help, David.  Your question is a good one. It's a bit
complicated, but here are the basics. The predictors are the same between
subsets, in the sense that, for each group of rows (which represent tree
ring years), the predictors and predictands are always from the same set of
trees, even though that set changes slightly between consecutive subsets. 
Typically there will be 20+ observations per year (row), so for 5 rows I
have n = 100+.  For my purposes (removing the effect of tree size on ring
width for small groups of years) that is more than good enough.
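
To make the setup concrete, here is a minimal sketch of one regression per
block of 5 consecutive rows. It is an illustration under stated assumptions,
not a tested solution for the real data: size_mat and width_mat are
hypothetical names filled with simulated values (years in rows, trees in
columns, NAs in the same positions in both), and the block size of 5 is just
the example figure used in this thread.

```r
# Simulated stand-ins for the two matrices (hypothetical names and data):
# rows = years, columns = trees, NAs in identical positions in both.
set.seed(1)
n_years <- 20
n_trees <- 25
size_mat  <- matrix(runif(n_years * n_trees, 5, 50), n_years, n_trees)
width_mat <- 1.5 - 0.02 * size_mat +
  matrix(rnorm(n_years * n_trees, sd = 0.1), n_years, n_trees)
na_pos <- matrix(runif(n_years * n_trees) < 0.2, n_years, n_trees)
size_mat[na_pos]  <- NA
width_mat[na_pos] <- NA

# Group label for each row: rows 1-5 -> 1, rows 6-10 -> 2, and so on.
block <- 5
grp <- (seq_len(nrow(width_mat)) - 1) %/% block + 1

# One fit per block of rows. With the default na.action (na.omit), lm()
# drops incomplete cases itself, so the NAs need no separate filtering.
fits <- lapply(split(seq_len(nrow(width_mat)), grp), function(rows) {
  d <- data.frame(width = as.vector(width_mat[rows, , drop = FALSE]),
                  size  = as.vector(size_mat[rows, , drop = FALSE]))
  lm(width ~ size, data = d)
})

# Slope and number of cases actually used in each block's fit.
slopes <- sapply(fits, function(f) coef(f)[["size"]])
n_used <- sapply(fits, nobs)
```

Splitting the row indices rather than the matrix itself keeps each block as
a sub-matrix, and nobs() shows how many non-NA cases each fit really used.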

Now to try out your suggestion...
Jim


> 
> -- 
> David
> 
> On Nov 24, 2009, at 3:25 PM, Jim Bouldin wrote:
> 
> >
> > I want to perform linear regression on groups of consecutive rows--say
> > 5 to 10 such--of two matrices.  There are many such potential groups
> > because the matrices have thousands of rows. The matrices are both of
> > the form:
> >
> >> shp[1:5,16:20]
> >      SL495B SL004C SL005C SL005A SL017A
> > -2649   1.06   0.56     NA     NA     NA
> > -2648   0.97   0.57     NA     NA     NA
> > -2647   0.46   0.30     NA     NA     NA
> > -2646   0.92   0.48     NA     NA     NA
> > -2645   0.82   0.48     NA     NA     NA
> >
> > That is, they both have NA values, and non-NA values, in the same matrix
> > positions.  In my attempts so far, I have had two problems.  First, in
> > using the split function (which I assume is essential here), I am unable
> > to split the matrices by groups of rows (say rows 1 to 5, 6 to 10, etc):
> >
> >> shp_split = split(shp,row(shp))
> >
> > will split the matrix by rows but not by groups thereof. Stumped.
> >
> > Second, I cannot seem to get rid of the NA values, which would prevent
> > the regression even if I could figure out how to split the matrices
> > correctly, e.g.:
> >> shp_split = split(shp,row(shp))
> >> shp_split = shp_split[!is.na(shp_split)]
> >> shp_split[1]
> > $`1`
> >  [1] 0.68 0.28 0.43 0.47 0.64 0.40 0.69 0.56 0.62 0.40 1.01 0.67 0.17 1.36
> > 1.84 1.06 0.56   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
> >  NA   NA   NA etc
> >
> > IF I solve these problems, will I in fact be able to perform individual
> > linear regressions on the (numerous) collections of 5 to 10 rows?
> >
> > Thanks as always for any insight.
> >
> >
> > Jim Bouldin
> > Research Ecologist
> > Department of Plant Sciences, UC Davis
> > Davis CA, 95616
> > 530-554-1740
> >
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> 

Jim Bouldin, PhD
Research Ecologist
Department of Plant Sciences, UC Davis
Davis CA, 95616
530-554-1740



