[R] linear regression "group by"

Michael Dewey info at aghmed.fsnet.co.uk
Wed Apr 16 18:52:15 CEST 2008


At 18:03 14/04/2008, Ryan Lauritsen wrote:
>Hi all.  I'm brand new to R.
>
>My dataset (stored in MySQL) is a list of weather stations in rows by
>year with various weather variables in columns, for example:
>STNID     YEAR  TEMP  DEWP
>station1    1990   54       50
>station1    1991   23       10
>station1    1992   34       18
>station2    1990   45       41
>station2    1991   32       25
>station2    1992   21       11
>
>I'm trying to run linear regression and get the basic output (i.e.
>intercept, slope, and significance) for each station.  I'm able to run
>the regression on the entire dataset using:
>lm(TEMP~DEWP, data=select)
>But is there a way to aggregate the data ("group by" in MySQL) by
>STNID during the regression?  Ideally I would just have a list of
>stations and their appropriate summary output, which I could use for
>further analysis.

In this particular case you might consider using lmList from the nlme 
package (or from lme4).
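
A minimal sketch of the lmList route, assuming the nlme package is
installed and that the data frame is called select, as in your lm()
call. The toy data are just the six rows from your message, so with
three points per station the fits are illustrative only. STNID is
made a factor here to be safe.

```r
library(nlme)

# Toy data copied from the question; three rows per station.
select <- data.frame(
  STNID = factor(rep(c("station1", "station2"), each = 3)),
  YEAR  = rep(1990:1992, times = 2),
  TEMP  = c(54, 23, 34, 45, 32, 21),
  DEWP  = c(50, 10, 18, 41, 25, 11)
)

# Fit TEMP ~ DEWP separately within each level of STNID.
fits <- lmList(TEMP ~ DEWP | STNID, data = select)

# One row per station, with the intercept and the DEWP slope.
station_coefs <- coef(fits)
print(station_coefs)

# Standard errors, t values and p values for each station.
print(summary(fits))
```

coef() gives you a data frame with one row per station, which should
be convenient for the further analysis you mention.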

More generally you could look at the family of apply functions 
(apply, tapply, sapply, and so on), often combined with split() to 
handle each group separately.
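
As a rough base R sketch of that approach (one possible way, not the
only one): split the data frame by station, then use sapply to fit a
model to each piece and pull out the intercept, slope and p-value of
the slope. The data frame name select comes from your lm() call; the
names by_station and results are made up for this example.

```r
# Toy data copied from the question; three rows per station.
select <- data.frame(
  STNID = rep(c("station1", "station2"), each = 3),
  YEAR  = rep(1990:1992, times = 2),
  TEMP  = c(54, 23, 34, 45, 32, 21),
  DEWP  = c(50, 10, 18, 41, 25, 11)
)

# A list with one data frame per station.
by_station <- split(select, select$STNID)

# Fit the regression within each station and keep the basic output.
# sapply returns one column per station, so transpose to get rows.
results <- t(sapply(by_station, function(d) {
  fit <- lm(TEMP ~ DEWP, data = d)
  c(intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    p_value   = summary(fit)$coefficients["DEWP", "Pr(>|t|)"])
}))

print(results)
```

The result is a matrix with one row per station; as.data.frame(results)
turns it into a data frame if that is easier to work with.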


>I've searched the manual, etc. for solutions, but have been
>unsuccessful.  Any assistance is greatly appreciated.
>
>Thank you,
>Ryan

Michael Dewey
http://www.aghmed.fsnet.co.uk


