[R] Re: Problem calculating multiple regressions on a data frame.

Petr PIKAL petr.pikal at precheza.cz
Tue Apr 27 14:13:55 CEST 2010


Hi

what about

fit <- lm(value~seqMonth+ids+variable, data=theTestLineal)

or a similar approach using

?lme
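
A note on the lm() call above: with purely additive terms the fit has one seqMonth slope shared by all groups, so a separate slope for each combination of ids and variable needs an interaction with seqMonth. The following is only a sketch on made-up data; the column names follow the thread, while grp, true_slope, and all the values are invented for illustration.

```r
# Made-up data: 2 ids x 2 variables, 6 months each. Column names mirror the
# thread; grp, true_slope and the values are assumptions for illustration.
set.seed(1)
d <- expand.grid(seqMonth = 1:6, ids = c("a", "b"), variable = c("x", "y"))
d$grp <- interaction(d$ids, d$variable, drop = TRUE)
true_slope <- c(a.x = 1, b.x = 2, a.y = -1, b.y = 0.5)
d$value <- true_slope[as.character(d$grp)] * d$seqMonth +
  rnorm(nrow(d), sd = 0.01)

# Additive model: a single seqMonth slope shared by every combination.
fit_add <- lm(value ~ seqMonth + ids + variable, data = d)

# Separate intercept and separate slope for each ids/variable combination.
fit_int <- lm(value ~ 0 + grp + grp:seqMonth, data = d)

# Keep only the per-combination slopes (coefficients named "grp<level>:seqMonth").
slopes <- coef(fit_int)[grep(":seqMonth$", names(coef(fit_int)))]
print(round(slopes, 2))
```

The point estimates from fit_int should match what separate lm() calls per group would give, but the residual variance is pooled over all groups, so standard errors and p-values can differ.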

See also

?interaction
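
As far as I can tell, the "0 (non-NA) cases" error quoted below comes from empty pieces: split() on a list of factors returns every combination of levels, including those with no rows, unless drop = TRUE is set. Here is a rough sketch with interaction() on invented data; the column names follow the thread, and the names pieces, enough, fits, and coefs are only illustrative.

```r
# Made-up data with the thread's column names; all values are invented.
set.seed(2)
d <- data.frame(
  ids      = c(rep("a", 5), rep("b", 5), "c"),
  variable = c(rep("x", 5), rep("y", 5), "x"),
  seqMonth = c(1:5, 1:5, 1)
)
d$value <- 3 + 2 * d$seqMonth + rnorm(nrow(d), sd = 0.01)

# drop = TRUE keeps only the combinations that actually occur. Without it,
# the empty combinations (b.x, a.y, c.y here) are returned as zero-row
# data frames, and lm() on such a piece stops with "0 (non-NA) cases".
pieces <- split(d, interaction(d$ids, d$variable, drop = TRUE))

# A slope needs at least two distinct time points, so skip smaller groups.
enough <- sapply(pieces, function(p) length(unique(p$seqMonth)) >= 2)
fits   <- lapply(pieces[enough], function(p) lm(value ~ seqMonth, data = p))

# One row per fitted combination, with intercept and slope as columns.
coefs <- t(sapply(fits, coef))
print(round(coefs, 2))
```

Groups with a single data point are skipped here rather than fitted; whether that is the right treatment depends on what you want to report for them.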

Regards
Petr

r-help-bounces at r-project.org wrote on 27.04.2010 13:48:32:

> Hi there,
> I am stuck trying to solve what should be a fairly easy problem.
> I have a data frame that essentially consists of (ID, time as seqMonth,
> variable, value), and I want to find the regression coefficient of value vs
> time for each combination of ID and variable.
> I have tried several approaches, and none of them seems to work as I
> expected.
> For example, I have tried:
> 
> theSplit <- split(theTestLineal, list(as.factor(theTestLineal$ids),
>                   as.factor(theTestLineal$variable)))
> 
> I can then use 
> lm(value~seqMonth,data=theSplit[[1]])
> ...
> lm(value~seqMonth,data=theSplit[[4]])
> 
> That works well (it fails for some combinations of ID and variable where
> there is one data point).
> 
> However, when I try to use an lapply:
> 
> lapply(theSplit, function(x) lm(value~seqMonth, data=x, na.action=na.exclude))
> 
> it fails, with error message:
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>  0 (non-NA) cases
> 
> I have tried to change the na.action with no success (na.pass, na.fail,
> na.exclude... all give the same error message)
> 
> 
> I have also tried to follow the approach suggested by Charles Sharpsteen
> (http://www.mail-archive.com/r-help@r-project.org/msg74759.html) with
> similar results.
> The code is as follows:
> theModels <- by( theTestLineal, list( theTestLineal$ids,
>     theTestLineal$variable ), function( dataSlice ){
>   linMod <- lm( value ~ seqMonth, data = dataSlice )
> 
>   # Slope and intercept may be recovered from the output of the coef()
>   # function:
>   intercept <- coef( linMod )[1]
>   slope <- coef( linMod )[2]
> 
>   # The R-Squared value is returned by the summary() function:
>   rsq <- summary( linMod )[[ 'r.squared' ]]
> 
>   # The summary function also provides statistics for the F-distribution,
>   # extract them, reformat as a list, rename and feed to pf() using
>   # do.call() in order to get the p-value:
>   fStats <- as.list( summary( linMod )[[ 'fstatistic' ]] )
>   names( fStats ) <- c( 'q', 'df1', 'df2' )
>   fStats[[ 'lower.tail' ]] <- FALSE
> 
>   pVal <- do.call( pf, fStats )
> 
>   return(data.frame( slope, intercept, rsq, pVal ))
> })
> 
> Any help will be appreciated!
> 