[R] Regression by factor using "sapply"

elh ehansen at usgande.com
Wed Aug 24 19:51:37 CEST 2011


Apologies for the elementary nature of the question (yes, I'm another
newbie)...

I'd like to perform a multiple regression on a single data set of energy
consumption and temperatures.  It contains account number, usage (KWh),
heating degree days (HDD) and cooling degree days (CDD).  I want to get the
coefficients back from the following model:
    lm(AvgKWh ~ AvgHDD + AvgCDD, data=usage)

Given that the data set contains the usage of different accounts (e.g. some
large energy users and some small energy users), I do not want to fit the
model just once.  Instead, I want to re-calculate the coefficients
(and associated measures of goodness of fit) for each account using the same
formula and return the corresponding coefficients by account number.

I thought I had figured out how to do this using "by" and "sapply",
but I keep getting this error message: "$ operator is invalid for atomic
vectors"

Here is what I've been trying to use:
# data is stored in a table called "usage"; other than the "ActNo" field,
# all the fields are numeric
byDD <- function(data) {lm(AvgKWh~ AvgHDD + AvgCDD, data=data)}
byActNo <- by(usage, usage$ActNo, FUN=byDD)
sapply(byActNo, summary(byActno)$coef)
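
For reference, here is a minimal sketch of how the last line could be reworked. The second argument of sapply has to be a function; in the call above, summary(...)$coef is evaluated once, up front, on the whole "by" object (note also that "byActno" is spelled differently from "byActNo"), which appears to be where the "$" error comes from. The data below is synthetic and the account numbers are made up; only the column names are taken from the post.

```r
# Synthetic stand-in for the "usage" table (hypothetical values and account numbers)
set.seed(1)
usage <- data.frame(
  ActNo  = rep(c("A100", "A200", "A300"), each = 12),
  AvgHDD = runif(36, 0, 30),
  AvgCDD = runif(36, 0, 15)
)
usage$AvgKWh <- 500 + 20 * usage$AvgHDD + 35 * usage$AvgCDD + rnorm(36, sd = 25)

# Fit one model per account, as in the original attempt
byDD <- function(data) lm(AvgKWh ~ AvgHDD + AvgCDD, data = data)
byActNo <- by(usage, usage$ActNo, FUN = byDD)

# Pass a function to sapply; it is called once for each fitted model
coefs <- sapply(byActNo, coef)                            # one column per account
r2    <- sapply(byActNo, function(m) summary(m)$r.squared)

# One row per account: intercept, both slopes, and R-squared
result <- data.frame(t(coefs), R2 = r2, check.names = FALSE)
print(result)
```

If standard errors and p-values are wanted as well, lapply(byActNo, function(m) summary(m)$coefficients) should return the full coefficient table for each account as a list.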

Thanks in advance!  I'm sure a similar question has been covered somewhere,
but every time I follow the message stream I hit a dead end.

