[R] Determining a linear model based on a factor

Jason Rupert jasonkrupert at yahoo.com
Fri Oct 16 17:33:35 CEST 2009


I should disclose up front that I am not a statistician by schooling, but I am interested in getting the terminology correct, so please correct me if I butcher it too badly.

I have been able to build a linear model relating two variables quite easily, e.g. square footage as a function of year built:
# YearBuilt is stored as character (or factor) in Home_DF, so convert it to numeric
HomeSqFt_lm<-lm(SqFootage ~ as.numeric(as.character(YearBuilt)), data=Home_DF)
summary(HomeSqFt_lm)

I would, however, like to use lm() to produce a linear model of the same two variables separately for each neighborhood, e.g. square footage vs. year built for neighborhood 1, then for neighborhood 2, etc.

Is that possible using the lm() command? An example of my dataset is shown below.


sample_size<-200

# Simulated values; sampling is with replacement
Home_SqFootage<-sample(1200:3600, size=sample_size, replace=TRUE)
Home_Year_Built<-sample(1989:2008, size=sample_size, replace=TRUE)
Home_Year_Sold<-sample(1989:2008, size=sample_size, replace=TRUE)
Neighborhood<-sample(1:4, size=sample_size, replace=TRUE)

# The years are deliberately stored as character, as in my real data
Home_DF<-data.frame(SqFootage=Home_SqFootage, YearBuilt=as.character(Home_Year_Built), YearSold=as.character(Home_Year_Sold), Neighborhood=Neighborhood)
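
To make the goal concrete, here is a minimal sketch of the kind of per-neighborhood fit being asked about. It is only one possible approach, not a definitive answer. It rebuilds the example data so that it runs on its own. The names YearBuiltNum, fits_by_nbhd, coefs_by_nbhd and fit_interaction are made up for illustration, and the set.seed() call is an assumption added only for reproducibility.

```r
# Rebuild the example data so the sketch is self-contained.
set.seed(1)  # assumption: seed added only so the sketch is reproducible
sample_size <- 200
Home_DF <- data.frame(
  SqFootage    = sample(1200:3600, size = sample_size, replace = TRUE),
  YearBuilt    = as.character(sample(1989:2008, size = sample_size, replace = TRUE)),
  Neighborhood = sample(1:4, size = sample_size, replace = TRUE)
)

# Convert the year to numeric once (works whether it is stored as
# character or factor) and treat the neighborhood as a factor.
Home_DF$YearBuiltNum <- as.numeric(as.character(Home_DF$YearBuilt))
Home_DF$Neighborhood <- factor(Home_DF$Neighborhood)

# Option 1: a separate lm() fit for each neighborhood.
fits_by_nbhd <- lapply(
  split(Home_DF, Home_DF$Neighborhood),
  function(d) lm(SqFootage ~ YearBuiltNum, data = d)
)
coefs_by_nbhd <- t(sapply(fits_by_nbhd, coef))  # one row per neighborhood
print(coefs_by_nbhd)

# Option 2: a single lm() with its own intercept and slope per neighborhood,
# expressed through the factor-by-covariate interaction.
fit_interaction <- lm(SqFootage ~ Neighborhood * YearBuiltNum, data = Home_DF)
print(summary(fit_interaction))
```

Option 1 gives one independent model object per neighborhood. Option 2 yields the same fitted lines but with a pooled residual variance, and its interaction terms show how each neighborhood's slope differs from that of the reference neighborhood.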



