[R] Problems with subset, droplevels and lm: variable lengths differ

Michael Friendly friendly at yorku.ca
Mon Apr 16 19:43:43 CEST 2012


[Env: R 2.14.2 / Win XP]

In the script below, I want to select some variables from
rrcov::OsloTransect, delete cases with any missing data, and subset
the data frame Oslo to remove cases for two levels of the factor litho
that occur with low frequency.

The checks I run on my new data frame Oslo look OK, but when I try to
fit a multivariate linear model with lm(), I get an error: variable
lengths differ (found for 'litho').
How can I fix this?

 > data(OsloTransect, package="rrcov")
 > # keep a subset of variables & rename some variables
 > Oslo <- OsloTransect[c("X.ID", "XCOO", "YCOO", "X.FOREST", "X.WEATHER", "X.FLITHO", "ALT")]
 > colnames(Oslo) <- c("site", "XC", "YC", "forest", "weather", "litho", "altitude")
 > Oslo <- cbind(Oslo, OsloTransect[,c("Cu", "Fe", "K", "Mg", "Mn", "P", "Zn")])
 > # make site a factor
 > Oslo[,"site"] <- factor(Oslo[,"site"])
 >
 > # log transform the chemical elements
 > Oslo[,8:14] <- log(Oslo[,8:14])
 >
 > # delete cases with missing data
 > Oslo <- Oslo[complete.cases(Oslo),]
 > nrow(Oslo)
[1] 350
 >
 > # delete low frequency litho=="GNEID_O" | "MICSH"
 > Oslo <- subset(Oslo, !litho %in% c("GNEID_O", "MICSH"), drop=TRUE)
 > nrow(Oslo)
[1] 332
 > Oslo <- droplevels(Oslo)
 > table(Oslo$litho)

  CAMSED GNEIS_O GNEIS_R    MAGM
      98      89      32     113
 > nrow(Oslo)
[1] 332
 > mod1 <- lm(cbind("Cu", "Fe", "K", "Mg", "Mn", "P", "Zn") ~ litho + forest + weather, data=Oslo)
Error in model.frame.default(formula = cbind("Cu", "Fe", "K", "Mg", 
"Mn",  :
   variable lengths differ (found for 'litho')
 >
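
One thing worth checking here, offered as a sketch rather than a confirmed
diagnosis of the session above: the response in the lm() call is built from
quoted names. cbind() on character strings returns a one-row character
matrix instead of the data columns, so the response would have 1 row while
litho has 332, which is consistent with the "variable lengths differ"
message. The example below uses a small made-up data frame (toy is
hypothetical and stands in for Oslo) to show the quoted and unquoted forms
side by side.

```r
# Toy stand-in for Oslo (hypothetical data, not rrcov::OsloTransect)
set.seed(1)
toy <- data.frame(
  litho = factor(rep(c("CAMSED", "MAGM"), each = 10)),
  Cu    = rnorm(20),
  Fe    = rnorm(20)
)

# Quoted names: cbind() binds two character strings, not the columns
bad <- cbind("Cu", "Fe")
dim(bad)    # 1 2
mode(bad)   # "character"

# The quoted form fails inside model.frame(); capture the error with try()
res <- try(lm(cbind("Cu", "Fe") ~ litho, data = toy), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Unquoted names are looked up in `data`, giving a 20 x 2 numeric response
mod <- lm(cbind(Cu, Fe) ~ litho, data = toy)
class(mod)   # "mlm" "lm"
```

If that is the cause, the subset() and droplevels() steps are not at fault,
and writing cbind(Cu, Fe, K, Mg, Mn, P, Zn) without quotes in the formula
should let the model fit.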



-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


