[R] building generic regression tool - error invalid type (list) for variable

Brianna Noland brianna.noland at gmail.com
Fri Dec 5 06:01:36 CET 2014


Hello,

First, I apologize in advance for my limited knowledge of R. For an R
course I am writing a generic tool which will execute models based on
inputs. When I attempt to run one of the input models, I get the error:

    invalid type (list) for variable 'MPG'

which has been described here:
http://r.789695.n4.nabble.com/Error-invalid-type-list-for-variable-when-using-lm-td3045462.html
and
http://stats.stackexchange.com/questions/70990/error-with-lm-common-r-mistake

However, these have not helped. Let me describe the problem in detail. I have
code that takes a list of models and input data and performs a regression for
each model. Since this is generic, I build up a data frame based on which
variables the model includes, and then build a formula regressing the
dependent variable, which is the same for every model, on the independent
variables for this particular model.

The following code:

     print(disjointSet(names(data), depVariable))
     print(depVariable)
     formula <- reformulate(termlabels = disjointSet(names(data), depVariable),
                            response = depVariable)
     print(Reduce(paste, deparse(formula)))

has the following output:

    [1] "VOL" "HP"
    [1] "MPG"
    [1] "MPG ~ VOL + HP"
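
For reference, the formula construction above can be sketched with base R alone. This assumes that disjointSet (not shown in this post) behaves like setdiff, i.e. it returns every column name except the dependent variable. The column names are the ones from the output above; everything else is illustrative.

```r
# Assumption: disjointSet(x, y) behaves like base R's setdiff(x, y).
depVariable <- "MPG"
columns <- c("VOL", "HP", "MPG")   # stands in for names(data)

# Every column except the response becomes a predictor.
predictors <- setdiff(columns, depVariable)

# reformulate() builds a formula object from character vectors.
f <- reformulate(termlabels = predictors, response = depVariable)

# deparse() turns the formula back into text for printing.
formulaText <- paste(deparse(f), collapse = " ")
print(formulaText)
```

The formula itself only carries variable names, so it says nothing about how the columns are stored in the data frame.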

As you can see, this particular dataset is related to automobiles and MPG
is the dependent variable.
For one model, VOL and HP are independent variables.

The following two lines:

     print(class(data))
     print(data)

result in:

    [1] "data.frame"
       VOL  HP  MPG
    1   89  49 65.4
    2   92  55   56
... (remaining data at http://pastebin.com/3Mnwn1Ve)
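
Note that class(data) only reports the class of the container, not of the individual columns. A minimal sketch of a per-column check follows. The example frame reuses the two rows shown above, with MPG deliberately stored as a list column; whether the real data looks like this is an assumption, not something the output above confirms.

```r
# Illustrative data only: the two rows shown above, with MPG deliberately
# stored as a list column (an assumption about how the problem might arise).
example <- data.frame(VOL = c(89, 92), HP = c(49, 55))
example$MPG <- list(65.4, 56)

# The container is still a data.frame ...
print(class(example))

# ... but checking each column separately exposes the list column.
columnClasses <- sapply(example, class)
print(columnClasses)

# Names of the columns that are lists rather than atomic vectors.
listColumns <- names(example)[sapply(example, is.list)]
print(listColumns)
```

Calling str(example) gives the same per-column information in one step.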

Of course the error occurs when I call lm:

    lm(formula, data = data)

Here is the full error:

    Error in model.frame.default(formula = formula, data = data, drop.unused.levels = TRUE) :
      invalid type (list) for variable 'MPG'
    7: model.frame.default(formula = formula, data = data, drop.unused.levels = TRUE)
    6: stats::model.frame(formula = formula, data = data, drop.unused.levels = TRUE)
    5: eval(expr, envir, enclos)
    4: eval(mf, parent.frame())
    3: lm(formula, data = data) at #15
    2: fitness(filteredData, depVariable, regType, criterion) at #42
    1: runModels(parent_pop, depVariable = "MPG", regType = "LM", criterion = "AIC")
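
The same message can be reproduced in isolation. The sketch below assumes the MPG column is a list rather than a numeric vector, uses made-up numbers instead of the pastebin data, and introduces the name autos purely for illustration. It then shows one possible remedy, flattening the column with unlist, which may or may not match the actual cause in my code.

```r
# Made-up numbers for illustration; not the data from the pastebin link.
autos <- data.frame(VOL = c(89, 92, 92, 50, 94),
                    HP  = c(49, 55, 70, 62, 80))
autos$MPG <- list(65.4, 56.0, 49.0, 46.5, 40.1)   # list column, as assumed

f <- MPG ~ VOL + HP

# lm() fails while building the model frame because MPG is a list.
msg <- tryCatch({
  lm(f, data = autos)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Flattening the list into a numeric vector lets the same call succeed.
# unlist() is only safe here because every element has length one.
autos$MPG <- unlist(autos$MPG)
fit <- lm(f, data = autos)
print(coef(fit))
```

If a data frame is assembled column by column from list elements, extracting with [[ ]] rather than [ ] is one way to avoid ending up with list columns in the first place.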

Do you have any idea what my mistake might be?



