[R] Variable 'A' is not a factor Error message

S Ellison S.Ellison at LGCGroup.com
Wed Nov 16 20:01:27 CET 2016


This looks like one of those 'please talk to a statistician' questions ...

You appear to have requested a 12-run Plackett-Burman experiment, which is a design that accommodates up to 11 two-level factors. You then computed (I think) a simulated response for that design from the numeric values of the factor levels, which is a completely different thing from the factors themselves in model matrix terms. Your predictors are still two-level factors, so your linear model, after building a model matrix from the contrasts, has probably got one coefficient for the upper level of each two-level factor and one for the intercept. (I'm assuming default contrasts here.)
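A minimal sketch of that difference, using base R only and a made-up four-row data frame (not the design from the question). Treatment contrasts are set explicitly here as an assumption; a design object produced by a DoE package may carry its own contrast settings, in which case the column names and coding would differ.

```r
# Hypothetical two-level factor, mimicking one column of a screening design
d <- data.frame(A = factor(c(100, 1000, 100, 1000)))

# As a factor: one indicator column for the upper level (treatment contrasts)
mm_factor <- model.matrix(~ A, data = d,
                          contrasts.arg = list(A = "contr.treatment"))
colnames(mm_factor)    # "(Intercept)" "A1000"

# As a numeric predictor: one slope column holding the actual values
d$A_num <- as.numeric(as.character(d$A))
mm_numeric <- model.matrix(~ A_num, data = d)
colnames(mm_numeric)   # "(Intercept)" "A_num"
```

The factor version carries only "is this row at the upper level?", while the numeric version carries the values themselves.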

You then decided to redefine your predictors as numeric variables taking a very large number of distinct values, drawn from rnorm (and runif), none of which (except by very rare coincidence) were in your original design. So your model cannot possibly predict the output: it has no coefficients for all those new values. R quite accurately told you that you can't do that.
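The mismatch can be reproduced on a small scale. This is a sketch with invented data (the names d, fit, new_numeric and new_factor are illustrative only): a model fitted on a two-level factor is asked to predict from numeric values, and the resulting condition is captured rather than printed.

```r
set.seed(1)

# Toy data: response depends on a two-level factor A
d <- data.frame(A = factor(rep(c(100, 1000), each = 4)))
d$y <- ifelse(d$A == "1000", 5, 1) + rnorm(8, sd = 0.1)
fit <- lm(y ~ A, data = d)

# Numeric newdata: predict() signals a condition instead of returning values
new_numeric <- data.frame(A = c(450, 520))
res <- tryCatch(predict(fit, new_numeric),
                warning = function(w) w,
                error   = function(e) e)
inherits(res, "condition")
conditionMessage(res)

# Newdata restricted to the fitted factor levels works
new_factor <- data.frame(A = factor(c("100", "1000"), levels = levels(d$A)))
pred <- predict(fit, new_factor)
```

Prediction only succeeds for the levels that were present when the model was fitted.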

If you want to fit a linear model with continuous variables, you need to set up your DOE data frame with (meaningful) numeric predictors, not factors. You will then get a numerical gradient (slope) for each predictor instead of a single offset for each upper level. That isn't really what Plackett and Burman had in mind, so I would not normally start with a P-B design if I wanted to do that. Consider a response surface model instead.
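A rough sketch of the numeric-predictor route, again with base R and invented values: the two-factor full factorial, the response formula and the input distributions are all assumptions for illustration, and a two-level design can only support a straight-line effect for each predictor.

```r
set.seed(42)

# Hypothetical design with numeric (not factor) predictors, replicated once
design <- expand.grid(A = c(100, 1000), B = c(100, 200))
design <- rbind(design, design)

# Made-up response, linear in A and B plus noise
design$y <- 0.02 * design$A + 0.5 * design$B + rnorm(nrow(design), sd = 1)

# One slope per predictor
fit_num <- lm(y ~ A + B, data = design)
coef(fit_num)

# Monte Carlo inputs: continuous draws are now valid newdata
n <- 1000
mc <- data.frame(A = rnorm(n, 450, 100), B = rnorm(n, 150, 10))

# predict() is vectorised over the rows of newdata, so no loop is needed
pred <- predict(fit_num, newdata = mc)
summary(pred)
```

A single call to predict() returns one prediction per row of mc; hist(pred) would then show the simulated distribution.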

S Ellison


> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of ahmed
> meftah
> Sent: 11 November 2016 17:52
> To: r-help at r-project.org
> Subject: [R] Variable 'A' is not a factor Error message
> 
> I am running a DOE with the following code:
>
>     library(Rcmdr)
>     library(RcmdrMisc)
>     library(RcmdrPlugin.DoE)
> # Define plackett burman experiment
>     PB.DOE <- pb(nruns= 12 ,n12.taguchi= FALSE ,nfactors= 12 -1, ncenter= 0 ,
>                  replications= 1 ,repeat.only= FALSE ,randomize= TRUE ,seed= 27241 ,
>                  factor.names=list( A=c(100,1000),B=c(100,200),C=c(1,3),D=c(1,1.7),
>                                     E=c(1000,1500),G=c(-2,2) ) )
>
>     as.numeric2 <- function(x) as.numeric(as.character(x))
>
> # Calculate response column
>     IP <- with(PB.DOE,(as.numeric2(A)*as.numeric2(B)*(5000-3000))/
>                (141.2*as.numeric2(C)*as.numeric2(D)*
>                 (log(as.numeric2(E)/0.25)-(1/2)+as.numeric2(G))))
> # Combine response column with exp design table
>     final_set <- within(PB.DOE, {
>       IP<- ((as.numeric2(A)*as.numeric2(B)*(5000-3000))/
>             (141.2*as.numeric2(C)*as.numeric2(D)*
>              (log(as.numeric2(E)/0.25)-(1/2)+as.numeric2(G))))
>     })
>
> I then ran a regression as follows:
>
>     LinearModel.1 <- lm(IP ~ A + B + C + D + E + G,
>                     data=final_set)
>     summary(LinearModel.1)
>
> Following this I wanted to run a predict using specified values as
> predictors in a Monte Carlo:
>
> n = 10000
> # Define probability distributions of predictors
> A = rnorm(n,450,100)
> hist(A,col = "blue",breaks = 50)
> 
> B = rnorm(n, 150,10)
> hist(B,col = "blue",breaks = 50)
> 
> C = rnorm(n, 1.5, 0.5)
> hist(C,col = "blue",breaks = 50)
> 
> D = runif(n,1.2,1.7)
> hist(D,col = "blue",breaks = 50)
> 
> E = rnorm(n,1250,50)
> hist(E,col = "blue",breaks = 50)
> 
> G = rnorm(n,0,0.5)
> hist(G,col = "blue",breaks = 50)
> 
> MCtable <- data.frame(A=A,B=B,C=C,D=D,E=E,G=G)
> 
> for (n in 1:n) {
>   N=predict(LinearModel.1,MCtable)
> }
> 
> hist(N,col = "yellow",breaks = 10)
>
> I end up getting this error:
>
> "Warning in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
>   variable 'A' is not a factor"
>
> Using str() to get some info on LinearModel.1, from what I understand it
> seems to indicate that since the predictors A, B, C etc. are factors with
> 2 levels, I have to convert my data.frame table to factors as well. Is
> that correct?
>
> Doing this would mean I would also need to specify the number of levels,
> which, since I have set my n to 10000, would mean 10000 levels for each
> factor. How would I go about doing this? Is there a better solution? Any
> help would be appreciated.

