[R] randomForests predict problem

Torsten Hothorn Torsten.Hothorn at rzmail.uni-erlangen.de
Wed Apr 2 15:46:05 CEST 2003


On Wed, 2 Apr 2003, Liaw, Andy wrote:

> Yves,
>
> I will add checks for NAs in predict.randomForest().
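
Until such a check is in the package, something along these lines could be run on newdata before calling predict(). This is only a minimal sketch in base R, assuming newdata is a data frame; `check_newdata` is a hypothetical helper name and not part of randomForest.

```r
# Hypothetical helper (not part of randomForest): stop early when newdata
# contains NAs and report which columns are affected.
# Assumes newdata is a data frame, so that column names are available.
check_newdata <- function(newdata) {
  na_counts <- colSums(is.na(newdata))
  if (any(na_counts > 0)) {
    bad <- names(na_counts)[na_counts > 0]
    stop("NAs found in newdata column(s): ", paste(bad, collapse = ", "))
  }
  invisible(newdata)
}

# Toy data frames standing in for a real test set.
ok  <- data.frame(x = c(1, 2, 3), f = factor(c("a", "b", "a")))
bad <- data.frame(x = c(1, NA, 3), f = factor(c("a", NA, "a")))

check_newdata(ok)  # passes silently

# Capture the error message instead of aborting the script.
msg <- tryCatch(check_newdata(bad), error = conditionMessage)
print(msg)
```

Failing loudly seemed safer to me than silently dropping rows, since the caller then knows the predictions would not line up with the original test set.
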
>
> In the next version of randomForest (currently called 3.9-x), there will be
> facilities for handling NAs in the training set.  However, there's no way to
> handle NAs in the test set yet.  I believe Leo is still working on that.
>
> In Leo's v.4 of the Fortran code, he uses proximity from random forest to
> iteratively impute NAs, starting with column median or mode (depending on
> variable types).  I've implemented this scheme at the R level, so that it
> works for both regression and classification.
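
For readers who want to see the starting point of that scheme in isolation, here is a minimal sketch in base R of the median/mode fill only. The proximity-based iterations are not shown, and `rough_fill` is a hypothetical name, not Leo's Fortran code or the R-level implementation mentioned above. It assumes numeric or factor/character columns with at least one observed value each.

```r
# Hypothetical sketch of the initial fill: the column median for numeric
# variables, the most frequent value for factor/character variables.
rough_fill <- function(df) {
  for (j in seq_along(df)) {
    x <- df[[j]]
    miss <- is.na(x)
    if (!any(miss)) next
    if (is.numeric(x)) {
      x[miss] <- median(x, na.rm = TRUE)
    } else {
      # table() ignores NAs by default; ties go to the first entry.
      tab <- table(x)
      x[miss] <- names(tab)[which.max(tab)]
    }
    df[[j]] <- x
  }
  df
}

# Toy data with one missing value per column.
toy <- data.frame(
  num = c(1, NA, 3, 10),
  cat = factor(c("a", "b", NA, "b"))
)
filled <- rough_fill(toy)
print(filled)
```
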
>
> There are a couple of things in Leo's new code that I have not added to the
> package, and that's why the version is 3.9 rather than 4.0.  If you would
> like to test the new code, please let me know.

yes, sure!

best,

Torsten

>
> Cheers,
> Andy
>
> > -----Original Message-----
> > From: Yves Brostaux [mailto:brostaux.y at fsagx.ac.be]
> > Sent: Wednesday, April 02, 2003 8:34 AM
> > To: r-help at stat.math.ethz.ch
> > Cc: Liaw, Andy; Torsten Hothorn
> > Subject: RE: [R] randomForests predict problem
> >
> >
> > I use randomForest version 3.4-4, but yes, now that I have correctly
> > omitted the NAs, it works. I must have made a mistake when removing
> > them the first time.
> >
> > I was surprised that this method has no way to deal with NAs other
> > than omitting them. As Torsten Hothorn suggested, the associated
> > predict function should then check for NAs in newdata, shouldn't it?
> >
> > Thank you both for your answers!
> >
> > At 15:12 02/04/03, Liaw, Andy wrote:
> > >Yves,
> > >
> > >Which version of the package are you using?  I get:
> > >
> > > > soy <- na.omit(Soybean)
> > > > ts <- sample(nrow(soy), 150, replace=FALSE)
> > > > sb.rf <- randomForest(Class ~ ., data=soy[-ts,])
> > > > table(predict(sb.rf, soy[ts,], type="class"))
> > >
> > >                2-4-d-injury         alternarialeaf-spot
> > >                           0                          37
> > >                 anthracnose            bacterial-blight
> > >                          10                           3
> > >           bacterial-pustule                  brown-spot
> > >                           2                          29
> > >              brown-stem-rot                charcoal-rot
> > >                          11                           7
> > >               cyst-nematode diaporthe-pod-&-stem-blight
> > >                           0                           0
> > >       diaporthe-stem-canker                downy-mildew
> > >                           4                           8
> > >          frog-eye-leaf-spot            herbicide-injury
> > >                          17                           0
> > >      phyllosticta-leaf-spot            phytophthora-rot
> > >                           3                           5
> > >              powdery-mildew           purple-seed-stain
> > >                           4                           5
> > >        rhizoctonia-root-rot
> > >                           5
> > >
> > >Cheers,
> > >Andy
> > >
> > > > -----Original Message-----
> > > > From: Yves Brostaux [mailto:brostaux.y at fsagx.ac.be]
> > > > Sent: Wednesday, April 02, 2003 4:46 AM
> > > > To: r-help at stat.math.ethz.ch
> > > > Subject: [R] randomForests predict problem
> > > >
> > > >
> > > > Hello everybody,
> > > >
> > > > I'm testing the randomForest package in order to run some
> > > > simulations, and I am having trouble predicting new values. The
> > > > random forest computation is fine, but each time I try to predict
> > > > values with the newly created object, I get an error message. I
> > > > thought it was because of NA values in the data frame, but I
> > > > cleaned them and still got the same error. What am I doing wrong?
> > > >
> > > >  > library(mlbench)
> > > >  > library(randomForest)
> > > >  > data(Soybean)
> > > >  > test <- sample(1:683, 150, replace=F)
> > > >  > sb.rf <- randomForest(Class~., data=Soybean[-test,])
> > > >  > sb.rf.pred <- predict(sb.rf, Soybean[test,])
> > > > Error in matrix(t1$countts, nr = nclass, nc = ntest) :
> > > >          No data to replace in matrix(...)
> > > >
> > > > I did the same thing with rpart and it all worked fine:
> > > >  > library(rpart)
> > > >  > sb.rp <- rpart(Class~., data=Soybean[-test,])
> > > >  > sb.rp.pred <- predict(sb.rp, Soybean[test,], type="class")
> > > >
> > > > Thank you all for any advice you can give me.
> > > >
> > > > --
> > > > Ir. Yves Brostaux - Statistics and Computer Science Dpt.
> > > > Gembloux Agricultural University
> > > > 8, avenue de la Faculté B-5030 Gembloux (Belgium)
> > > > Tél : +32 (0)81 62 24 69
> > > > E-mail : brostaux.y at fsagx.ac.be
> > > > Web : http://www.fsagx.ac.be/si/
> > > >
> > > >
> > >
> > >------------------------------------------------------------------------------
> > >Notice: This e-mail message, together with any attachments, contains
> > >information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA)
> > >that may be confidential, proprietary copyrighted and/or legally
> > >privileged, and is intended solely for the use of the individual or entity
> > >named on this message.  If you are not the intended recipient, and have
> > >received this message in error, please immediately return this by e-mail
> > >and then delete it.
> > >
> > >==============================================================================
> >
> >
>
>


