[R] Random Forest

Liaw, Andy andy_liaw at merck.com
Wed Mar 10 18:34:10 CET 2010


Thanks for providing the code that allows me to reproduce the problem.
It looks like the prediction routine for some reason returns "0" as
prediction for some trees, thus causing the problem observed.  I'll look
into it.

Andy  
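
A minimal sketch of how the per-tree votes could be inspected while the cause is investigated. It assumes the `individual` component returned by `predict(..., predict.all=TRUE)` is a character matrix with one column per tree, as in the `str(pr)` output quoted below. `check_tree_votes` is a hypothetical helper, not part of randomForest, and the vote matrix here is synthetic toy data, not output from the real forest.

```r
# Hypothetical helper (not part of randomForest): summarise the per-tree
# votes for the first predicted row and flag labels that are not valid
# response levels.
check_tree_votes <- function(individual, class_levels, ntree) {
  votes <- as.vector(individual[1, ])               # votes for row 1 only
  list(
    n_votes    = length(votes),                     # votes actually returned
    n_missing  = ntree - length(votes),             # trees with no recorded vote
    unexpected = setdiff(unique(votes), class_levels),  # labels outside the levels
    counts     = table(factor(votes, levels = class_levels))
  )
}

# Synthetic stand-in for pr$individual: 7 votes from a pretend 8-tree forest,
# one of which carries the invalid label "0".
toy <- matrix(c(rep("small", 5), "0", "large"), nrow = 1,
              dimnames = list("1", NULL))
res <- check_tree_votes(toy, c("small", "large"), ntree = 8)
print(res)
```

With the objects from the quoted message, the corresponding call would be `check_tree_votes(pr$individual, levels(adult$YearIncome), ntree = 200)`.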

From: Dror
> 
> Hi,
> Thank you for your replies.
> As for the prediction length, I ran this code:
> "
> library(arules)
> data(AdultUCI)
> AdultUCI$workclass<-factor(AdultUCI$workclass, levels =
> c(levels(AdultUCI$workclass), "UNKNOWN"))
> AdultUCI$workclass[is.na(AdultUCI$workclass)]<-"UNKNOWN"
> AdultUCI$occupation<-factor(AdultUCI$occupation, levels =
> c(levels(AdultUCI$occupation), "UNKNOWN"))
> AdultUCI$occupation[is.na(AdultUCI$occupation)]<-"UNKNOWN"
> AdultUCI$"native-country"<-factor(AdultUCI$"native-country", levels =
> c(levels(AdultUCI$"native-country"), "UNKNOWN"))
> AdultUCI$"native-country"[is.na(AdultUCI$"native-country")]<-"UNKNOWN"
> tt<-sort(table(AdultUCI$"native-country"))
> AdultUCI$"native-country"<-factor(AdultUCI$"native-country", levels =
> c(levels(AdultUCI$"native-country"), "OTHER"))
> AdultUCI$"native-country"[!is.na(match(AdultUCI$"native-country",
> names(tt[1:11])))]<-"OTHER"
> drop=names(tt[1:11])
> AdultUCI$"native-country"<-factor(AdultUCI$"native-country", levels =
> levels(AdultUCI$"native-country")[-which(levels(AdultUCI$"native-country")
> %in% drop)])
> names(AdultUCI)=c("age","WorkClass","fnlwgt","Education",
> "educationNum","maritalStatus","occupation","relationship","race",
> "sex","capitalGain","capitalLoss","hoursPerWeek","nativeCountry",
> "YearIncome")
> adult<-AdultUCI[1:32561,]
> library(randomForest)
> frf=formula(YearIncome~age+WorkClass+fnlwgt+Education
> 		+educationNum+maritalStatus+occupation
> 		+relationship+race+sex+capitalGain+capitalLoss
> 		+hoursPerWeek+nativeCountry)
> 
> {RF=randomForest(data=adult
> 	,frf
> 	,ntree=200
> 	,mtry=4
> 	,keep.forest=TRUE)
> }
> pr<-predict(RF,adult[1,],predict.all=TRUE)
> "
> and get these results:
> 
> > str(pr)
> List of 2
>  $ aggregate : Factor w/ 2 levels "small","large": 1
>  $ individual: chr [1, 1:198] "small" "small" "small" "small" ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr "1"
>   .. ..$ : NULL
> > 
> Why don't I have 200 votes?
> Thanks,
> Dror


