[R] missing value replacement for test data in random forest

Liaw, Andy andy_liaw at merck.com
Thu Mar 30 02:55:30 CEST 2006


The current randomForest package stores the entire proximity matrix (30000 x
30000 in your case), which is needed for imputation.  That allocation is what
fails here.  Breiman and Cutler's Fortran code stores only the nrnn largest
elements of each row.  If you really want to do it that way, use the Fortran
version.
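
To put rough numbers on the difference, here is a minimal sketch in base R.
It assumes 8-byte doubles for the proximities and a 4-byte column index for
each retained value. The value nrnn = 50 and the helper keep_top() are
made up for illustration; neither comes from the randomForest package or from
the Fortran code, whose actual storage layout may differ.

```r
# Memory for a dense n x n proximity matrix of doubles (8 bytes each).
n <- 30000
full_bytes <- n^2 * 8
full_gib <- full_bytes / 2^30

# Assumed alternative: keep only the nrnn largest proximities per row,
# each stored with a 4-byte integer column index.
nrnn <- 50  # illustrative value, not taken from this thread
sparse_bytes <- n * nrnn * (8 + 4)
sparse_mib <- sparse_bytes / 2^20

cat(sprintf("full matrix:    %.1f GiB\n", full_gib))
cat(sprintf("top-%d per row: %.1f MiB\n", nrnn, sparse_mib))

# Toy illustration of the per-row reduction (hypothetical helper):
# zero out everything except the k largest entries in each row.
keep_top <- function(prox, k) {
  t(apply(prox, 1, function(row) {
    idx <- order(row, decreasing = TRUE)[seq_len(k)]
    out <- numeric(length(row))
    out[idx] <- row[idx]
    out
  }))
}

set.seed(1)
p <- matrix(runif(25), 5, 5)
p <- (p + t(p)) / 2   # make the toy matrix symmetric
diag(p) <- 1          # a case has proximity 1 with itself
p_top <- keep_top(p, 2)
```

Under these assumptions the dense matrix is several GiB, while the per-row
storage grows only linearly in the number of cases.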

Andy

From: Zhu Ailing
> 
> Hi, 
>   I have a data set of 30000 records with 162 features. When I 
> try to fill in the missing values using rfImpute(), it fails 
> with:  
> tra.imputed <- rfImpute(tra.na[,-163],tra.na[,163],iter=5,ntree=10)
> Error in matrix(0, n, n) : cannot allocate vector of length 900000000
>  
> I wonder how to set the parameter (nrnn) for computing the 
> proximity. Thanks, iris
> 
> 	-----Original Message----- 
> 	From: Zhu Ailing 
> 	Sent: Wed 3/29/2006 11:41 AM 
> 	To: 'r-help at lists.R-project.org' 
> 	Cc: 
> 	Subject: missing value replacement for test data in 
> random forest
> 	
> 	
> 	Hi,
> 	 
> 	In R, how do I do missing value replacement for test data 
> in random forest in the way Breiman described?
> 	 
> 	thanks in advance
> 	 
> 	iris
> 