[R] Problems using rfImpute

James Reilly reilly at stat.auckland.ac.nz
Mon May 5 15:31:21 CEST 2008


The values NA and "NA" are different. The first is treated as missing; 
the second is not. For example,
 > table(factor(c(NA,"0","1","NA","NA")))

  0  1 NA
  1  1  2

I suspect you have "NA" where you want NA, and this is causing your problem.
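
If that is the case, one possible fix is to convert the string "NA" into a real missing value before calling rfImpute. The sketch below is only an illustration: the data frame `toy` and the helper `na_string_to_missing` are made-up names, not part of your data or of randomForest, and it assumes the affected columns are factors, as in your str() output.

```r
# Hypothetical stand-in for Subset5: "NA" is stored as an ordinary string level.
toy <- data.frame(
  Sex     = factor(c("0", "1", "1", "0")),
  InfType = factor(c("0", "NA", "1", "NA"))
)

# Illustrative helper: turn the literal string "NA" into a real missing value.
na_string_to_missing <- function(col) {
  if (is.factor(col)) {
    col[col %in% "NA"] <- NA   # %in% never returns NA, so the index is safe
    col <- factor(col)         # rebuild the factor to drop the unused "NA" level
  }
  col
}

fixed <- toy
fixed[] <- lapply(toy, na_string_to_missing)

levels(fixed$InfType)      # the "NA" level should be gone
sum(is.na(fixed$InfType))  # these entries are now truly missing
```

After the same conversion on your own data, rfImpute(Sex ~ ., data = ...) should see the missing entries and impute them, provided the response Sex itself contains no NAs.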

James
-- 
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand

On 6/5/08 1:04 AM, Birgit Lemcke wrote:
> Hello R-user!
> 
> I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner in R 
> and statistics.)
> 
> I tried rfImpute (randomForest) and, as far as I understand, it should 
> replace NAs using a proximity matrix:
> 
>  > set.seed(100000)
>  > Subset5Imputed<-rfImpute(Sex~., data=Subset5)
> ntree      OOB      1      2
> 300:  11.78% 12.36% 11.21%
> ntree      OOB      1      2
> 300:  12.07% 12.64% 11.49%
> ntree      OOB      1      2
> 300:  11.49% 11.21% 11.78%
> ntree      OOB      1      2
> 300:  12.50% 12.93% 12.07%
> ntree      OOB      1      2
> 300:  12.07% 12.36% 11.78%
>  > str(Subset5Imputed)
> 
> 'data.frame':    696 obs. of  24 variables:
> $ Sex                        : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
> $ InfSpath_caducuous         : Factor w/ 3 levels "0","1","NA": 1 1 1 1 1 1 1 1 1 1 ...
> $ InfType_sparsely_paniculate: Factor w/ 3 levels "0","1","NA": 1 1 1 3 1 1 1 1 1 1 ...
> 
> But there are still NAs in the data frame. Sorry if the reason is only 
> my stupidity, and thanks in advance for answering.
> 
> B.
> 
> 
> Birgit Lemcke
> Institut für Systematische Botanik
> Zollikerstrasse 107
> CH-8008 Zürich
> Switzerland
> Ph: +41 (0)44 634 8351
> birgit.lemcke at systbot.uzh.ch