[R] Problems using rfImpute

Birgit Lemcke birgit.lemcke at systbot.uzh.ch
Mon May 5 16:07:14 CEST 2008


Thank you, James, for saving me from a huge mistake: using "NA" as a factor level.
I have now specified

na.strings="NA"

$ Sex                         : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
$ outLatTep_like_other        : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 1 2 2 ...
$ outLatTep_like_conduplicate : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 2 1 ...
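For readers following along, here is a minimal sketch of what na.strings does at import time. The inline data and the column names trait_a and trait_b are invented for illustration, and read.csv is only an assumption about how the file was read.

```r
# Hypothetical extract; the literal text NA marks a missing value.
csv_text <- "Sex,trait_a,trait_b
1,0,NA
1,NA,1
0,1,0"

# na.strings = "NA" turns the text NA into a real missing value;
# colClasses = "factor" reads every column as a factor.
dat <- read.csv(text = csv_text, na.strings = "NA", colClasses = "factor")

levels(dat$trait_a)   # only the observed codes remain as levels
colSums(is.na(dat))   # count of real missing values per column

# For contrast: with no string treated as missing, the text NA
# is kept as an ordinary factor level.
raw <- read.csv(text = csv_text, na.strings = character(0),
                colClasses = "factor")
levels(raw$trait_a)
```

The first call gives factors in which NA is a missing value rather than a level, which is what the two-level str() output above reflects.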

Now it looks like this:

FemMal85_SexImpute<-rfImpute(Sex~.,data=FemMal85_Sex)
ntree      OOB      1      2
300:  11.93% 11.49% 12.36%

Error in randomForest.default(xf, y, ntree = ntree, ..., do.trace = ntree,  :
  NA not permitted in predictors
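The message says that missing values reached the predictors passed to randomForest. The thread does not establish why, so the following is only a diagnostic sketch, not a fix. It counts missing values per column and flags columns with no observed values at all. The data frame toy and the helper na_report are hypothetical stand-ins for FemMal85_Sex.

```r
# Toy stand-in for the real data; values and column names are invented.
toy <- data.frame(
  Sex     = factor(c("1", "1", "0", "1")),
  trait_a = factor(c("0", NA, "1", "0")),
  trait_b = factor(c(NA, NA, NA, NA), levels = c("0", "1"))
)

# Hypothetical helper: one row per column with its missing-value count
# and a flag for columns that contain no observed value.
na_report <- function(df) {
  data.frame(
    column = names(df),
    n_na   = unname(colSums(is.na(df))),
    all_na = unname(colSums(!is.na(df)) == 0),
    stringsAsFactors = FALSE
  )
}

report <- na_report(toy)
print(report)
```

Running the same helper on the real data frame would show whether the response or any single predictor stands out.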


Apart from that, I now get a different error message from rfImpute, as shown above.

What's going wrong now?

Birgit

On 05.05.2008 at 15:31, James Reilly wrote:
>
> The values NA and "NA" are different. The first is treated as  
> missing; the second is not. For example,
> > table(factor(c(NA,"0","1","NA","NA")))
>
>  0  1 NA
>  1  1  2
>
> I suspect you have "NA" where you want NA, and this is causing your  
> problem.
>
> James
> -- 
> James Reilly
> Department of Statistics, University of Auckland
> Private Bag 92019, Auckland, New Zealand
>
> On 6/5/08 1:04 AM, Birgit Lemcke wrote:
>> Hello R-user!
>> I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner
>> in R and statistics.)
>> I tried rfImpute (randomForest); as far as I understand, it should
>> replace NAs using a proximity matrix:
>>  > set.seed(100000)
>>  > Subset5Imputed<-rfImpute(Sex~., data=Subset5)
>> ntree      OOB      1      2
>> 300:  11.78% 12.36% 11.21%
>> ntree      OOB      1      2
>> 300:  12.07% 12.64% 11.49%
>> ntree      OOB      1      2
>> 300:  11.49% 11.21% 11.78%
>> ntree      OOB      1      2
>> 300:  12.50% 12.93% 12.07%
>> ntree      OOB      1      2
>> 300:  12.07% 12.36% 11.78%
>>  > str(Subset5Imputed)
>> 'data.frame':    696 obs. of  24 variables:
>> $ Sex                        : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
>> $ InfSpath_caducuous         : Factor w/ 3 levels "0","1","NA": 1 1 1 1 1 1 1 1 1 1 ...
>> $ InfType_sparsely_paniculate: Factor w/ 3 levels "0","1","NA": 1 1 1 3 1 1 1 1 1 1 ...
>> But there are still NAs in the data frame. Sorry if the reason
>> is only my stupidity, and thanks in advance for answering.
>> B.
>> Birgit Lemcke
>> Institut für Systematische Botanik
>> Zollikerstrasse 107
>> CH-8008 Zürich
>> Switzerland
>> Ph: +41 (0)44 634 8351
>> birgit.lemcke at systbot.uzh.ch
>> 175 Years of UZH
>> «Marvel. Experience. Understand. Hands-on natural science.»
>> MNF anniversary event for young and old.
>> 19 April 2008, 10:00 to 02:00
>> Campus Irchel, Winterthurerstrasse 190, 8057 Zürich
>> More information: http://www.175jahre.uzh.ch/naturwissenschaft
>
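James's point about NA versus "NA" can also be applied to a factor that has already been read in. The sketch below uses a made-up factor with the same three levels as in the str() output above. It is one possible way to do the conversion, not necessarily what was done in this thread.

```r
# Made-up factor in which the text "NA" is an ordinary level.
f <- factor(c("0", "1", "NA", "0"))

is.na(f)              # "NA" is a level here, not a missing value

# Mark the "NA" entries as missing, then drop the now-unused level.
f[f == "NA"] <- NA
f <- droplevels(f)

levels(f)
is.na(f)
```

After the conversion, functions that handle missing values see these entries as missing instead of as a third category.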



