[R] randomForest and missing data

Torsten Hothorn Torsten.Hothorn at rzmail.uni-erlangen.de
Wed Jan 10 10:57:27 CET 2007


On Tue, 9 Jan 2007, Bálint Czúcz wrote:

> There is an improved version of the original random forest algorithm
> available in the "party" package (you can find some additional
> information on the details here:
> http://www.stat.uni-muenchen.de/sfb386/papers/dsp/paper490.pdf ).
>
> I do not know whether it solves your problem with
> missing data, but it may be worth a check...
>

Yes, `cforest()' is able to deal with missing values. More specifically,
the implementation is based on conditional inference trees (`ctree()'),
which are able to set up surrogate splits.
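
A minimal sketch of what this could look like, using the `airquality' data
that ships with R (its predictor `Solar.R' contains NAs). The choice of data
set, the seed, and the values for `ntree', `mtry' and `maxsurrogate' are
illustrative assumptions only. The sketch also assumes that `maxsurrogate' is
passed on from `cforest_unbiased()' to the tree controls, which may depend on
the installed version of party. Surrogate splits are only available for
ordered covariates, which holds here because all predictors are numeric.

```r
library(party)

# Drop rows with a missing response; rows with missing predictors are kept.
airq <- subset(airquality, !is.na(Ozone))

set.seed(290875)  # arbitrary seed, only for reproducibility

# A single conditional inference tree with up to 3 surrogate splits per node.
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))

# A forest of conditional inference trees.
# Assumption: maxsurrogate is forwarded to the underlying tree controls.
aircf <- cforest(Ozone ~ ., data = airq,
                 controls = cforest_unbiased(ntree = 50, mtry = 2,
                                             maxsurrogate = 3))

# Predict for every row, including those with a missing Solar.R value.
pred_tree   <- predict(airct, newdata = airq)
pred_forest <- predict(aircf, newdata = airq)

# Number of rows that had a missing predictor but still get a prediction.
sum(is.na(airq$Solar.R))
```

No imputation step (e.g. rfImpute) is needed before fitting or predicting
in this sketch.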

Torsten

> Best regards:
>
> Bálint
>
> On 1/4/07, Darin A. England <england at cs.umn.edu> wrote:
>>
>> Does anyone know a reason why, in principle, a call to randomForest
>> cannot accept a data frame with missing predictor values? If each
>> individual tree is built using CART, then it seems like this
>> should be possible. (I understand that one may impute missing values
>> using rfImpute or some other method, but I would like to avoid doing
>> that.)
>>
>> If this functionality were available, then when the trees are being
>> constructed and when subsequent data are put through the forest, one
>> would also specify an argument for the use of surrogate rules, just
>> like in rpart.
>>
>> I realize this question is very specific to randomForest, as opposed
>> to R in general, but any comments are appreciated. I suppose I am
>> looking for someone to say "It's not appropriate, and here's why
>> ..." or "Good idea. Please implement and post your code."
>>
>> Thanks,
>>
>> Darin England, Senior Scientist
>> Ingenix
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

