[BioC] Knn and missing value

Federico Abascal fabascal at cnb.uam.es
Mon Sep 24 12:47:28 CEST 2007


Dear Claudio,

Here is a possible solution.

Fill in the NAs randomly (but with some sense, for example by drawing from
the values observed for the same gene) and classify the samples.
Repeat this "NAs-assignment + classification" procedure several times
and compare the results. If they are similar, I would say that there
is no risk of overfitting if you assign NAs with knnimpute. If they are
not similar, the question remains open.
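
A minimal sketch of the repeated "NAs-assignment + classification" check
described above, written in plain Python rather than R. Everything in it is
an assumption made for illustration: the function names (impute_randomly,
knn_predict, stability), the toy data, and the choice to fill each NA with a
value drawn from the observed training values of the same feature. In R, knn
from the class package would take the place of knn_predict.

```python
import random
from collections import Counter


def impute_randomly(rows, observed, rng):
    """Replace each None by a random draw from the observed values of that column."""
    return [[rng.choice(observed[j]) if v is None else v for j, v in enumerate(row)]
            for row in rows]


def knn_predict(train, labels, query, k):
    """Majority vote among the k training rows closest to query (squared Euclidean)."""
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], query)))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def stability(train, labels, test, k=3, repeats=20, seed=0):
    """Per test sample: share of repeats that agree with its most frequent label."""
    rng = random.Random(seed)
    n_cols = len(train[0])
    # Assumption: every column has at least one observed value in the training set.
    observed = [[r[j] for r in train if r[j] is not None] for j in range(n_cols)]
    predictions = [[] for _ in test]
    for _ in range(repeats):
        # One round of "NAs-assignment + classification".
        tr = impute_randomly(train, observed, rng)
        te = impute_randomly(test, observed, rng)
        for i, query in enumerate(te):
            predictions[i].append(knn_predict(tr, labels, query, k))
    return [Counter(p).most_common(1)[0][1] / repeats for p in predictions]


# Hypothetical toy data: rows are samples, columns are genes, None marks an NA.
train = [
    [1.0, 1.1, None],
    [0.9, None, 1.0],
    [1.2, 0.8, 1.1],
    [5.0, 5.2, None],
    [None, 4.9, 5.1],
    [5.1, 5.0, 4.8],
]
labels = ["A", "A", "A", "B", "B", "B"]
test = [[1.0, None, 0.9], [5.0, 5.1, None]]

scores = stability(train, labels, test)
print(scores)
```

A score near 1 means the predicted class of that sample hardly depends on how
the NAs were filled in. Scores near 0.5 (with two classes) mean the
classification is driven by the imputed values, and the question remains open.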

With respect to how to implement all this in R, I cannot help you very much.

Best,
Federico



claudio.is at libero.it wrote:

> Dear Bioc,
>
> I want to use the knn function, from the class package, to classify samples from a cDNA dataset in which there are some missing values.
> When I run the script, the function complains about missing values, so I looked for strategies to overcome the problem.
> One way could be to impute the missing values from the data itself, with pamr.knnimpute, but I don't like the idea, as I might overfit the data.
> On the other hand, I was looking for a way to make knn accept missing values, but I could not find one.
> Do you have any suggestions?
>
> --
> Claudio
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>


