[BioC] Best way to remove noise from dataset.

Arnar Flatberg arnar.flatberg at gmail.com
Fri Jan 10 15:37:27 CET 2014


Hi Chris,

On Fri, Jan 10, 2014 at 12:59 PM, Fenton Christopher Graham
<christopher.fenton at uit.no> wrote:
> Looking at the data I can see via princomp that pc1 is mostly noise, pc2 and pc3 are much more relevant.
> What is the best way to remove the effects of pc1 from the dataset.
>

If PC1 really is nothing but noise ;-), just subtract it:

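# prcomp centers the columns by default, so the column means stay in new.data
mod <- prcomp(data)
# subtract the rank-1 reconstruction of PC1 (scores times loadings)
new.data <- data - mod$x[,1] %*% t(mod$rotation[,1])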
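
To see what that subtraction does, here is a minimal sketch on simulated
data. The data, the sizes n and p, and the names noise.dir, pc1.part,
resid.scores, max.resid and mean.shift are made up for illustration; only
prcomp() and its $x and $rotation components come from the lines above.
It checks that the cleaned data has nothing left along the old PC1
direction and that the column means are untouched.

```r
# Simulated data (hypothetical): 50 observations, 6 variables,
# with one strong shared "noise" direction added to random signal.
set.seed(1)
n <- 50; p <- 6
noise.dir <- rep(1, p) / sqrt(p)
data <- matrix(rnorm(n * p), n, p) + (rnorm(n, sd = 5) %*% t(noise.dir))

mod <- prcomp(data)   # centers columns by default, no scaling

# Rank-1 reconstruction of PC1: scores (n x 1) times loadings (1 x p)
pc1.part <- mod$x[, 1] %*% t(mod$rotation[, 1])
new.data <- data - pc1.part

# Check 1: nothing is left along the old PC1 direction
# (center new.data first, because prcomp worked on centered data)
centered <- scale(new.data, center = TRUE, scale = FALSE)
resid.scores <- centered %*% mod$rotation[, 1]
max.resid <- max(abs(resid.scores))

# Check 2: the column means are unchanged by the subtraction
mean.shift <- max(abs(colMeans(new.data) - colMeans(data)))
```

Both max.resid and mean.shift should be zero up to floating point error.
Note that prcomp() treats rows as observations, so if your matrix is
stored the other way round you would need to transpose it first.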

If there may be some interesting variance in PC1, e.g. variance related
to your targets/response values (if you have some), then look into the
surrogate variable analysis package (sva) and related material.
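
One rough way to judge whether PC1 carries anything related to a
response before throwing it away is to correlate the PC1 scores with
that response. The sketch below uses simulated data and a made-up
two-group response y; the names pc1.cor and pc1.var are also just for
illustration. It is a quick sanity check in base R, not a substitute
for sva.

```r
# Hypothetical data: 'y' is a two-group response, and part of the
# dominant direction in 'data' is driven by it.
set.seed(2)
n <- 40; p <- 5
y <- rep(c(0, 1), each = n / 2)
data <- matrix(rnorm(n * p), n, p)
data[, 1:3] <- data[, 1:3] + 4 * y   # group effect on three variables

mod <- prcomp(data)

# Absolute correlation between PC1 scores and the response;
# the sign of a principal component is arbitrary, hence abs()
pc1.cor <- abs(cor(mod$x[, 1], y))

# Fraction of total variance carried by PC1
pc1.var <- mod$sdev[1]^2 / sum(mod$sdev^2)
```

A pc1.cor close to 1 would suggest that subtracting PC1 also removes
the effect you are interested in, which is the situation where sva is
worth a look.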


Arnar


