[BioC] pamr Error: each class must have >1 sample

Kasper Daniel Hansen k.hansen at biostat.ku.dk
Wed Jul 28 22:17:17 CEST 2004


Dick Beyer <dbeyer at u.washington.edu> writes:

> I am having trouble with pamr.train and subsequently pamr.cv.
>
> In the pamr documentation, the following works:
>
>      set.seed(120)
>      x <- matrix(rnorm(1000*20),ncol=20)
>      y <- sample(c(1:4),size=20,replace=TRUE)
>      mydata <- list(x=x,y=y)
>      mytrain <-   pamr.train(mydata)
>      mycv <- pamr.cv(mytrain,mydata)
>
> But if you change the seed, it doesn't:
>
>      set.seed(1123)
>      x <- matrix(rnorm(1000*20),ncol=20)
>      y <- sample(c(1:4),size=20,replace=TRUE)
>      mydata <- list(x=x,y=y)
>      mytrain <-   pamr.train(mydata)
> Error in nsc(data$x[gene.subset, sample.subset], y = y, proby = proby,  : 
>         Error: each class must have >1 sample
>
> There is discussion in the documents (http://www-stat.stanford.edu/~tibs/PAM/Rdist/doc/readme.html) about "fragile" functions, but I have not been able to understand how to make this error go away.  If anyone has had this problem or has some advice, I would be eternally grateful.

If you look at the y vector, you will notice it looks like this:
> table(y)
y
1 2 3 4
1 6 5 8

Hence there is only 1 sample with a class of "1". Of course this can
happen when you draw only 20 labels at random from a set of 4 values.
From the error message it seems that the method requires at least two
samples from every class.
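
A quick way to spot the problem before calling pamr.train is to tabulate the labels and list any class with fewer than two samples. The sketch below uses only base R. The label vector is built by hand to match the counts in the table above (an assumption for illustration, since the exact draw depends on the seed and R version), and `small` is a hypothetical variable name.

```r
# Labels constructed to match the table above: 1, 6, 5 and 8 samples
# in classes 1 to 4 (hand-built for illustration, not a random draw).
y <- rep(1:4, times = c(1, 6, 5, 8))

# Count samples per class.
counts <- table(y)

# Names of the classes that have fewer than two samples.
small <- names(counts)[as.vector(counts) < 2]

if (length(small) > 0) {
  message("Classes with fewer than 2 samples: ",
          paste(small, collapse = ", "))
}
```

If `small` is non-empty, the data would presumably trigger the same error in pamr.train.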

Possible solutions (quick solutions, I am not too familiar with pamr):
- increase the sample size, so that a class with only one sample is
  very unlikely.
- fit the data, disregarding the single sample and using only 3
  classes.
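
The second option might be sketched as follows, using base R only. It assumes the list layout from the quoted example (x with one column per sample, y with one label per sample). The seed, the hand-built label vector and the names `keep` and `mydata2` are illustrative assumptions, and the final pamr.train call is left as a comment because it needs the pamr package.

```r
set.seed(1)  # arbitrary seed, used only for the simulated values in x
x <- matrix(rnorm(1000 * 20), ncol = 20)

# Labels built by hand to match the table above (one sample in class 1).
y <- rep(1:4, times = c(1, 6, 5, 8))

# For every sample, look up the size of its class and keep the sample
# only if that class has at least two members.
counts <- table(y)
keep <- as.vector(counts[as.character(y)]) >= 2

# Subset the columns of x (samples) and the labels together.
mydata2 <- list(x = x[, keep, drop = FALSE], y = y[keep])

# With pamr installed, the reduced data could then be fitted as in the
# original example:
# mytrain <- pamr.train(mydata2)
```

This drops the lone class-1 sample, leaving 19 samples in 3 classes. Whether discarding a class is acceptable depends on the analysis. For real data, collecting more samples for the rare class is usually the better fix.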

/Kasper

-- 
Kasper Daniel Hansen, Research Assistant
Department of Biostatistics, University of Copenhagen


