[BioC] VSN: minimum number of controls?

Martin Morgan mtmorgan at fhcrc.org
Sun Apr 4 03:10:08 CEST 2010


On 04/03/2010 05:39 PM, Eric E. Snyder wrote:
> Martin Morgan wrote:
>> On 04/02/2010 02:41 PM, Eric E. Snyder wrote:
>>> In my first project with R and BioConductor, I am analyzing some small
>>> microarrays, starting with variance normalization with vsn.  Using
>>> Wolfgang Huber's VSN.pdf tutorial I was able to do the exercise with the
>>> "kidney" dataset without trouble.  However, when trying to run:
>>>
>>>>  fit = vsn2( noDNAcontrols )
>>> Error in .local(x, reference, strata, ...) :
>>>   One or more of the strata contain less than 42 elements.
>>> Please reduce the number of strata so that there is enough in each stratum.
>>
>> Always good to provide sessionInfo() so that we know the details of the
>> software you're using
> 
> Okay, my sessionInfo:
> 
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-unknown-linux-gnu
> 
> locale:
> [1] C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] vsn_3.14.0    Biobase_2.6.1
> 
> loaded via a namespace (and not attached):
> [1] affy_1.24.2          affyio_1.14.0        grid_2.10.1
> [4] lattice_0.17-26      limma_3.2.1          preprocessCore_1.8.0
> 
>> and then good to try for a reproducible example, or at least enough info
>> for others to reproduce your error. I started with example(vsn2) and then
>>
>>> vsn2(kidney[1:20,])
>> Error in vsnMatrix(exprs(x), reference, strata, ...) :
>>   One or more of the strata contain less than 42 elements.
>> Please reduce the number of strata so that there is enough in each stratum.
>>
>> My guess is that noDNAcontrols is a matrix-like object with rows and
>> columns transposed, i.e., samples x features rather than features x
>> samples. What is class(noDNAcontrols) and dim(noDNAcontrols) ? Might as
>> well copy and paste the output directly from R
> 
>> dim(noDNAcontrols)
> [1]   6 853
> 
>> noDNAcontrols
>           X1   X2   X3   X4 ... X853
> no_DNA4 9840 5193 4854 6466 ... 6121
> no_DNA5 5244 3569 3419 4587 ... 3595
> no_DNA6 4630 3271 2877 5270 ... 2729
> no_DNA3 4403 3782 3368 6004 ... 1557
> no_DNA1 3745 4984 2842 6701 ...  783
> no_DNA2 2099 4230 3165 6777 ...  756
> 
> [ellipsis provided by me and vi]
> 
> This is the data that vsn2() gags on.  Since the vsn2( kidney[1:20,] )
> example also fails, it looks like vsn2() has a pretty strict requirement
> for minimum sample size.  If so, why is that and is there any way around it?
> 
> I hope I have supplied enough information to work on now; if you need
> anything else, please ask.

I think your data are 6 samples (rows) x 853 features (columns); vsn2
expects features in rows and samples in columns, so you want to
transpose your data to get 853 features x 6 samples. If noDNAcontrols
is a matrix, then

  vsn2(t(noDNAcontrols))

Also, but secondary to getting your data oriented correctly, vsn2 has
an argument, minDataPointsPerStratum, described on the help page

  ?vsn2

which dictates the minimum number of rows (i.e., features) required in
each stratum (the 42 in the error message is its default). The vignette

  browseVignettes('vsn')

describes what a stratum is meant to refer to.
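
To make the orientation point concrete, here is a minimal sketch. The
matrix x is a simulated stand-in for noDNAcontrols (random values, not
your data), and min_rows is just a local name for the default of
minDataPointsPerStratum. The vsn2 call is guarded so the rest runs with
base R alone; I have not run it on your data.

```r
# Simulated stand-in for noDNAcontrols: 6 samples (rows) x 853 features
# (columns). Values are random and purely illustrative.
set.seed(1)
x <- matrix(rexp(6 * 853, rate = 1 / 4000), nrow = 6,
            dimnames = list(paste0("no_DNA", 1:6), paste0("X", 1:853)))

dim(x)        # 6 853: samples x features, the orientation that fails
tx <- t(x)    # t() also accepts a data.frame and returns a matrix
dim(tx)       # 853 6: features x samples, what vsn2() expects

# vsn2() compares the number of rows per stratum against
# minDataPointsPerStratum; min_rows mirrors its default here.
min_rows <- 42L
nrow(x)  >= min_rows   # FALSE: 6 rows, hence the error
nrow(tx) >= min_rows   # TRUE: 853 rows

# Fit only if the vsn package is available.
if (requireNamespace("vsn", quietly = TRUE)) {
  fit <- vsn::vsn2(tx)
}
```

The threshold can be lowered, e.g. vsn2(x, minDataPointsPerStratum = 20L),
but a fit based on very few features is unlikely to be trustworthy, so
fixing the orientation is the thing to do first.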

Martin

> Many thanks!
> 
> As for my second question concerning the behavior of rnorm(), I should
> probably simplify matters and resubmit it under a separate subject line.
> 
> Cheers,
> eesnyder


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


