[BioC] predFC doesn't use norm factors?

Gordon K Smyth smyth at wehi.EDU.AU
Sat Aug 31 01:51:33 CEST 2013


Dear Jenny,

On Fri, 30 Aug 2013, Zadeh, Jenny Drnevich wrote:

> Hi Gordon,
>
> I used to use predFC() to get modified log count-per-million values per 
> sample but now I'm switching to cpm().

That's good.  When we introduced cpm() a couple of years ago, we intended 
it to take over this role.

> I just realized that predFC() doesn't use the normalization factors in 
> the DGEList object when design=NULL, although it does appear to use them 
> when the design is specified (see example below).  Unlike cpm(), it has 
> no argument to specify them.  Is this a bug or the intended behavior of 
> predFC()?

It is not how I want it to work.  Really it is a carryover from old 
behavior that has been kept for historical reasons and backward 
compatibility.  Computing cpm was never the main purpose of predFC(), 
although it was used for that before cpm() existed as a separate 
function.  Our first implementation of cpm() did not use normalization 
factors.
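
To make the effect of the normalization factors concrete, here is a rough 
base-R sketch of a log-cpm calculation with and without them.  It is not 
the edgeR source: the function name log_cpm() and the way the prior count 
is scaled are assumptions made for illustration only.

```r
# Simplified sketch of log2 counts-per-million with normalization factors.
# log_cpm() is a hypothetical helper, not edgeR's cpm(); the prior-count
# scaling below is an assumption chosen for illustration.
log_cpm <- function(counts, norm.factors = rep(1, ncol(counts)),
                    prior.count = 1) {
  counts <- as.matrix(counts)
  lib.size <- colSums(counts)

  # Effective library size: raw library size times normalization factor
  eff.lib.size <- lib.size * norm.factors

  # Scale the prior count by relative library size, so larger libraries
  # receive a proportionally larger offset
  prior.scaled <- prior.count * eff.lib.size / mean(eff.lib.size)
  adj.lib.size <- eff.lib.size + 2 * prior.scaled

  # t() lets the per-sample vectors recycle along samples
  log2(t((t(counts) + prior.scaled) / adj.lib.size) * 1e6)
}

set.seed(1)
counts <- matrix(rnbinom(400, size = 10, mu = 4), nrow = 100)

lcpm_raw  <- log_cpm(counts)
lcpm_norm <- log_cpm(counts, norm.factors = c(0.9, 0.9, 1.1, 1.1))

# Differs from TRUE whenever the normalization factors change the result
all.equal(lcpm_raw, lcpm_norm)
```

With all factors equal to 1 the two calls agree; with unequal factors the 
values shift, which is the difference between the design=NULL and 
design-specified cases in your example.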

I am going to deprecate this behavior entirely and return predFC() 
exclusively to its main purpose.  For the next release cycle, predFC() 
will give a warning message when it gets a NULL design, asking users to 
switch to cpm().  In the longer term, predFC() will treat NULL design 
matrices in the same way that glmFit() does.
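
The interim warning could look roughly like the following.  This is only a 
sketch of the pattern: check_design() is a hypothetical helper, and the 
wording of the message is an assumption, not what predFC() will actually 
print.

```r
# Hypothetical helper, not part of edgeR: warn when no design is supplied.
check_design <- function(design = NULL) {
  if (is.null(design)) {
    warning("design is NULL: use cpm() to compute log counts-per-million",
            call. = FALSE)
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# Capture the warning text instead of printing it
msg <- tryCatch(check_design(NULL),
                warning = function(w) conditionMessage(w))

# A supplied design passes silently
ok <- check_design(matrix(1, nrow = 4, ncol = 1))
```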

Best wishes
Gordon

> Thanks,
> Jenny
>
>> library(edgeR)
> Loading required package: limma
>>
>> # generate counts for a two group experiment with n=2 in each group and 100 genes
>> dispersion <- 0.1
>> y1 <- matrix(rnbinom(400,size=1/dispersion,mu=4),nrow=100)
>> y1 <- DGEList(y1,group=c(1,1,2,2))
>> design <- model.matrix(~group, data=y1$samples)
>>
>> y2 <- y1
>> y2$samples$norm.factors <- c(0.9,0.9,1.1,1.1)
>>
>>
>> #estimate the predictive log fold changes
>>
>> predlfc1 <- predFC(y1,design,dispersion=dispersion,prior.count=1)
>> predlfc2 <- predFC(y2,design,dispersion=dispersion,prior.count=1)
>>
>> all.equal(predlfc1,predlfc2)
> [1] "Mean relative difference: 0.04869379"
>>
>>
>> predlfc3 <- predFC(y1,dispersion=dispersion,prior.count=1)
>> predlfc4 <- predFC(y2,dispersion=dispersion,prior.count=1)
>>
>> all.equal(predlfc3,predlfc4)
> [1] TRUE
>>
>> cpm1 <- cpm(y1,log=TRUE,prior.count=1)
>> cpm2 <- cpm(y2,log=TRUE,prior.count=1)
>>
>> all.equal(cpm1,cpm2)
> [1] "Mean relative difference: 0.007799144"
>>
>> all.equal(cpm1,predlfc3)
> [1] TRUE
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] edgeR_3.2.4  limma_3.16.7
>>
>
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> High Performance Biological Computing Program
> and The Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> NOTE NEW OFFICE LOCATION
> 2112 IGB
> 1206 W. Gregory Dr.
> Urbana, IL 61801
> USA
> ph: 217-300-6543
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
>
>
