[BioC] EdgeR norm.factor input

Gordon K Smyth smyth at wehi.EDU.AU
Tue Feb 11 02:29:27 CET 2014


edgeR always takes the total read count into account, so

   norm.factors = 1

is equivalent to total read count normalization.
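
As a rough illustration of this point, here is a sketch in base R. It is not edgeR's internal code, and the counts are invented for the example. It relies on edgeR's definition of the effective library size as lib.size * norm.factors, and it shows that with norm.factors = 1 scaling by the effective library size reproduces the formula X_ij*mean(N)/N_j from the question, whereas norm.factors = N/mean(N) would bring the library size in twice.

```r
# Hypothetical toy data: 3 genes x 4 samples (values invented for illustration).
counts <- matrix(c(10,  20,  30,  40,
                    5,  10,  15,  20,
                   85, 170, 255, 340),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4)))

N <- colSums(counts)                   # library sizes N_j
norm.factors <- rep(1, ncol(counts))   # all factors equal to 1

# Effective library size as edgeR defines it: lib.size * norm.factors
eff.lib.size <- N * norm.factors

# Total read count normalization from the question: X_ij * mean(N) / N_j
total.count.norm <- t(t(counts) * mean(N) / N)

# Scaling by the effective library size gives the same values
scaled <- t(t(counts) / eff.lib.size) * mean(N)
same <- isTRUE(all.equal(total.count.norm, scaled))

# Assumption being illustrated: supplying N/mean(N) as norm.factors would
# multiply the library size in a second time, which is not total count scaling.
double.eff <- N * (N / mean(N))
```

In edgeR itself nothing is done to the counts; the effective library sizes enter the model as offsets, so leaving norm.factors at 1 is all that is needed.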

Please read the section on normalization in the edgeR User's Guide.

Best wishes
Gordon


> Date: Mon, 10 Feb 2014 11:06:31 -0800 (PST)
> From: "Yanzhu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, mlinyzh at gmail.com
> Subject: [BioC] EdgeR norm.factor input
>
>
> Dear Gordon,
>
> Thank you so much for your comments.
>
> I have one more question, following up on the first question in my 
> previous post, where I asked how to supply the correction factors in the 
> normalization step.
>
> I would like to use the total read count normalization method to normalize 
> the data and then use edgeR to test my multi-factor models as in my 
> previous post. The total read count normalization is given as
>
> X_ij/(N_j/mean(N))=X_ij*mean(N)/N_j,
>
> where X_ij is the read count of gene i sample j, N_j is the library size 
> of sample j, and mean(N) is the mean of library sizes over all samples. 
> My question is: what is the input for y$samples$norm.factors? Can I set 
> y$samples$norm.factors = N/mean(N), where N is the vector 
> of library sizes of all samples and mean(N) is the mean of the library sizes 
> over all samples? Or could you please give me some suggestions? Thank you!
>
>
>
> Yanzhu
>
> ---------------------------------------------------
>
>> Date: Fri,  7 Feb 2014 07:25:17 -0800 (PST)
>> From: "Yanzhu [guest]" <guest at bioconductor.org>
>> To: bioconductor at r-project.org, mlinyzh at gmail.com
>> Subject: [BioC] EdgeR multi-factor testing questions
>>
>>
>> Dear Gordon,
>>
>> Thank you so much for your comments. I have updated my code and now get
>> different results for the TMM and upper quartile normalization methods.
>>
>> I have two more questions regarding normalization. I have tried
>> different normalization methods and would like to compare their
>> performance. My questions are:
>>
>> 1. In the users' guide 2.5.6, it mentions that normalization takes the
>> form of correction factors that enter into the statistical model. Such
>> correction factors are usually computed internally by edgeR functions,
>> but it is also possible for a user to supply them. I would like to supply
>> the correction factors to edgeR myself. How could I do this?
>
> Just enter in your own values:
>
>  y$samples$norm.factors <- yourvalues
>
>> 2. I would also like to compare the testing results for normalized data
>> with the results for raw data (without normalizing the data). Could I
>> just skip the normalization step as below?
>
> Yes.
>
> Gordon
>
>> group<-paste(L,S,R,sep=".")
>> design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S)
>> y<-DGEList(counts=counts,group=group)
>> #y<-calcNormFactors(y,method="upperquartile",p=0.75) ##skip this step
>>
>> y<-estimateGLMCommonDisp(y,design)
>> y<-estimateGLMTagwiseDisp(y,design)
>>
>> fiteUQ_LRS<-glmFit(y,design,offset=offset  )
>>
>> Thanks.
>>
>>
>> Yanzhu
>>
>>
>
>
> -- output of sessionInfo():
>
>>  sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] DESeq_1.12.1       lattice_0.20-15    locfit_1.5-9.1     Biobase_2.20.1     BiocGenerics_0.6.0 edgeR_3.2.4        limma_3.16.8
>
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0      AnnotationDbi_1.22.6 DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0   grid_3.0.1           IRanges_1.18.4
> [8] RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.0.1        stats4_3.0.1         survival_2.37-4      XML_3.98-1.1         xtable_1.7-1
