[BioC] Multi-factor multi-level analysis of RNAseq data

Alessandro Botton [guest] guest at bioconductor.org
Thu Jun 13 14:32:19 CEST 2013

Hi all.
This is my first post here. I wasn't sure whether to post my question to this mailing list, as my background in statistics is weak, so please forgive any mistakes in advance.
A few words to describe my experimental design. I have RNA-seq data from 54 samples. My experiment deals with apple fruits collected at two time points (H = at harvest, PH = after postharvest storage) from trees of three varieties (G, F, and P) grown under three agronomic conditions (L, M, and H). I have 3 biological replicates for each combination, so 2 times x 3 varieties x 3 conditions x 3 replicates = 54.
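To make the layout concrete, here is a minimal sketch in base R of a sample table for this design. The column names (`time`, `variety`, `condition`, `replicate`) and the row order are assumptions chosen for illustration; they are not taken from the actual data files.

```r
# Hypothetical sample table for the 2 x 3 x 3 x 3 layout described above.
# Note that "H" appears both as a time point and as an agronomic condition;
# keeping them in separate columns avoids any ambiguity.
samples <- expand.grid(
  replicate = 1:3,
  condition = c("L", "M", "H"),
  variety   = c("G", "F", "P"),
  time      = c("H", "PH"),
  stringsAsFactors = FALSE
)

# Fix the level order explicitly so the first level of each factor
# is the reference level in any later model.
samples$time      <- factor(samples$time,      levels = c("H", "PH"))
samples$variety   <- factor(samples$variety,   levels = c("G", "F", "P"))
samples$condition <- factor(samples$condition, levels = c("L", "M", "H"))

# One row per library
n_samples <- nrow(samples)

# Number of replicates in each time x variety x condition cell
cell_counts <- table(samples$time, samples$variety, samples$condition)
print(n_samples)
print(cell_counts)
```

Checking that every cell holds the same number of replicates is a quick way to confirm that the design is balanced before building a model on top of it.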
Following the suggestions of some colleagues, I have decided to start using edgeR for the analysis of these data, as they told me (though they could not say how) that this package supports multi-factor, multi-level designs.
My first objective is to obtain lists of differentially expressed genes that are affected:
1) only by the genotype (variety)
2) only by the agronomic condition
3) only by storage
or by an interaction of
4) genotype x agr. condition
5) genotype x storage
6) agr. condition x storage.
I am not interested in the three-way interaction.
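The six effects listed above (three main effects plus the three two-way interactions, without the three-way term) correspond to an R model formula of the form `~ (variety + condition + time)^2`. The sketch below uses base R only and rebuilds a hypothetical sample table; the factor names are assumptions for illustration, and this is a way of inspecting the design matrix, not a validated analysis.

```r
# Rebuild the hypothetical 54-sample layout (names are illustrative).
samples <- expand.grid(
  replicate = 1:3,
  condition = c("L", "M", "H"),
  variety   = c("G", "F", "P"),
  time      = c("H", "PH"),
  stringsAsFactors = FALSE
)
samples$time      <- factor(samples$time,      levels = c("H", "PH"))
samples$variety   <- factor(samples$variety,   levels = c("G", "F", "P"))
samples$condition <- factor(samples$condition, levels = c("L", "M", "H"))

# Main effects and all two-way interactions; the ^2 operator stops
# before the three-way term variety:condition:time.
design_formula <- ~ (variety + condition + time)^2
design <- model.matrix(design_formula, data = samples)

# Map each column of the design matrix back to the model term it belongs to.
term_labels <- attr(terms(design_formula), "term.labels")
assign_idx  <- attr(design, "assign")   # 0 = intercept, 1..6 = terms
coef_groups <- split(seq_along(assign_idx),
                     c("(Intercept)", term_labels)[assign_idx + 1])

n_coef <- ncol(design)
print(n_coef)
print(sapply(coef_groups, length))
```

Each entry of `coef_groups` holds the column indices belonging to one term. Groups of columns like these are what edgeR's `glmLRT` accepts through its `coef` argument to test a whole term at once; the exact form of the call depends on the installed edgeR version.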
I have read the edgeR vignette and case studies. I have also searched the internet, but found only discussions whose statistical level was too high for me.
Could you please give me some code examples for obtaining these lists?
The first problem is probably setting up the design matrix. Can I use the same scheme as in section 3.5 of the edgeR vignette (Comparisons Both Between and Within Subjects)?
Thanks in advance to anyone who replies, and sorry for this "low level" question.

 -- output of sessionInfo(): 

R version 2.15.3 (2013-03-01)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

[1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gplots_2.11.0      MASS_7.3-23        KernSmooth_2.23-8  caTools_1.14       gdata_2.12.0       gtools_2.7.1       RColorBrewer_1.0-5 edgeR_3.0.8        limma_3.14.4       DESeq_1.10.1       lattice_0.20-13    locfit_1.5-9      
[13] Biobase_2.18.0     BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] annotate_1.36.0      AnnotationDbi_1.20.7 bitops_1.0-4.2       DBI_0.2-5            genefilter_1.40.0    geneplotter_1.36.0   IRanges_1.16.6       parallel_2.15.3      RSQLite_0.11.2       splines_2.15.3       stats4_2.15.3       
[12] survival_2.37-2      tools_2.15.3         XML_3.96-1.1         xtable_1.7-1       

Sent via the guest posting facility at bioconductor.org.
