[BioC] GLM in EdgeR

Mon Jun 16 11:39:22 CEST 2014

Good morning all,

I am just starting to analyse my first set of RNA-seq results and am trying to use edgeR to do this. I have spent time reading the vignette and manual and working through them with our data. Our experiment has two treatments (drought and control), three genotypes (clones of three trees), and four leaves sampled from each individual plant, each representing a different developmental stage. There are between four and five replicates for each genotype-treatment combination. The code we have used so far is:

> x <- read.delim("Poplar.counts.matrix",row.names="Symbol")
> targets <- read.delim(file = "targets.txt", stringsAsFactors = FALSE)
> Treatment <- factor(targets$Trt, levels=c("Con","Dr"))
> Genotype <- factor(targets$Geno, levels=c("France","Italy","Spain"))
> Leaf <- factor(targets$Lf, levels=c("Ap","L11","L7","Tag"))
> data.frame(Treatment,Genotype,Leaf)
> design <- model.matrix(~Genotype+Genotype:Treatment+Genotype:Leaf)

Does this look correct? Using this model, we don't seem to be getting the contrasts we expect (i.e. there is no "Ap" or "France"), and I think this might be because edgeR is using the first level of the Genotype and Leaf factors as the reference. Is this right? Also, is there a way to test for an overall effect of genotype within treatment (i.e. not just compare the genotypes in a pairwise fashion)?
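To make the question about reference levels concrete, here is a minimal sketch that rebuilds the same formula on a toy, fully crossed layout and lists the resulting column names. The `toy` data frame is hypothetical (one row per genotype-treatment-leaf combination); it stands in for the real targets.txt, which is not shown here. It uses only base R, so it illustrates how model.matrix() codes the factors rather than anything specific to edgeR.

```r
# Hypothetical stand-in for targets.txt: one sample per combination.
toy <- expand.grid(
  Geno = c("France", "Italy", "Spain"),
  Trt  = c("Con", "Dr"),
  Lf   = c("Ap", "L11", "L7", "Tag"),
  stringsAsFactors = FALSE
)

# Same factor definitions as in the code above; the first level of each
# factor is the one R treats as the baseline under default contrasts.
Treatment <- factor(toy$Trt,  levels = c("Con", "Dr"))
Genotype  <- factor(toy$Geno, levels = c("France", "Italy", "Spain"))
Leaf      <- factor(toy$Lf,   levels = c("Ap", "L11", "L7", "Tag"))

design <- model.matrix(~Genotype + Genotype:Treatment + Genotype:Leaf)

# Inspect which levels received their own column.
print(colnames(design))

# Levels with no column of their own are absorbed into the baseline.
has_main_france <- "GenotypeFrance" %in% colnames(design)
has_leaf_ap     <- any(grepl("LeafAp", colnames(design), fixed = TRUE))
has_trt_con     <- any(grepl("TreatmentCon", colnames(design), fixed = TRUE))
print(c(France = has_main_france, Ap = has_leaf_ap, Con = has_trt_con))
```

If the column list shows no main-effect column for France and no columns for Ap or Con, that would be consistent with those first levels acting as the baseline, with the remaining columns measured relative to it. Changing the order given in `levels=` (or using relevel()) would change which level plays that role.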

Any help you could give us would be much appreciated.

Many thanks,

Hazel 

Research and Teaching Fellow
Centre for Biological Sciences
University of Southampton
United Kingdom
Email: hazel.smith at soton.ac.uk