[BioC] edgeR: glmLRT test

James W. MacDonald jmacdon at uw.edu
Fri Mar 22 16:28:26 CET 2013


Hi KJ,

On 3/22/2013 11:17 AM, KJ Lim wrote:
> Dear Jim,
>
> Thanks for your replied.
>
> I used read.delim to read my data into R session. The target object is:
>
>    files        phenotype   area
> 1 H3TZ.txt   HS           TZ
> 2 H3SW.txt  HS           SW
> 3 H4TZ.txt   HS           TZ
> 4 H4SW.txt  HS           SW
> 5 L2TZ.txt    LS           TZ
> 6 L2SW.txt   LS           SW
> 7 L3TZ.txt    LS           TZ
> 8 L3SW.txt   LS           SW
>
> I built the factors as:
>
> > phenotype <- factor(targets$phenotype)
> > area <- factor(ts.targets$area, levels=c("TZ","SW"))
>
> I want to know do the phenotypes (HS & LS) have effect on the area. I 
> built the design matrix in each way and I found the colnames is not as 
> you have suggested.

Right. But that isn't the point I was trying to make (that the colnames 
would be something in particular). Instead, I was pointing out that the 
coefficients being estimated will be the same regardless, and the only 
difference is the order in the design matrix.

>
> > colnames(model.matrix(~phenotype+area))
>   [1] "(Intercept)" "phenotypeL"  "areaTZ"
> > colnames(model.matrix(~area+phenotype))
>   [1] "(Intercept)" "areaTZ"      "phenotypeL"
>
> Perhaps, I made mistakes in between?

No, you illustrated my point as well. The two coefficients of note here 
are areaTZ and phenotypeL, and the only difference between the design 
matrices is the order that they appear.

But you make a further statement above 'I want to know do the phenotypes 
(HS & LS) have effect on the area.'. This looks to me like you want the 
interaction term, which captures the difference in phenotype, given the 
area (or conversely, the differences in area, given phenotype).

I believe there is at least one example of an interaction term in the 
limma User's Guide that you can look at to gain understanding.

Best,

Jim


>
> Thanks for your time. Have a nice weekend.
>
> Best regards,
> KJ Lim
>
> On 22 March 2013 16:17, James W. MacDonald <jmacdon at uw.edu 
> <mailto:jmacdon at uw.edu>> wrote:
>
>     Hi KJ,
>
>     The short answer is that it doesn't matter, contingent upon you
>     having set up the experiment correctly. Have you tried
>     constructing the design matrix each way?
>
>     > phenotype <- factor(rep(1:2, each=8))
>     > area <- factor(rep(1:2, times = 8))
>     > colnames(model.matrix(~area+phenotype))
>     [1] "(Intercept)" "area2"       "phenotype2"
>     > colnames(model.matrix(~phenotype+area))
>     [1] "(Intercept)" "phenotype2"  "area2"
>
>     The interpretation of each coefficient will be the same in both
>     instances.
>
>     Best,
>
>     Jim
>
>
>
>
>     On 3/22/2013 9:57 AM, KJ Lim wrote:
>
>         Dear edgeR community,
>
>         Good day.
>
>         I have sets of RNA-Seq data of 2 phenotype (HighS, LowS)
>         plants with area
>         TZ&  SW (control).  I used exactTest to study which genes that
>         are
>
>         differential expressed in area TZ compare to SW of each phenotype.
>
>         I would like to know do the phenotypes have effect on the
>         area, in this
>         case I should use glmLRT test. Before fit into the GLM, I
>         defined my design
>         matrix as:
>
>            ~phenotype+area
>
>         Could someone please advice me regarding the design matrix?
>         Should I define
>         my design matrix as  ~area+phenotype ?
>
>         Thank you very much for your time and help.
>
>         Have a nice weekend.
>
>         Best regards,
>         KJ Lim
>
>                 [[alternative HTML version deleted]]
>
>         _______________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list