[BioC] Understanding the columns in the limma results output

Wed Aug 29 22:18:50 CEST 2012

Hi Jorge,

On 8/29/2012 4:15 PM, Jorge Miró wrote:
> Hi,
>
> My bad too :) What I really meant by fold change was the ratio
> between the groups. I thought that was what topTable reported, but
> now I think I get it. The fold change that limma prints is the
> difference between the group means on the log2 scale, not the ratio
> of the mean expression values on the linear scale.
>
> The data I'm analysing is from RMA and from PLIER so I have linear 
> values for the gene expressions summarized with PLIER. Should I 
> log2-transform them before passing them to the functions in limma?

Yes.
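
A minimal sketch of that step, assuming the PLIER summaries are held in a numeric matrix on the linear scale. The matrix name plier_linear and its values are made up for illustration; nothing here comes from the original data.

```r
# Hypothetical PLIER summaries on the linear scale (3 probesets x 4 arrays)
plier_linear <- matrix(c(120, 135, 480, 510,
                          64,  70,  16,  18,
                         900, 850, 910, 880),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(paste0("probeset", 1:3),
                                       paste0("array", 1:4)))

# log2() is undefined for values <= 0, so check before transforming
stopifnot(all(plier_linear > 0))

# This log2 matrix is what would be passed to lmFit()
plier_log2 <- log2(plier_linear)
plier_log2
```

RMA values are already on the log2 scale, so they go into lmFit() unchanged.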

>
> Also, if I now wanted to calculate the ratios between the RMA gene
> expressions, do you recommend that I transform the expressions to
> linear values before calculating the ratio, or is it OK to take the
> ratio in log2?

You don't take the ratio, you take the difference. Remember that 
log(x/y) = log(x) - log(y).
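
A small worked sketch of that identity, using made-up log2 values for a single gene (the vectors groupA_log2 and groupB_log2 are hypothetical):

```r
# Hypothetical log2 (e.g. RMA) values for one gene in two groups
groupA_log2 <- c(7.0, 7.2, 6.8)
groupB_log2 <- c(9.1, 8.9, 9.0)

# log2 fold change: the difference of the mean log2 values
log2FC <- mean(groupB_log2) - mean(groupA_log2)

# Back-transform to a ratio on the linear scale
ratio <- 2^log2FC

# The back-transformed value is the ratio of the geometric means
# of the linear-scale values
geo_mean <- function(x) exp(mean(log(x)))
ratio_geo <- geo_mean(2^groupB_log2) / geo_mean(2^groupA_log2)

# The ratio of arithmetic means of the linear values is a different
# quantity and will generally not agree exactly
ratio_arith <- mean(2^groupB_log2) / mean(2^groupA_log2)

c(log2FC = log2FC, ratio = ratio, ratio_geo = ratio_geo,
  ratio_arith = ratio_arith)
```

Here the log2 fold change is 2, which corresponds to a 4-fold ratio on the linear scale.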

Best,

Jim

>
> Best regards
> Jorge
>
> On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald
> <jmacdon at uw.edu> wrote:
>
>     Hi Jorge,
>
>     My bad. The Coef columns from write.fit() are the log2 fold
>     changes, and you do get one column per contrast; the A column is
>     the average log2 expression (AveExpr in topTable()). As an
>     example, with fake data:
>
>     > library(limma)
>     > mat <- matrix(rnorm(12e5), ncol = 12)
>     > design <- model.matrix(~0+factor(rep(1:4, each = 3)))
>     > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3)
>     > fit <- lmFit(mat, design)
>     > fit2 <- contrasts.fit(fit, contrast)
>     > fit2 <- eBayes(fit2)
>     > rslt <- decideTests(fit2)
>     > write.fit(fit2, rslt, "tmp.txt")
>     > dat <- read.table("tmp.txt", header=T)
>     > head(dat)
>       Coef.1 Coef.2 Coef.3   t.1   t.2   t.3 p.value.1 p.value.2 p.value.3    F
>     1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53   0.30621   0.66699   0.59932 0.35
>     2 -1.200 -0.599  0.524 -1.47 -0.73  0.64   0.14148   0.46259   0.52034 1.67
>     3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85   0.21627   0.06176   0.06492 1.54
>     4  0.520  0.308  1.059  0.64  0.38  1.30   0.52390   0.70573   0.19449 0.60
>     5 -0.848  0.269 -0.611 -1.04  0.33 -0.75   0.29880   0.74132   0.45410 0.81
>     6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58   0.90847   0.76396   0.56088 0.13
>       F.p.value Res.1 Res.2 Res.3
>     1   0.78697     0     0     0
>     2   0.17130     0     0     0
>     3   0.20359     0     0     0
>     4   0.61671     0     0     0
>     5   0.48698     0     0     0
>     6   0.94300     0     0     0
>
>     > nrow(mat)
>     [1] 100000
>     > nrow(dat)
>     [1] 100000
>
>     An alternative you might consider is the writeFit() function from
>     my affycoretools package, which outputs the data in a slightly
>     different format.
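
A quick way to check what the Coef columns hold, as a sketch on simulated data with three groups of three arrays. All object names and values are illustrative, and the comparison with Amean assumes (as I understand it) that write.fit() takes its A column from that component of the fit.

```r
library(limma)

set.seed(42)
# Simulated log2 expression: 100 genes x 9 arrays, three groups of three
exprs_log2 <- matrix(rnorm(900, mean = 8), ncol = 9)
group <- factor(rep(c("GroupA", "GroupB", "GroupC"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

cm <- makeContrasts(GroupB - GroupA, levels = design)
fit <- lmFit(exprs_log2, design)
fit2 <- contrasts.fit(fit, cm)

# The contrast coefficient is the difference of the group means on the
# log2 scale, i.e. the log2 fold change
manual_log2FC <- rowMeans(exprs_log2[, 4:6]) - rowMeans(exprs_log2[, 1:3])
coef_matches <- isTRUE(all.equal(unname(fit2$coefficients[, 1]),
                                 unname(manual_log2FC)))

# Amean is the average log2 expression over all arrays
a_matches <- isTRUE(all.equal(unname(fit2$Amean),
                              unname(rowMeans(exprs_log2))))

c(coef_matches = coef_matches, a_matches = a_matches)
```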
>
>     Best,
>
>     Jim
>
>
>
>
>     On 8/29/2012 2:55 AM, Jorge Miró wrote:
>
>         Hi James,
>
>         Thanks for the explanation. I do not really understand the
>         columns yet. Shouldn't the FC be printed for every comparison,
>         as is done for the Coef columns? I just get one A column.
>         Is there any way of printing the results to a file with the
>         same columns as in topTable()?
>         What I really want in the output is a p-value column, an FC
>         column and an adjusted p-value column.
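
One possible way to get such a file, sketched on simulated data: call topTable() once per contrast with number = Inf and write out the columns of interest. The data, object names and output file names below are made up for illustration, and the files go to a temporary directory.

```r
library(limma)

set.seed(1)
# Simulated log2 expression: 1000 genes x 9 arrays, three groups of three
exprs_log2 <- matrix(rnorm(9000, mean = 7), ncol = 9,
                     dimnames = list(paste0("gene", 1:1000),
                                     paste0("array", 1:9)))
group <- factor(rep(c("GroupA", "GroupB", "GroupC"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
contrast.matrix <- makeContrasts(GroupB - GroupA, GroupC - GroupA,
                                 GroupC - GroupB, levels = design)

fit <- lmFit(exprs_log2, design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# One tab-delimited file per contrast, keeping every gene in its
# original order
out_files <- character(ncol(contrast.matrix))
for (i in seq_len(ncol(contrast.matrix))) {
  tab <- topTable(fit2, coef = i, number = Inf,
                  adjust.method = "BH", sort.by = "none")
  out_files[i] <- file.path(tempdir(), paste0("limma_contrast_", i, ".txt"))
  write.table(tab[, c("logFC", "P.Value", "adj.P.Val")], out_files[i],
              sep = "\t", quote = FALSE, col.names = NA)
}
out_files
```

The logFC column is on the log2 scale; 2^logFC gives the linear-scale ratio if that is what is needed in the report.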
>
>         Best regards
>
>
>         On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald
>         <jmacdon at uw.edu> wrote:
>
>             Hi Jorge,
>
>
>             On 8/28/2012 11:20 AM, Jorge Miró wrote:
>
>                 Hi,
>
>                 I have run the commands below to get an analysis of
>                 differential expression in my Affymetrix arrays
>
>                 #Prepare the design and contrast matrices for my
>                 #comparisons of the three groups in a loop manner.
>
>                     design <- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3)))
>                     colnames(design) <- c('GroupA', 'GroupB', 'GroupC')
>                     contrast.matrix <- makeContrasts(GroupB-GroupA, GroupC-GroupA,
>                                                      GroupC-GroupB, levels=design)
>
>                 #Check design and contrast matrices
>
>                     design
>
>                    GroupA GroupB GroupC
>                 1      1      0      0
>                 2      1      0      0
>                 3      1      0      0
>                 4      0      1      0
>                 5      0      1      0
>                 6      0      1      0
>                 7      0      0      1
>                 8      0      0      1
>                 9      0      0      1
>                 attr(,"assign")
>                 [1] 1 1 1
>                 attr(,"contrasts")
>                 attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))`
>                 [1] "contr.treatment"
>
>                     contrast.matrix
>
>                          Contrasts
>                 Levels   GroupB - GroupA GroupC - GroupA GroupC - GroupB
>                    GroupA              -1              -1               0
>                    GroupB               1               0              -1
>                    GroupC               0               1               1
>
>                 #Fitting the eset to the design and contrasts
>
>                     fit<- lmFit(exprs, design)
>                     fit2<- contrasts.fit(fit, contrast.matrix)
>
>                 #Computing the statistics
>
>                     fit2<- eBayes(fit2)
>
>
>                 Then I check the results with topTable and get the
>                 following columns in the output
>
>                     topTable(fit2)
>
>                       GroupB...GroupA GroupC...GroupA GroupC...GroupB  AveExpr        F      P.Value adj.P.Val
>                 25031       2.3602203       2.4273830      0.06716267 5.021412 29.06509 7.844834e-05 0.9587773
>                 12902      -0.4572897      -0.5680943     -0.11080467 7.516681 25.41608 1.365021e-04 0.9587773
>                 7158       -0.4478660      -0.4296077      0.01825833 7.057833 23.48871 1.880100e-04 0.9587773
>                 18358      -0.1002647       0.3304903      0.43075500 7.352807 22.78417 2.125096e-04 0.9587773
>                 28768      -0.7695883      -1.3837750     -0.61418667 3.983044 22.47514 2.244612e-04 0.9587773
>                 28820      -0.1708800      -0.9939680     -0.82308800 5.470826 18.25071 5.081473e-04 0.9587773
>                 15238      -0.4850297      -0.4658157      0.01921400 7.071662 17.15191 6.440979e-04 0.9587773
>                 24681      -0.3759717      -0.3486450      0.02732667 9.281578 16.47813 7.493077e-04 0.9587773
>                 19246      -0.8675393      -0.5082140      0.35932533 8.123538 16.27776 7.845150e-04 0.9587773
>                 28808       0.2601277       0.6909140      0.43078633 4.814602 16.21283 7.963487e-04 0.9587773
>
>                 I want to export my results and write
>
>                     results <- decideTests(fit2)
>                     write.fit(fit2, results, "limma_results.txt", adjust="BH")
>
>                 Now I don't get the same columns as when using topTable,
>                 which is quite confusing. Why don't I get the FC for the
>                 comparisons between the different groups, as I do when I
>                 run topTable with the coef parameter ("topTable(fit2,
>                 coef=1)")? The columns I get are the following
>
>
>             The simple answer is that they are two different functions
>             with different goals. But note that you do get the same
>             information.
>
>
>
>                 A
>
>                 Coef.GroupB - GroupA
>                 Coef.GroupC - GroupA
>                 Coef.GroupC - GroupB
>
>                 t.GroupB - GroupA
>                 t.GroupC - GroupA
>                 t.GroupC - GroupB
>
>                 p.value.GroupB - GroupA
>                 p.value.GroupC - GroupA
>                 p.value.GroupC - GroupB
>
>                 p.value.adj.GroupB - GroupA
>                 p.value.adj.GroupC - GroupA
>                 p.value.adj.GroupC - GroupB
>
>                 F
>                 F.p.value
>
>                 Res.GroupB - GroupA
>                 Res.GroupC - GroupA
>                 Res.GroupC - GroupB
>
>
>                 Could somebody please explain what the columns A, Coef,
>                 F, F.p.value and Res mean?
>
>
>             A - are your log fold change values
>             Coef - are your coefficients (you set up a cell means model,
>             so these are the sample means)
>             F - is an F-statistic, which tests the null hypothesis that
>             none of the sample means are different
>             F.p.value - is the p-value for the F-statistic
>             Res - is the results matrix you passed into write.fit(),
>             showing which contrast(s) were significant
>
>             Best,
>
>             Jim
>
>
>
>
>
>                 #Session info
>
>                     sessionInfo()
>
>                 R version 2.15.0 (2012-03-30)
>                 Platform: i386-pc-mingw32/i386 (32-bit)
>
>                 locale:
>                 [1] LC_COLLATE=Swedish_Sweden.1252
>                     LC_CTYPE=Swedish_Sweden.1252
>                     LC_MONETARY=Swedish_Sweden.1252
>                     LC_NUMERIC=C
>                     LC_TIME=Swedish_Sweden.1252
>
>                 attached base packages:
>                 [1] stats     graphics  grDevices utils     datasets  methods   base
>
>                 other attached packages:
>                 [1] limma_3.12.1       Biobase_2.16.0     BiocGenerics_0.2.0
>
>                 loaded via a namespace (and not attached):
>                 [1] affylmGUI_1.30.0      IRanges_1.14.4        oneChannelGUI_1.22.10
>                     stats4_2.15.0         tcltk_2.15.0
>
>                 Best regards
>                 JMA
>
>
>
>
>
>
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099