[BioC] Understanding the columns in the limma results output
Jorge Miró
jorgma86 at gmail.com
Wed Aug 29 22:16:14 CEST 2012
Hi,
My bad too :) What I really meant with fold change was the fold ratio
change between the groups. I though that was what was in topTable but
now I think I get it. The fold change that is printed in limma is the
difference between the means in log2 and not the ratio of the mean of
the gene expressions in their linear form.
The data I'm analysing is from RMA and from PLIER so I have linear
values for the gene expressions summarized with PLIER. Should I
log2-transform them before passing them to the functions in limma?
Also, if I now wanted to calculated the ratios between the RMA gene
expressions, do you recommend that I transform the expressions to
linear values before calculating the ratio or is it OK to take the
ratio in log2?
Best regards
Jorge
On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
>
> Hi Jorge,
>
> My bad. The coefficients for write.fit() are the fold changes, and you do get a column for each. As an example, with fake data.
>
> > library(limma)
> > mat <- matrix(rnorm(12e5), ncol = 12)
> > design <- model.matrix(~0+factor(rep(1:4, each = 3)))
> > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3)
> > fit2 <- contrasts.fit(fit, contrast)
> > fit2 <- eBayes(fit2)
> > rslt <- decideTests(fit2)
> > write.fit(fit2, rslt, "tmp.txt")
> > dat <- read.table("tmp.txt", header=T)
> > head(dat)
> Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 p.value.3 F
> 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 0.59932 0.35
> 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 0.52034 1.67
> 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 0.06492 1.54
> 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 0.19449 0.60
> 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 0.45410 0.81
> 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 0.56088 0.13
> F.p.value Res.1 Res.2 Res.3
> 1 0.78697 0 0 0
> 2 0.17130 0 0 0
> 3 0.20359 0 0 0
> 4 0.61671 0 0 0
> 5 0.48698 0 0 0
> 6 0.94300 0 0 0
>
> > nrow(mat)
> [1] 100000
> > nrow(dat)
> [1] 100000
>
> An alternative you might consider is the writeFit() function from my affycoretools package, which outputs the data in a slightly different format.
>
> Best,
>
> Jim
>
>
>
>
> On 8/29/2012 2:55 AM, Jorge Miró wrote:
>>
>> Hi James,
>>
>> thanks for the explanation. I do not really understand the columns yet. Shouldn't the FC be printed for every comparison as is done for the Coef-columns? I just get one A-column.
>> Is there any way of printing the results to a file with the same columns as in topTable()?
>> What I'm really want to have in the output is a p-value column, an FC column and an adjusted p-value column.
>>
>> Best regards
>>
>>
>> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon at uw.edu <mailto:jmacdon at uw.edu>> wrote:
>>
>> Hi Jorge,
>>
>>
>> On 8/28/2012 11:20 AM, Jorge Miró wrote:
>>
>> Hi,
>>
>> I have run the commands below to get an analysis of differential
>> expressions in my Affymetrix arrays
>>
>> #Prepare the design and contrast matrices for my comparisons
>> of the three
>> groups in a loop-manner.
>>
>> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3)))
>> colnames(design)<- c('GroupA', 'GroupB', 'GroupC')
>> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA,
>>
>> GroupC-GroupB, levels=design)
>>
>> #Check design and contrast matrices
>>
>> design
>>
>> GroupA GroupB GroupC
>> 1 1 0 0
>> 2 1 0 0
>> 3 1 0 0
>> 4 0 1 0
>> 5 0 1 0
>> 6 0 1 0
>> 7 0 0 1
>> 8 0 0 1
>> 9 0 0 1
>> attr(,"assign")
>> [1] 1 1 1
>> attr(,"contrasts")
>> attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))`
>> [1] "contr.treatment"
>>
>> contrast.matrix
>>
>> Contrasts
>> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB
>> GroupA -1 -1 0
>> GroupB 1 0 -1
>> GroupC 0 1 1
>>
>> #Fitting the eset to to the design and contrast
>>
>> fit<- lmFit(exprs, design)
>> fit2<- contrasts.fit(fit, contrast.matrix)
>>
>> #Computing the statistics
>>
>> fit2<- eBayes(fit2)
>>
>>
>> Then I check the results with topTable and get the following
>> columns in the
>> output
>>
>> topTable(fit2)
>>
>> GroupB...GroupA GroupC...GroupA GroupC...GroupB
>> AveExpr F
>> P.Value adj.P.Val
>> 25031 2.3602203 2.4273830 0.06716267 5.021412
>> 29.06509
>> 7.844834e-05 0.9587773
>> 12902 -0.4572897 -0.5680943 -0.11080467 7.516681
>> 25.41608
>> 1.365021e-04 0.9587773
>> 7158 -0.4478660 -0.4296077 0.01825833 7.057833
>> 23.48871
>> 1.880100e-04 0.9587773
>> 18358 -0.1002647 0.3304903 0.43075500 7.352807
>> 22.78417
>> 2.125096e-04 0.9587773
>> 28768 -0.7695883 -1.3837750 -0.61418667 3.983044
>> 22.47514
>> 2.244612e-04 0.9587773
>> 28820 -0.1708800 -0.9939680 -0.82308800 5.470826
>> 18.25071
>> 5.081473e-04 0.9587773
>> 15238 -0.4850297 -0.4658157 0.01921400 7.071662
>> 17.15191
>> 6.440979e-04 0.9587773
>> 24681 -0.3759717 -0.3486450 0.02732667 9.281578
>> 16.47813
>> 7.493077e-04 0.9587773
>> 19246 -0.8675393 -0.5082140 0.35932533 8.123538
>> 16.27776
>> 7.845150e-04 0.9587773
>> 28808 0.2601277 0.6909140 0.43078633 4.814602
>> 16.21283
>> 7.963487e-04 0.9587773
>>
>> I want to export my results and write
>>
>> results<- decideTests(fit2)
>> write.fit(fit2, results, "limma_results.txt", adjust="BH")
>>
>> Now don't get the same columns as when using topTable which is
>> quite
>> confusing. Why don't I get the FC for the comparisons between
>> the different
>> groups as if I run topTable with the coef parameter (
>> "topTable(fit2,
>> coef=1)" )? The columns I get are the following
>>
>>
>> The simple answer is that they are two different functions with
>> different goals. But note that you do get the same information.
>>
>>
>>
>> A
>>
>> Coef.GroupB - GroupA
>> Coef.GroupC - GroupA
>> Coef.GroupC - GroupB
>>
>> t.GroupB - GroupA
>> t.GroupC - GroupA
>> t.GroupC - GroupB
>>
>> p.value.GroupB - GroupA
>> p.value.GroupC - GroupA
>> p.value.GroupC - GroupB
>>
>> p.value.adj.GroupB - GroupA
>> p.value.adj.GroupC - GroupA
>> p.value.adj.GroupC - GroupB
>>
>> F
>> F.p.value
>>
>> Res.GroupB - GroupA
>> Res.GroupC - GroupA
>> Res.GroupC - GroupB
>>
>>
>> Could some body please try to explain what do the columns A,
>> Coef, F,
>> F.p.value and Res mean?
>>
>>
>> A - are your log fold change values
>> Coef - are your coefficients (you set up a cell means model, so
>> these are the sample means)
>> F - is an F-statistic, which tests the null hypothesis that none
>> of the sample means are different
>> F.p.value - is the p-value for the F-statistic
>> Res - is the results matrix you passed into write.fit(), showing
>> which contrast(s) were significant
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>
>>
>> #Session info
>>
>> sessionInfo()
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252
>> LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
>> LC_TIME=Swedish_Sweden.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods
>> base
>>
>> other attached packages:
>> [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0
>>
>> loaded via a namespace (and not attached):
>> [1] affylmGUI_1.30.0 IRanges_1.14.4 oneChannelGUI_1.22.10
>> stats4_2.15.0 tcltk_2.15.0
>> Best regards
>> JMA
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> -- James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
More information about the Bioconductor
mailing list