[BioC] limma for finding differentialy expressed genes of several groups

Fabian Grammes Fabian.Grammes at umb.no
Fri Oct 19 12:58:54 CEST 2012


The error is how you define your Group factors:
> Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4"),
> levels =
> c("GSM362180","GSM362181","GSM362188","GSM362189","GSM362192","GSM362193","GSM362194","GSM362197","GSM362198"))

will return a NA vector; using something like this should make your code work...
Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4"),
                       levels ="p1","p2", "p3","p4"))


Message: 1
Date: Thu, 18 Oct 2012 07:25:42 -0400
From: Sean Davis <sdavis2 at mail.nih.gov>
To: "priya [guest]" <guest at bioconductor.org>
Cc: bioconductor at r-project.org, reddy.dhivyaa at gmail.com
Subject: Re: [BioC] limma for finding differentialy expressed genes of
        several groups
Message-ID:
        <CANeAVBkRXAib9Q9ToHK37mu8R5t=fCOUN7neSVFsd9z_dbf-JA at mail.gmail.com>
Content-Type: text/plain

On Thu, Oct 18, 2012 at 4:24 AM, priya [guest] <guest at bioconductor.org>wrote:

>
> I would like to find the differentially expressed genes for several
> variables using the limma package for several groups.
> I have the rma normalized matrix in the following format :
>
>
> ID_REF GSM362180    GSM362181  GSM362188    GSM362189  GSM362192
> 244901 5.094871713 4.626623079 4.554272515 4.748604391 4.759221647
> 244902 5.194528083 4.985930299 4.817426064 5.151654407 4.838741605
> 244903 5.412329253 5.352970877 5.06250609  5.305709079 8.365082403
> 244904 5.529220594 5.28134657  5.467445095 5.62968933  5.458388909
> 244905 5.024052699 4.714631878 4.792865831 4.843975286 4.657188246
> 244906 5.786557533 5.242403911 5.060605782 5.458148567 5.890061836
>
> where the different columns correspond to four different types of
> promoters and each of the four promoters has a biological replicate so
> totally there are 8 columns.There are totally 22810 genes and I would like
> to get a list of the genes which are differentially expressed
>
> I tried using the Limma package to find the differentially expressed genes
> across several promoters ( with replicates).
> This is the code that I used:
>
> Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4"),
> levels =
> c("GSM362180","GSM362181","GSM362188","GSM362189","GSM362192","GSM362193","GSM362194","GSM362197","GSM362198"))
>
> design <- model.matrix(~0 + Group)
>
> colnames(design) <-
> c("GSM362180","GSM362181","GSM362188","GSM362189","GSM362192","GSM362193","GSM362194","GSM362197")
> fit <- lmFit(modified, design)
>
> where modified is the rma normalized data matrix as inputted in the above
> format.
> I get the following error:
>
> Coefficients not estimable: GSM362180 GSM362181 GSM362188 GSM362189
> GSM362192 GSM362193 GSM362194 GSM362197 GSM362198
> Error in lm.fit(design, t(M)) : 0 (non-NA) cases
>
>
> I managed to get help from the mailing list prior to this and was able to
> correct it in the following way.
>
>
>  -- output of sessionInfo():
>
> Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4"))
> design <- model.matrix(~0+Group)
> colnames(design) <- gsub("Group","", colnames(design))
>
> For creating the contrast matrix I proceeded as :
> fit<-lmFit(modified,design)
> fit<-ebayes(fit)
> fit<-lmFit(modified,design)
>
> contrast.matrix<-makeContrasts(p1-p2,p1-p3,p1-p4,p2-p3,p2-p4,p3-p4,levels=design)
> fit2<-contrasts.fit(fit,contrast.matrix)
> fit2<-eBayes(fit2)
> topTable(fit2,coef=1,adjust="fdr")
> logFC   AveExpr         t      P.Value    adj.P.Val         B
> 14865 -3.063442 11.939646 -20.85957 5.020817e-09 8.235097e-05 10.906936
> 15107 -3.316203 13.136888 -19.79194 8.041764e-09 8.235097e-05 10.543106
> 12037  2.806403 10.772050  19.10380 1.103823e-08 8.235097e-05 10.292274
> 15931 -3.469330 10.325303 -18.53793 1.444120e-08 8.235097e-05 10.075671
> 18327  3.198993  9.633795  17.57118 2.328424e-08 8.331092e-05  9.682365
> 7521  -2.419999  7.373064 -17.16080 2.873576e-08 8.331092e-05  9.505924
> 16564  3.268568  8.365454  17.09028 2.980775e-08 8.331092e-05  9.475007
> 3832  -2.685268  7.540418 -16.89167 3.307237e-08 8.331092e-05  9.386966
> 10364  2.466369  6.779762  16.71021 3.640344e-08 8.331092e-05  9.305265
> 4967  -2.453614 11.409188 -16.62282 3.813877e-08 8.331092e-05  9.265479
> o<-order(fit2$F.p.value)
> fit2$genes[o[1:30],]
>
>
> After the above step I get as NULL. I do not know where am making the
> mistake.
>
>
> clas <- decideTests(fit2, method = "nestedF",
> +                     adjust.method = "fdr", p = 0.05)
>
>
> I get the following output which I know is quite wrong :
> Contrasts
>        p1 - p2 p1 - p3 p1 - p4 p2 - p3 p2 - p4 p3 - p4
>   [1,]       0       0       0       0       0       0
>   [2,]       0       0       0       0       0       0
>   [3,]       0       0       0       0       0       0
>   [4,]       0       0       0       0       0       0
>   [5,]       0       0       0       0       0       0
>   [6,]       0       0       0       0       0       0
>

Hi, Priya.

Nice job moving forward with your analysis.

The output above looks fine to me.  If you read the help for decideTests,
you will notice that the output is 0/1 where a 1 signifies that a given
probe was "significant" for a given contrast.  What makes you think your
results are wrong?

Sean

        [[alternative HTML version deleted]]




More information about the Bioconductor mailing list