[BioC] issue when doing program.

Tue Mar 19 15:17:02 CET 2013

Hi Laya,

Please don't take questions off-list.

On 3/19/13 7:36 AM, Laya Rose wrote:
> Sir
>  I met with a problem regarding the gene name of the data that i have 
> done using R/Biocondutor. The answer that i got by doing the program is
> ##chronic stress effect on pheripheral blood monocyte(raw data from NCBI)
> > library(limma)
> > library(affy)
> > eset.rma <- justRMA(celfile.path="/media/Computational 
> Science/Intern/cel")
> > pData(eset.rma)
>               sample
> GSM194153.CEL      1
> GSM194154.CEL      2
> GSM194155.CEL      3
> GSM194156.CEL      4
> GSM194157.CEL      5
> GSM194158.CEL      6
> GSM194159.CEL      7
> GSM194160.CEL      8
> GSM194161.CEL      9
> GSM194162.CEL     10
> GSM194163.CEL     11
> GSM194164.CEL     12
> GSM194165.CEL     13
> GSM194166.CEL     14
> GSM194167.CEL     15
> GSM194168.CEL     16
> GSM194169.CEL     17
> GSM194170.CEL     18
> GSM194171.CEL     19
> GSM194172.CEL     20
> GSM194173.CEL     21
> > design <- model.matrix(~factor(rep(1:2, c(11,10))))
> > colnames(design) <- c("caregiver","caregivervscontrol")
> > colnames(design)
> [1] "caregiver"          "caregivervscontrol"
> > design
>    caregiver caregivervscontrol
> 1          1                  0
> 2          1                  0
> 3          1                  0
> 4          1                  0
> 5          1                  0
> 6          1                  0
> 7          1                  0
> 8          1                  0
> 9          1                  0
> 10         1                  0
> 11         1                  0
> 12         1                  1
> 13         1                  1
> 14         1                  1
> 15         1                  1
> 16         1                  1
> 17         1                  1
> 18         1                  1
> 19         1                  1
> 20         1                  1
> 21         1                  1
> attr(,"assign")
> [1] 0 1
> attr(,"contrasts")
> attr(,"contrasts")$`factor(rep(1:2, c(11, 10)))`
> [1] "contr.treatment"
>
> > fit <- lmFit(eset.rma, design)
> > fit <- eBayes(fit)
> > result <- topTable(fit, number=20, sort.by <http://sort.by>="B", 
> adjust="BH", p.value=0.05)

Here you are asking for all genes where any coefficient is different 
from zero. But your first coefficient estimates the mean of the 
caregiver samples, which will almost always be significantly different 
from zero, and is an uninteresting result.

Instead, you want to find genes where the caregiver vs control 
comparison is different from zero, which is your second coefficient. To 
do that you need to include coef=2 in your call to topTable().

Best,

Jim

> > result
>                      ID caregiver caregivervscontrol  AveExpr        F
> 216526_x_at 216526_x_at 13.291805      -0.1096321979 13.23960 85050.91
> 39854_r_at   39854_r_at 10.419915      -0.0372112598 10.40220 70391.77
> 38710_at       38710_at  9.192459      -0.0847969308  9.15208 65091.16
> 220960_x_at 220960_x_at 11.918361      -0.0693425839 11.88534 65000.11
> 211445_x_at 211445_x_at 11.331780      -0.0567108618 11.30477 63663.77
> 221798_x_at 221798_x_at 13.799294      -0.0051424692 13.79684 61259.71
> 201254_x_at 201254_x_at 13.120118      -0.0304201074 13.10563 60370.97
> 212869_x_at 212869_x_at 14.520910      -0.1448694710 14.45192 57874.70
> 207783_x_at 207783_x_at 13.569484      -0.1153729885 13.51454 56273.48
> 208768_x_at 208768_x_at 12.592456      -0.0001313416 12.59239 55726.09
> 206559_x_at 206559_x_at 14.147763       0.0375752743 14.16566 55517.79
> 200633_at     200633_at 13.460864      -0.0234022819 13.44972 55009.06
> 221775_x_at 221775_x_at 12.553841       0.0014125408 12.55451 54692.55
> 37278_at       37278_at  7.759139       0.0130228855  7.76534 54635.93
> 214327_x_at 214327_x_at 12.139203      -0.0623833002 12.10950 52902.17
> 212284_x_at 212284_x_at 13.952524      -0.0919405644 13.90874 52445.88
> 214351_x_at 214351_x_at 11.728144       0.0246027780 11.73986 51781.39
> 213477_x_at 213477_x_at 13.402430       0.0108312076 13.40759 51214.20
> 214317_x_at 214317_x_at 12.384605       0.0130429357 12.39082 50957.53
> 200926_at     200926_at 13.343766       0.0218536588 13.35417 50901.69
>                  P.Value    adj.P.Val
> 216526_x_at 2.959633e-42 6.594950e-38
> 39854_r_at  2.214834e-41 2.467657e-37
> 38710_at    5.094160e-41 2.874272e-37
> 220960_x_at 5.170593e-41 2.874272e-37
> 211445_x_at 6.449472e-41 2.874272e-37
> 221798_x_at 9.713637e-41 3.607483e-37
> 201254_x_at 1.134763e-40 3.612275e-37
> 212869_x_at 1.778350e-40 4.953372e-37
> 207783_x_at 2.396915e-40 5.223307e-37
> 208768_x_at 2.659599e-40 5.223307e-37
> 206559_x_at 2.767701e-40 5.223307e-37
> 200633_at   3.052481e-40 5.223307e-37
> 221775_x_at 3.245743e-40 5.223307e-37
> 37278_at    3.281708e-40 5.223307e-37
> 214327_x_at 4.624818e-40 6.870321e-37
> 212284_x_at 5.071294e-40 7.062728e-37
> 214351_x_at 5.808078e-40 7.139702e-37
> 213477_x_at 6.530111e-40 7.139702e-37
> 214317_x_at 6.888655e-40 7.139702e-37
> 200926_at   6.969475e-40 7.139702e-37
> like this. But when comparing with the gene names from the result and 
> the gene names specified in the corresponding paper 
> are entirely different. Why there is such a huge difference? i 
> enclosed the raw data with this mail. Can't able to identify the problem.
> On Fri, Mar 15, 2013 at 10:20 AM, Laya Rose <roselaya98 at gmail.com 
> <mailto:roselaya98 at gmail.com>> wrote:
>
>     Sir,
>      Thank you for the help. Got answer for my problem. Will surely
>     read limma user's guide.
>
>
>     On Thu, Mar 14, 2013 at 9:15 PM, James W. MacDonald
>     <jmacdon at uw.edu <mailto:jmacdon at uw.edu>> wrote:
>
>         Hi Laya,
>         Please don't take things off list.
>         I would highly recommend reading the limma user's guide, which
>         has examples that show exactly what you should be doing here.
>         But note that it is simple to create the design matrix using
>         the model.matrix() function:
>         design <- model.matrix(~factor(rep(1:2, c(11,10)))
>         colnames(design) <- c("caregiver","caregivervscontrol")
>         Best,
>         Jim
>
>
>         On Thu, Mar 14, 2013 at 2:17 AM, Laya Rose
>         <roselaya98 at gmail.com <mailto:roselaya98 at gmail.com>> wrote:
>
>             Sir
>              Thank you for your reply. Actually R/Bioconductor is new
>             to me. So very difficult for me to sort out the errors.
>             Because of that i can't able to move on to next step.
>             Files from GSM194164.CEL to GSM194173.CEL are the samples
>             of control. Then how can the issue of no controls in the
>             rownames of phenodata can occur.
>
>             On Thu, Mar 14, 2013 at 2:02 AM, James W. MacDonald
>             <jmacdon at uw.edu <mailto:jmacdon at uw.edu>> wrote:
>
>                 Hi Laya,
>
>
>
>                 On 3/13/13 4:40 AM, Laya Rose wrote:
>
>                     Hello,
>                        An issue was reported when I was doing the
>                     following commands.
>
>                         library(limma)
>                         library(affy)
>                         eset.rma <-
>                         justRMA(celfile.path="/media/Computational
>
>                     Science/Intern/cel")
>
>                         pData(eset.rma)
>
>                                    sample
>                     GSM194153.CEL      1
>                     GSM194154.CEL      2
>                     GSM194155.CEL      3
>                     GSM194156.CEL      4
>                     GSM194157.CEL      5
>                     GSM194158.CEL      6
>                     GSM194159.CEL      7
>                     GSM194160.CEL      8
>                     GSM194161.CEL      9
>                     GSM194162.CEL     10
>                     GSM194163.CEL     11
>                     GSM194164.CEL     12
>                     GSM194165.CEL     13
>                     GSM194166.CEL     14
>                     GSM194167.CEL     15
>                     GSM194168.CEL     16
>                     GSM194169.CEL     17
>                     GSM194170.CEL     18
>                     GSM194171.CEL     19
>                     GSM194172.CEL     20
>                     GSM194173.CEL     21
>
>                         a<-rep(0,length(pData(eset.rma)$sample))
>                         a[grep("control", rownames(pData(eset.rma)),
>                         ignore.case=T)] <-1
>                         a
>
>                       [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
>
>                 There are no controls in the row.names of your
>                 phenoData, so you are getting all zeros. You need to
>                 fix this step so you have a 1 for each control.
>
>                 Best,
>
>                 Jim
>
>
>                         design <- cbind(caregiver=1, controlVscaregiver=a)
>                         design
>
>                            caregiver controlVscaregiver
>                       [1,]         1          0
>                       [2,]         1          0
>                       [3,]         1          0
>                       [4,]         1          0
>                       [5,]         1          0
>                       [6,]         1          0
>                       [7,]         1          0
>                       [8,]         1          0
>                       [9,]         1          0
>                     [10,]         1          0
>                     [11,]         1          0
>                     [12,]         1          0
>                     [13,]         1          0
>                     [14,]         1          0
>                     [15,]         1          0
>                     [16,]         1          0
>                     [17,]         1          0
>                     [18,]         1          0
>                     [19,]         1          0
>                     [20,]         1          0
>                     [21,]         1          0
>
>                         fit <- lmFit(eset.rma, design)
>
>                     Coefficients not estimable: controlVscaregiver
>                     Warning message:
>                     Partial NA coefficients for 22283 probe(s)
>
>                     Is this issue has reported because of the
>                     coefficients i have selected.
>                     I don't know how to resolve this issue. Thanks in
>                     advance for any help.
>
>                     -
>                     *Laya Rose*
>
>                             [[alternative HTML version deleted]]
>
>                     _______________________________________________
>                     Bioconductor mailing list
>                     Bioconductor at r-project.org
>                     <mailto:Bioconductor at r-project.org>
>                     https://stat.ethz.ch/mailman/listinfo/bioconductor
>                     Search the archives:
>                     http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
>
>
>             -- 
>             *Laya Rose*
>
>
>
>
>
>     -- 
>     *Laya Rose*
>
>
>
>
> -- 
> *Laya Rose*
>