[BioC] Affy data analysis

James W. MacDonald jmacdon at uw.edu
Wed Apr 18 15:15:27 CEST 2012


http://bioconductor.org/packages/release/bioc/vignettes/GOstats/inst/doc/GOstatsHyperG.pdf
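
For readers who want the idea behind that vignette before opening it: the test it describes is, at heart, a hypergeometric over-representation test. Below is a minimal base-R sketch of that calculation. All four counts are made-up illustration values, not numbers from this thread, and the variable names are hypothetical.

```r
# Hypothetical counts, for illustration only
universe_size <- 10000  # annotated genes represented on the array
term_size     <- 200    # genes in the universe annotated to one term
selected      <- 100    # genes called differentially expressed
overlap       <- 8      # selected genes that are annotated to that term

# P(X >= overlap) when drawing `selected` genes without replacement
p_over <- phyper(overlap - 1, term_size, universe_size - term_size,
                 selected, lower.tail = FALSE)
p_over
```

GOstats repeats this kind of test for every GO term, starting from Entrez Gene IDs, so GO IDs are not needed as input. If I recall correctly, the same hyperGTest() call also accepts a KEGGHyperGParams object (from the Category package) for KEGG pathways; please check the vignette before relying on that.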

On 4/17/2012 11:41 PM, hsharm03 at students.poly.edu wrote:
> Dear James,
> I was able to get the ENTREZ ID and GENENAME, annotate the genes using
> GO, and now have 8 GO IDs for the top genes in my analysis. How do I use
> the GO IDs for pathway analysis? I checked KEGG, but it seems to take KO
> or EC identifiers as input. Any help is much appreciated, and thank you
> for your help so far.
> Thanks,
> Himanshu Sharma.
>
> ------------------------------------------------------------------------
> From: hsharm03 at students.poly.edu
> To: jmacdon at uw.edu; bioconductor at r-project.org
> Subject: RE: [BioC] Affy data analysis
> Date: Tue, 17 Apr 2012 19:20:01 +0000
>
> Dear James,
> Thanks a lot for your help. I will try it again.
> Thanks,
> Himanshu Sharma.
>
> > Date: Tue, 17 Apr 2012 15:15:49 -0400
> > From: jmacdon at uw.edu
> > To: hsharm03 at students.poly.edu
> > CC: bioconductor at r-project.org
> > Subject: Re: [BioC] Affy data analysis
> >
> > Hi Himanshu,
> >
> > On 4/17/2012 2:54 PM, hsharm03 at students.poly.edu wrote:
> > > Dear James,
> > > I was able to get the topTable using the method from the limma manual
> > > that you pointed me to. But when I try to annotate the genes with
> > > ENTREZ, it requires IDs, and the topTable output does not have an ID
> > > column. What should I do?
> >
> > If you are using an ExpressionSet when you run lmFit(), then you should
> > automatically get the ID column in your topTable() output. If not, note
> > that the row.names for your topTable() output correspond to the rows of
> > your data, so you can get the probeset IDs that way.
> >
> > Best,
> >
> > Jim
> >
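
The row-name lookup described above can be sketched in base R as follows. This is only an illustration: the probeset IDs, the numbers, and the names probe_ids and tt are invented, and tt is a plain data frame standing in for real topTable() output.

```r
# Hypothetical probeset IDs, in the same row order as the expression data
probe_ids <- c("probeA_at", "probeB_at", "probeC_at", "probeD_at")

# Stand-in for topTable() output from a matrix without row names:
# the row names are the row indices of the original data (values invented)
tt <- data.frame(logFC     = c(2.1, -1.8, 1.5),
                 P.Value   = c(1e-4, 3e-4, 9e-4),
                 row.names = c("3", "1", "4"))

# Look up the probeset ID for each reported row
tt$ID <- probe_ids[as.integer(rownames(tt))]
tt
```

With a real ExpressionSet, featureNames(eset) would play the role of probe_ids here (eset being whatever object was passed to lmFit()).
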
> >
> > > Thanks,
> > > Himanshu Sharma.
> > >
> > > > Date: Mon, 16 Apr 2012 09:49:10 -0400
> > > > From: jmacdon at uw.edu
> > > > To: hsharm03 at students.poly.edu
> > > > CC: bioconductor at r-project.org
> > > > Subject: Re: [BioC] Affy data analysis
> > > >
> > > > Hi Himanshu Sharma,
> > > >
> > > > On 4/14/2012 6:42 PM, hsharm03 at students.poly.edu wrote:
> > > > > > Dear all,
> > > > > > I have data from affy HT430mgpm and I need to analyze the data for
> > > > > > differential expression and pathway analysis. I have 3 wildtype
> > > > > > controls (Wt neurospheres 2 and 3) for the control analysis and two
> > > > > > tumors (1509 and 1701). From the cel files, it doesn’t appear that
> > > > > > we did replicates for the tumors, just one each; the rationale at
> > > > > > the time was that we wanted to first quickly scan the tumors for
> > > > > > common signatures. Those genes that are clearly highly expressed
> > > > > > should, however, represent additional oncogenic signatures that may
> > > > > > stem from the same or related activating pathways.
> > > > > >
> > > > > > For now, my analysis of the controls should give me accurate
> > > > > > expression data for the controls. The tumors will have to be
> > > > > > compared across the samples to look for the low-hanging fruit. I am
> > > > > > not sure how to go about this, since I have 3 replicates for the
> > > > > > control but 1 each for the different tumors. What strategy should I
> > > > > > use for my analysis?
> > > >
> > > > > You can just analyze your data as indicated in the limma User's Guide.
> > > > > Note that although you only have one sample for each of the tumor
> > > > > samples, since you have three replicates for the control you end up
> > > > > with 2 degrees of freedom, so you can actually fit a model and compute
> > > > > contrasts. Here is an example using some fake data:
> > > >
> > > > > > library(limma)
> > > > > > x <- matrix(rnorm(5e5), ncol = 5)
> > > > > > design <- model.matrix(~factor(rep(1:3, c(3,1,1))))
> > > > > > fit <- lmFit(x, design)
> > > > > > fit2 <- eBayes(fit)
> > > > > > topTable(fit2, 2)
> > > > >           logFC         t      P.Value adj.P.Val         B
> > > > > 27913 -5.164721 -4.474076 7.678459e-06 0.6669534 -4.402008
> > > > > 98975  4.907831  4.251539 2.124031e-05 0.6669534 -4.421736
> > > > > 90287  4.800002  4.158128 3.209996e-05 0.6669534 -4.429717
> > > > > 41684 -4.754741 -4.118920 3.808058e-05 0.6669534 -4.433015
> > > > > 43210 -4.711426 -4.081397 4.478309e-05 0.6669534 -4.436141
> > > > > 46761  4.705393  4.076171 4.580108e-05 0.6669534 -4.436574
> > > > > 37345 -4.687702 -4.060846 4.891387e-05 0.6669534 -4.437841
> > > > > 98788  4.633203  4.013635 5.981260e-05 0.6669534 -4.441714
> > > > > 46584  4.606493  3.990496 6.595873e-05 0.6669534 -4.443596
> > > > > 72789 -4.603451 -3.987861 6.669534e-05 0.6669534 -4.443809
> > > > > > topTable(fit2, 3)
> > > > >           logFC         t      P.Value adj.P.Val         B
> > > > > 19401 -5.232576 -4.532857 5.822486e-06 0.5822486 -1.796077
> > > > > 883    4.813581  4.169892 3.048726e-05 0.8544860 -2.252617
> > > > > 87408 -4.667879 -4.043673 5.263993e-05 0.8544860 -2.402452
> > > > > 76730  4.641339  4.020682 5.805112e-05 0.8544860 -2.429249
> > > > > 50261  4.533133  3.926946 8.605996e-05 0.8544860 -2.536920
> > > > > 63980  4.502927  3.900780 9.591473e-05 0.8544860 -2.566524
> > > > > 783   -4.498102 -3.896600 9.758446e-05 0.8544860 -2.571235
> > > > > 59496 -4.441207 -3.847313 1.194575e-04 0.8544860 -2.626398
> > > > > 92491  4.427735  3.835642 1.252750e-04 0.8544860 -2.639357
> > > > > 22351 -4.420041 -3.828977 1.287163e-04 0.8544860 -2.646741
> > > >
> > > > As you can see, limma is happy to run the analysis without any
> > > > replication for two of the sample types.
> > > >
> > > > Best,
> > > >
> > > > Jim
> > > >
> > > >
> > > > > > Thanks,
> > > > > > Himanshu Sharma
> > > >
> > > > --
> > > > James W. MacDonald, M.S.
> > > > Biostatistician
> > > > University of Washington
> > > > Environmental and Occupational Health Sciences
> > > > 4225 Roosevelt Way NE, # 100
> > > > Seattle WA 98105-6099
> > > >
> > > >
> >

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


