[BioC] Which genes are in the GO count column?

James W. MacDonald jmacdon at med.umich.edu
Tue Aug 21 15:56:15 CEST 2007


Hi Ingrid,

Ingrid H. G. Østensen wrote:
> Hi
> 
> Thanks for the tips, but I am still a bit lost. This is what I have done:
> 
> 
>  # Lots of QC and limma things. :-)
>  
>   probe <- top2[,1] #top2 is from topTable
> 
>   sigLL <- unique(unlist(mget(probe, env=illuminaHumanv2ENTREZID, ifnotfound=NA)))

It appears that unique() strips off the names, so you should probably 
substitute something like this:

sigLL <- unlist(mget(probe, illuminaHumanv2ENTREZID))
sigLL <- sigLL[!duplicated(sigLL)]
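
As a minimal sketch of the difference, here is the same idea on a toy
environment standing in for illuminaHumanv2ENTREZID. The probe names and
Entrez IDs are made up for illustration, so treat this as a sketch rather
than output from the real annotation package. Note that as.character()
also drops names, so in this sketch the NA filter is done by plain
subsetting instead.

```r
# Toy stand-in for illuminaHumanv2ENTREZID (hypothetical probes and IDs).
toyENTREZID <- list2env(list(probe_a = "1017",
                             probe_b = "7157",
                             probe_c = "1017",          # same gene as probe_a
                             probe_d = NA_character_))  # probe with no Entrez ID
probe <- c("probe_a", "probe_b", "probe_c", "probe_d")

# unlist(mget()) returns a vector named by probe ID.
ids <- unlist(mget(probe, envir = toyENTREZID, ifnotfound = NA))

# unique() drops the names ...
names(unique(ids))              # NULL

# ... whereas subsetting with !duplicated() keeps them.
sigLL <- ids[!duplicated(ids)]
sigLL <- sigLL[!is.na(sigLL)]   # plain subsetting keeps names too
names(sigLL)                    # "probe_a" "probe_b"

# as.character() would strip the names again.
is.null(names(as.character(sigLL)))   # TRUE
```

A vector built this way is what geneIds expects if probeSetSummary() is
to report the contributing probes.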

>   sigLL <- as.character(sigLL[!is.na(sigLL)])
>      
>   params <- new("GOHyperGParams", geneIds=sigLL,
>                 annotation="illuminaHumanv2", ontology="CC",
>                 pvalueCutoff=0.05, conditional=FALSE, testDirection="under")
>   hgOver <- hyperGTest(params)
>   res_filNavn <- paste(1, "_GO_summary_CC_under.html", sep = "")
>   htmlReport(hgOver,file=res_filNavn)
> 
> 
>   summary(hgOver)
>                  GOCCID      Pvalue OddsRatio   ExpCount Count  Size Term
>   GO:0044422 GO:0044422 0.002652398 0.6469686  65.993365    46  2345 organelle part
>   GO:0044446 GO:0044446 0.002652398 0.6469686  65.993365    46  2345 intracellular organelle part
>   GO:0005634 GO:0005634 0.002722435 0.7098244 112.259076    88  3989 nucleus
>   GO:0030529 GO:0030529 0.002771273 0.2913123  13.114246     4   466 ribonucleoprotein complex
>   GO:0044428 GO:0044428 0.008533045 0.5194381  23.920836    13   850 nuclear part
>   GO:0005840 GO:0005840 0.028727650 0.2791575   6.922971     2   246 ribosome
>   GO:0005623 GO:0005623 0.037820314 0.7152750 340.717129   331 12107 cell
>   GO:0044464 GO:0044464 0.038306007 0.7160755 340.688987   331 12106 cell part
>   GO:0031981 GO:0031981 0.046628778 0.5403759  14.352502     8   510 nuclear lumen
> 
> 
>    # Find the IDs behind the Count column
>    probeSetSummary(hgOver)
>   
>    # This gives me all the genes (some Entrez IDs are duplicated because 
> of their linkage to different probes), but I get a
>    warning message:
>   
>    Warning message:
>    The vector of geneIds used to create the GOHyperGParams object was 
> not a named vector.
>    If you want to know the probesets that contributed to this result
>    you need to pass a named vector for geneIds. 
>  
> 
> 
> 
> I have tried to make a named vector, but apparently I do not understand 
> what it is. How can I make it work?
> And how can I get the probeSetSummary output into a file? Any suggestions?

Sure. As I mentioned in my first email, you can use hyperG2annaffy() in 
affycoretools. Alternatively you can always use write.table().
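
For the write.table() route, a rough sketch follows. It assumes that
probeSetSummary() returns a list of data frames, one per GO term, with
EntrezID and ProbeSetID columns; the toy list below only imitates that
shape, and its IDs are made up. Check str() of your own result before
relying on it.

```r
# Toy stand-in for probeSetSummary(hgOver): assumed to be a list of data
# frames named by GO ID (all IDs below are hypothetical).
toy_summary <- list(
  "GO:0005840" = data.frame(EntrezID   = c("1111", "2222"),
                            ProbeSetID = c("probe_a", "probe_b"),
                            stringsAsFactors = FALSE),
  "GO:0031981" = data.frame(EntrezID   = "3333",
                            ProbeSetID = "probe_c",
                            stringsAsFactors = FALSE)
)

# Stack the per-term tables, adding a column that records the GO ID.
stacked <- do.call(rbind, lapply(names(toy_summary), function(go) {
  data.frame(GOID = go, toy_summary[[go]], stringsAsFactors = FALSE)
}))

# Write one tab-delimited file; tempfile() is used only to keep the sketch tidy.
out_file <- tempfile(fileext = ".txt")
write.table(stacked, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

Stacking everything into one table keeps the output to a single file;
writing one file per GO term inside the lapply() would work just as well.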

Best,

Jim


>  
> Regards,
> Ingrid
> 
> 
> "James W. MacDonald" <jmacdon at med.umich.edu> writes:
> 
>  > Hi Ingrid,
>  >
>  > Ingrid H. G. Østensen wrote:
>  >> Hi
>  >>
>  >> I am testing for GO in my dataset and I am able to make HTML pages
>  >> that contain different types of information. But I was wondering if
>  >> there is some way to find out which genes are in the Count column? It
>  >> might say 2, but not which 2 genes.
>  >
>  > See probeSetSummary() in GOstats and hyperG2annaffy() in affycoretools.
>  > Note that for probeSetSummary() to work correctly you have to pass in a
>  > *named* vector of Entrez Gene IDs, which you can get by using unlist():
>  >
>  > my.named.probeids <- unlist(mget(probeID.vector,
>  > "chip.annotation.package.name"))
> 
> So assuming the OP is using GOstats, R-2.5.x, and the latest available
> version installed using biocLite...
> 
>   Please try
> 
>        help("HyperGResult-accessors")
> 
> + seth
> 
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> BioC: http://bioconductor.org/
> Blog: http://userprimary.net/user/
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


