[BioC] hyperGTest html report

James W. MacDonald jmacdon at med.umich.edu
Thu Jan 10 15:42:17 CET 2008


Sebastien Gerega wrote:
> Thanks for that!
> I can now almost get what I want.....
> Here is the code I use:
> 
> hgOver = hyperGTest(params)
> report = summary(hgOver, htmlLinks=TRUE)
> cats = sigCategories(hgOver)
> reportGenes = vector()
> 
> for(i in 1:length(cats)){
>     reportGenes = append(reportGenes, geneIdsByCategory(hgOver, cats[i]))
> }
> 
> This gives me reportGenes as a list something like this:
> 
> $`04650`
> [1] 10451  4277  5296  5880  6464  8743  8795  8797
> 
> $`04670`
> [1] 10451  1365  5296  5829  5880  6387  6494    87  9564
> 
> $`00150`
> [1]  3291 51451  6715
> 
> $`04080`
> [1]  154 2150 4886 4923 7433
> 
> $`04360`
> [1] 10512  1969  2043 56920 57522 57556  5880  6387
> 
> I would then like to run the following code:
> 
> report <- data.frame(report, reportGenes)
> xtab <- xtable(report, caption="A Caption")
> print(xtab, type="html", file="Afile.html", caption.placement="top", 
> sanitize.text.function=function(x) x, include.rownames=FALSE)
> 
> But I get the following error:
> Error in data.frame("04650" = c(10451L, 4277L, 5296L, 5880L, 6464L, 
> 8743L,  :
>   arguments imply differing number of rows: 8, 9, 3, 5, 7

This is the part where I said you have to wrap the Entrez Gene IDs in 
<P>EGID</P> and collapse them into one string per category, so you can 
a.) have a vector of the correct length (one element per row of the 
summary), and b.) create a table that will be readable.

Something like this should suffice:

rg.out <- sapply(reportGenes, function(x)
  paste("<P>", paste(x, collapse="</P><P>"), "</P>", sep=""))

then use rg.out in lieu of reportGenes when making the data.frame.
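
To make that concrete, here is a minimal sketch in base R. The list is a toy 
stand-in for reportGenes (values copied from the output quoted above), and the 
small 'report' data.frame is a hypothetical stand-in for the summary() output. 
The column name KEGGID is an assumption, so check names(report) on your own 
object. Indexing rg.out by the category ID, rather than relying on position, 
guards against the summary rows being in a different order than the list.

```r
# Toy stand-in for the list built from geneIdsByCategory()
reportGenes <- list(
  "04650" = c(10451L, 4277L, 5296L, 5880L, 6464L, 8743L, 8795L, 8797L),
  "04670" = c(10451L, 1365L, 5296L, 5829L, 5880L, 6387L, 6494L, 87L, 9564L),
  "00150" = c(3291L, 51451L, 6715L)
)

# Hypothetical stand-in for summary(hgOver, htmlLinks=TRUE); the real table
# has more columns, and the ID column name here is an assumption.
report <- data.frame(KEGGID = c("04670", "00150", "04650"),
                     Count  = c(9L, 3L, 8L),
                     stringsAsFactors = FALSE)

# One string per category, each ID wrapped in <P>...</P>
rg.out <- sapply(reportGenes, function(x)
  paste("<P>", paste(x, collapse = "</P><P>"), "</P>", sep = ""))

# Look up by category ID so each string lands on the matching row
report$EntrezIDs <- unname(rg.out[report$KEGGID])

# Every row should have found its gene string
stopifnot(!any(is.na(report$EntrezIDs)))
```

The resulting 'report' has one row per category and can be passed to xtable() 
and print() exactly as in your code above.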

Best,

Jim


> 
> How should I deal with this list so that I can add it to the data.frame? 
> And are there any faster ways to do what I have done in this code?
> I am still getting used to R.
> thanks heaps,
> Sebastien
> 
> James W. MacDonald wrote:
>> Hi Sebastien,
>>
>> Maybe not directly, but note that htmlReport() is simply using xtable 
>> to create the HTML page using the output from summary(). So you could 
>> just create the table and then add a column of Entrez Gene IDs and 
>> then output the result.
>>
>> Say your GOHyperGResult object is called 'hyp':
>>
>> out <- summary(hyp, htmlLinks=TRUE, categorySize=10)
>>
>> Note that the categorySize argument isn't necessary, but does protect 
>> you from choosing arguably spurious results (like a GO term with 3 
>> genes in the universe and 1 that was significant).
>>
>> Now you are going to have to create a vector containing all the Entrez 
>> Gene IDs for each GO term, one element per term. For this to work in 
>> HTML, you will also need to wrap each ID as <P>EntrezGeneID</P>, so 
>> you will need to either cat() or paste() things together. Once you 
>> have that, just add it to the data.frame created above:
>>
>> out <- data.frame(out, entregeneidvector)
>> xtab <- xtable(out, caption="A Caption", digits=rep(c(3,0), c(4,8)))
>> print(xtab, type="html", file="A file name.html", 
>> caption.placement="top", sanitize.text.function=function(x) x, 
>> include.rownames=FALSE)
>>
>> HOWEVER, that might not really be what you want, as it will obviously 
>> be a bit of work, and could get really messy if there are dozens of 
>> Entrez Gene IDs for a particular GO term. An alternative is to output 
>> individual HTML tables for each GO term of interest that list out the 
>> probesets that contributed to the significance of that term. For that 
>> you might want to look at hyperGoutput() in the affycoretools package.
>>
>> Best,
>>
>> Jim
>>
>>
>> Sebastien Gerega wrote:
>>> Hi,
>>> is there any way to get additional information into the hyperGTest 
>>> html report?
>>> Specifically, I would like to include the Entrez IDs for the genes 
>>> contributing to
>>> each overrepresented GO term.
>>> thanks,
>>> Sebastien
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


