[BioC] HuGene annotation and htmls

cstrato cstrato at aon.at
Fri Apr 10 21:22:09 CEST 2009


Dear Mayte

Everything is fine with your code, nothing to worry about.

If you look at column "gene_assignment" of 
"HuGene-1_0-st-v1.na28.hg18.transcript.csv" you will see many NAs, e.g.:

 > getSYMBOL("7896740", "hugene10st")
 7896740
"OR4F17"
 > getSYMBOL("7896746", "hugene10st")
7896746
     NA
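
To list which transcript cluster IDs lack a symbol, rather than only counting them, something like the following might help. It is only a sketch: `split_by_symbol` is a hypothetical helper name, not part of annotate or xps, and the toy vector stands in for the named character vector that getSYMBOL() returns (its two entries are copied from the example above).

```r
# Toy stand-in for the result of getSYMBOL(keys, "hugene10st"):
# a character vector named by transcript cluster ID, NA where no symbol exists.
symbols <- c("7896740" = "OR4F17", "7896746" = NA)

# Hypothetical helper (not from any package): split the IDs by whether
# a gene symbol was found for them.
split_by_symbol <- function(symbols) {
  has_symbol <- !is.na(symbols)
  list(annotated   = names(symbols)[has_symbol],
       unannotated = names(symbols)[!has_symbol])
}

res <- split_by_symbol(symbols)
res$annotated    # IDs with a symbol
res$unannotated  # IDs without a symbol
```

With real data, and with annotate and hugene10st.db loaded, the same helper could be called as split_by_symbol(getSYMBOL(keys, "hugene10st")).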

Best regards
Christian


Mayte Suarez-Farinas wrote:
> You are right, James!
> With the keys James sent, the hugene10st package works just fine,
> so it looks like the "error" comes from my use of xps.
>
> here is my code:
>
> library(xps)
>
> ### define directories:
> # directory containing Affymetrix library files
> libdir <- "/Users/Mayte/Rlibrary/AffyDB/libraryfiles"
> anndir <- "/Users/Mayte/Rlibrary/AffyDB/Annotation"
> scmdir <- "/Users/Mayte/Rlibrary/AffyDB/ROOTSchemes"
>
> scheme.hugene10stv1r4.na28 <- import.exon.scheme("Scheme_HuGene10stv1r4_na28", filedir=scmdir,
>     layoutfile=paste(libdir, "HuGene-1_0-st-v1.r4.clf", sep="/"),
>     schemefile=paste(libdir, "HuGene-1_0-st-v1.r4.pgf", sep="/"),
>     probeset=paste(anndir, "HuGene-1_0-st-v1.na28.hg18.probeset.csv", sep="/"),
>     transcript=paste(anndir, "HuGene-1_0-st-v1.na28.hg18.transcript.csv", sep="/"))
>
> scheme.hugene10stv1r4 <- root.scheme(paste(scmdir, "Scheme_HuGene10stv1r4_na28.root", sep="/"))
> G1ST_data <- import.data(scheme.hugene10stv1r4, "Pamela_G1ST_dataxps",
>     celdir=getwd(), celfiles=as.character(PD[1:8,'CELfile']), verbose=FALSE)
> G1ST_rma_xps <- rma(G1ST_data, "Pamela_G1ST_rma_t",
>     background="antigenomic", option="transcript", exonlevel="core+affx",
>     normalize=TRUE)
>
> The "featureNames" of the data (or keys) can be  taken as:
>
> keys<-as.character(exprs(G1ST_rma_xps)$UnitName)
>
> but almost a third of them do not have a symbol:
>
> sum(!is.na(getSYMBOL(keys, "hugene10st")))
> [1] 19899
> sum(is.na(getSYMBOL(keys, "hugene10st")))
> [1] 9027
>
> Is this OK? Or is there a mistake in my code?
>
> Thanks in advance for everybody's help,
> and sorry for bothering you so many times!
>
> Mayte
>
> On Apr 10, 2009, at 10:55 AM, James W. MacDonald wrote:
>
>   
>> I wonder if this is a problem with how the package was built. The  
>> numbers that Marc supplied are the Exon Probeset IDs, but the Lkeys  
>> of the hugene10st.db package seem to be what Affy calls the  
>> Transcript Cluster ID.
>>
>>     
>>> keys <- c("7903188","7903203")
>>> getSYMBOL(keys, "hugene10st")
>>>       
>> 7903188 7903203
>> "PTBP2"  "SNX7"
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> Mayte Suarez-Farinas wrote:
>>     
>>> I meant that the usual functions from annotate do not work.
>>> When I ran your code, I got:
>>> library("annotate")
>>>  > library("hugene10st.db")
>>>  > keys = c("7903193","7903204")
>>>  >
>>>  > getSYMBOL(keys, "hugene10st")
>>> 7903193 7903204
>>>      NA      NA
>>>  >
>>>  > lookUp(keys, "hugene10st" , "CHR")
>>> $`7903193`
>>> [1] NA
>>> $`7903204`
>>> [1] NA
>>>  > lookUp(keys, "hugene10st" , "ENTREZID")
>>> $`7903193`
>>> [1] NA
>>> $`7903204`
>>> [1] NA
>>> sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-apple-darwin8.11.1
>>> locale:
>>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>> attached base packages:
>>> [1] splines   tools     stats     graphics  grDevices utils      
>>> datasets  methods   base
>>> other attached packages:
>>>  [1] hugene10st.db_1.0.2  statmod_1.3.8         
>>> beadarray_1.10.0     sma_0.5.15           hwriter_1.0
>>>  [6] affycoretools_1.14.1 annaffy_1.14.0        
>>> KEGG.db_2.2.5        biomaRt_1.16.0       GOstats_2.8.0
>>> [11] Category_2.8.4       RBGL_1.18.0           
>>> GO.db_2.2.5          RSQLite_0.7-1        DBI_0.2-4
>>> [16] graph_1.20.0         limma_2.16.4          
>>> affyQCReport_1.20.0  geneplotter_1.20.0   annotate_1.20.1
>>> [21] AnnotationDbi_1.5.18 lattice_0.17-17       
>>> RColorBrewer_1.0-2   affyPLM_1.18.1       preprocessCore_1.4.0
>>> [26] xtable_1.5-4         simpleaffy_2.18.0     
>>> gcrma_2.14.1         matchprobes_1.14.1   genefilter_1.22.0
>>> [31] survival_2.34-1      affy_1.20.2          Biobase_2.2.2
>>> loaded via a namespace (and not attached):
>>> [1] GSEABase_1.4.0     KernSmooth_2.22-22 RCurl_0.94-1        
>>> XML_2.1-0          affyio_1.10.1
>>> [6] cluster_1.11.11    grid_2.8.1         xps_1.2.8
>>> On Apr 9, 2009, at 5:26 PM, Marc Carlson wrote:
>>>       
>>>> Hi Mayte,
>>>>
>>>> I can't tell from your post what you tried to do, or even what exactly
>>>> you need to know.  Please give us the code you were trying to use, along
>>>> with an example that didn't behave the way you expected it to and
>>>> the results of calling sessionInfo() after you did that. You can find
>>>> other helpful tips on the posting guide:
>>>>
>>>> http://www.bioconductor.org/docs/postingGuide.html
>>>>
>>>> What little I can discern from your post I will try to answer.  To use
>>>> getSYMBOL() or lookUp(), you need to 1st of all make sure that you have
>>>> loaded the annotate package.  Then you need to call it correctly.  Here
>>>> is an example that I did using the very latest version of the
>>>> hugene10st.db package.
>>>>
>>>> library("annotate")
>>>> library("hugene10st.db")
>>>> keys = c("7903193","7903204")
>>>>
>>>> getSYMBOL(keys, "hugene10st")
>>>>
>>>> lookUp(keys, "hugene10st" , "CHR")
>>>> lookUp(keys, "hugene10st" , "ENTREZID")
>>>>
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>>
>>>>
>>>>   Marc
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Mayte Suarez-Farinas wrote:
>>>>         
>>>>> I am learning to work with the HuGene ST1 chips.
>>>>> I was able to use xps to read and preprocess the files,
>>>>> and then I converted to the ExpressionSet class to use limma
>>>>> for modelling.
>>>>> I am stuck at the next step: the annotation.
>>>>> I load library("hugene10st.db"), but the normal functions
>>>>> to create HTML annotation do not seem to work on this chip.
>>>>> I also tried to get each component using getSYMBOL and lookUp,
>>>>> with no success.
>>>>> What's the way to go?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Mayte
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>>
>>>>>           
>> -- 
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>     
>
>


