[BioC] annotations for Codelink arrays

John Zhang jzhang at jimmy.harvard.edu
Mon Oct 17 15:39:33 CEST 2005


>So in this case, if some probes map to different Entrez Gene IDs (that 
>is the case for some of the MULTIPLE probes on these chips, at least with 
>the company mappings), then only one of the Entrez Gene IDs (the 
>smallest) will be taken. I will have to check the company's mappings of 
>these probes to Entrez Gene, or maybe not use them at all and rely on the 
>AnnBuilder method (the best way, I think).

One-to-many mappings are always a problem as far as annotation is concerned. 
AnnBuilder makes a choice (which may not be the best one) for the user when 
there are multiple Entrez Gene mappings for a given probe id. I would like to 
invite comments on the best way of handling this situation. 
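
To make the situation concrete, here is a minimal sketch in base R (it does 
not call AnnBuilder, and it is not AnnBuilder's own code). It flags probes 
with more than one candidate Entrez Gene ID and then keeps the smallest ID 
per probe, as described in the quoted message. The table `map`, its column 
names, and all ids are made up for illustration.

```r
# Hypothetical probe-to-Entrez Gene table; all ids are invented.
map <- data.frame(
  probe  = c("p1", "p1", "p2", "p3", "p3", "p3"),
  entrez = c(24152L, 1234L, 5678L, 900L, 25000L, 311L),
  stringsAsFactors = FALSE
)

# Number of distinct Entrez Gene IDs per probe.
n_ids <- tapply(map$entrez, map$probe, function(x) length(unique(x)))

# Probes with more than one candidate ID, kept aside for manual review.
ambiguous <- names(n_ids)[n_ids > 1]

# One ID per probe, keeping the smallest (assumed rule, see text above).
smallest <- tapply(map$entrez, map$probe, min)
```

Keeping `ambiguous` around lets you compare the automatic choice with the 
company mappings for just those probes instead of rechecking the whole chip.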


>
>But how can I use a mixture of GenBank ids (for most of the probes) and 
>UniGene ids (for some of them)? If I use "gb" as baseMapType I will not 
>get the mappings for the UniGene ids. If I use "ug", the same happens for 
>the GenBank ids. I cannot use the UniGene ids in otherSrc because it can 
>only use Entrez ids. I worked on this a little with no good result. This is 
>briefly what I do:

Currently there is no parser that handles both GB and UniGene ids. I will look 
into writing one. The workaround for now is probably to map by GB and UG 
separately and then merge the results.
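
A rough sketch of the merge step in base R, assuming each run has already 
been reduced to a two-column table of probe id and Entrez Gene ID. The data 
frames `by_gb` and `by_ug`, their column names, and the ids are hypothetical; 
preferring the GenBank-based ID is only one possible choice.

```r
# Hypothetical results of two separate runs (probe -> Entrez Gene ID).
by_gb <- data.frame(probe = c("p1", "p2", "p3"),
                    entrez_gb = c(1234L, 5678L, NA),
                    stringsAsFactors = FALSE)
by_ug <- data.frame(probe = c("p3", "p4"),
                    entrez_ug = c(311L, 42L),
                    stringsAsFactors = FALSE)

# Full outer join on probe id so that no probe from either run is lost.
merged <- merge(by_gb, by_ug, by = "probe", all = TRUE)

# Prefer the GenBank-based ID; fall back to the UniGene-based one.
merged$entrez <- ifelse(is.na(merged$entrez_gb),
                        merged$entrez_ug, merged$entrez_gb)

# Probes where both runs gave an ID but the two IDs disagree.
conflict <- !is.na(merged$entrez_gb) & !is.na(merged$entrez_ug) &
  merged$entrez_gb != merged$entrez_ug
```

Rows where `conflict` is TRUE are the ones worth checking by hand before 
building the final package.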

>
>gb.txt: File with mappings from probe ids to GenBank ids.
>Sometimes I used a file ll.txt with mappings from probe ids to LocusLink 
>ids (mappings from the company) in otherSrc.

It is always a good idea to include otherSrc. AnnBuilder has a voting mechanism 
that takes the mapping with the most votes from the different sources.
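
The idea of voting can be illustrated with a few lines of base R. This is 
only a sketch of the principle, not AnnBuilder's actual implementation: the 
function `pick_by_vote`, the source names, and the ids are hypothetical, and 
breaking ties by taking the smallest ID is an assumption.

```r
# Pick the ID reported by the most sources; NA means "no mapping".
pick_by_vote <- function(ids) {
  ids <- ids[!is.na(ids)]
  if (length(ids) == 0) return(NA_integer_)
  counts <- table(ids)
  winners <- as.integer(names(counts)[counts == max(counts)])
  min(winners)  # assumed tie-break: smallest ID
}

# Hypothetical candidate IDs for one probe from three sources.
votes <- c(gb = 1234L, ug = 1234L, company = 5678L)
best <- pick_by_vote(votes)
```

With more sources in otherSrc, fewer probes end up in a tie, which is why 
including the company mappings helps even if they are not fully trusted.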


>
>> library(AnnBuilder)
>> myBase <- file.path("gb.txt")
>> myBaseType <- "gb"
>> mySrcUrls <- getSrcUrl("all", organism="Rattus norvegicus")
>> myDir <- tempdir()
>> ABPkgBuilder(baseName=myBase, srcUrls=mySrcUrls, baseMapType=myBaseType, 
>> pkgPath=myDir, organism="Rattus norvegicus", ... other parameters ...)
>
>
>Thank you again for your help. I think this package is great and the best 
>way to deal with the nightmare of annotations out there.
>
>D.
>
>
>> >
>> >Thanks.
>> >
>> >D.
>> >
>> >On 13/10/2005, at 3:14, Robert Gentleman wrote:
>> >
>> >> Hi Tao,
>> >>   If the right set of mappings is available to get started, AnnBuilder
>> >> is pretty easy to use. We can help you with the first one or two, and
>> >> are happy to distribute them. If there is more widespread interest
>> >> (and they are stable) we can add them to the build process.
>> >>
>> >>   Robert
>> >>
>> >> Shi, Tao wrote:
>> >>
>> >>> Any plans to create annotation packages for Codelink arrays?
>> >>>
>> >>> ...Tao
>> >>>
>> >>> _______________________________________________
>> >>> Bioconductor mailing list
>> >>> Bioconductor at stat.math.ethz.ch
>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>>
>> >>>
>> >>
>> >> -- 
>> >> Robert Gentleman, PhD
>> >> Program in Computational Biology
>> >> Division of Public Health Sciences
>> >> Fred Hutchinson Cancer Research Center
>> >> 1100 Fairview Ave. N, M2-B876
>> >> PO Box 19024
>> >> Seattle, Washington 98109-1024
>> >> 206-667-7700
>> >> rgentlem at fhcrc.org
>> >>
>> >

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084


