[BioC] annotations for Codelink arrays
Diego Diez Ruiz
ddiez at iib.uam.es
Sat Oct 15 20:34:44 CEST 2005
Thank you very much for your response. I have some questions about your advice:
On Fri, 14 Oct 2005, John Zhang wrote:
> >1) In the case of MULTIPLE probes: Can AnnBuilder detect when a
> >coherent mapping from different GenBank accession numbers to Entrez
> >Gene exists and then use this mapping? Or, when it finds two GenBank
> >accessions associated with one probe, does it skip the mapping altogether?
> When a probe is mapped to multiple Genbank Accession numbers (separated by a ";"
> in the base file), AnnBuilder tries to get the mappings of these GB numbers to
> Entrez ids using both UniGene and Entrez as the source and then figures out if
> the two sources agree. If they do not agree, the one with the smallest Entrez id
> is used. Based on my previous experience, the two sources usually agree except
> for ESTs that can only be mapped by UniGene.
So in this case, if some probes map to different Entrez Gene IDs (as is
the case for some of the MULTIPLE probes on these chips, at least with
the company mappings), then only one of the Entrez Gene IDs (the
smallest) will be taken. I will have to check the company's mappings for
these probes to Entrez Gene, or maybe not use them at all and rely on
AnnBuilder's method (the best way, I think).
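
The rule described above can be sketched in plain R as follows. This is only an illustration, not AnnBuilder's actual code: the function name `resolveEntrez`, the two lookup vectors, and all accessions and IDs are hypothetical. As a simplification, it treats any disagreement (between sources or between accessions of the same probe) the same way and falls back to the smallest Entrez ID.

```r
# Hypothetical lookup tables: GenBank accession -> Entrez Gene ID.
# All accessions and IDs below are made up for illustration.
ugMap <- c(AA000001 = "100", AA000002 = "250")  # as derived via UniGene
egMap <- c(AA000001 = "100", AA000002 = "75")   # as derived via Entrez Gene

# Resolve one probe whose base-file entry lists accessions separated by ";".
resolveEntrez <- function(accessions, ugMap, egMap) {
  gb <- trimws(strsplit(accessions, ";", fixed = TRUE)[[1]])
  # Collect every Entrez ID that either source reports for these accessions.
  ids <- unique(c(ugMap[gb], egMap[gb]))
  ids <- as.integer(ids[!is.na(ids)])
  if (length(ids) == 0) return(NA_integer_)
  # A single ID means everything agrees; otherwise take the smallest.
  min(ids)
}

resolveEntrez("AA000001", ugMap, egMap)           # sources agree
resolveEntrez("AA000001;AA000002", ugMap, egMap)  # conflicting IDs
resolveEntrez("ZZ999999", ugMap, egMap)           # unknown accession
```

Comparing the output of such a function against the company's own Entrez Gene column would show which MULTIPLE probes are actually affected by the tie-breaking.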
> >2) For the CODELINK_UNIQUE: Until we can get the mappings to GenBank
> >accessions, is there any possibility of using the mappings to UniGene?
> Yes, UniGene id can be used. Use "ug" for baseMapType when calling ABPkgBuilder.
But how can I use a mixture of GenBank IDs (for most of the probes) and
UniGene IDs (for some of them)? If I use "gb" as baseMapType, I will not
get the mappings for the UniGene IDs. If I use "ug", the same happens for
the GenBank IDs. I cannot put the UniGene IDs in otherSrc because it
accepts only Entrez IDs. I have worked on this a little without good
results. Briefly, this is what I do:
gb.txt: file with mappings from probe IDs to GenBank IDs.
ll.txt: file with mappings from probe IDs to LocusLink IDs (the
company's mappings), which I sometimes pass in otherSrc.
> library(AnnBuilder)
> myBase <- file.path("gb.txt")
> myBaseType <- "gb"
> mySrcUrls <- getSrcUrl("all", organism="Rattus norvegicus")
> myDir <- tempdir()
> ABPkgBuilder(baseName=myBase, srcUrls=mySrcUrls, baseMapType=myBaseType,
+     pkgPath=myDir, organism="Rattus norvegicus", ... other parameters ...)
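
One possible workaround, shown only as a sketch, is to split the mixed table by ID type so that each part can be supplied with its own baseMapType ("gb" or "ug"). Whether the two resulting builds can then be combined into a single package is not something this thread answers. The probe names, IDs, and output file names below are hypothetical, and the two-column, tab-delimited, header-less layout is an assumption about the base file format.

```r
# Hypothetical probe-to-ID table mixing GenBank accessions and UniGene IDs.
# Probe names and IDs are made up for illustration.
base <- data.frame(
  probe = c("P1", "P2", "P3", "P4"),
  id    = c("AA000001", "Rn.12345", "AA000002;AA000003", "Rn.678"),
  stringsAsFactors = FALSE
)

# Rat UniGene cluster IDs have the form "Rn.<number>"; everything else is
# assumed here to be a GenBank accession (or a ";"-separated list of them).
isUg <- grepl("^Rn\\.[0-9]+$", base$id)

# Write one tab-delimited, header-less file per ID type.
outDir <- tempdir()
gbFile <- file.path(outDir, "gb.txt")
ugFile <- file.path(outDir, "ug.txt")
write.table(base[!isUg, ], gbFile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
write.table(base[isUg, ], ugFile, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The file in `gbFile` would then be the base file for a "gb" build and the one in `ugFile` for a "ug" build.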
Thank you again for your help. I think this package is great and the best
way to deal with the nightmare of annotations out there.
> >On 13/10/2005, at 3:14, Robert Gentleman wrote:
> >> Hi Tao,
> >> If the right set of mappings is available to get started, AnnBuilder
> >> is pretty easy to use. We can help you with the first one or two, and
> >> are happy to distribute them. If there is more widespread interest
> >> (and they are stable) we can add them to the build process.
> >> Robert
> >> Shi, Tao wrote:
> >>> Any plans to create annotation packages for Codelink arrays?
> >>> ...Tao
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> --
> >> Robert Gentleman, PhD
> >> Program in Computational Biology
> >> Division of Public Health Sciences
> >> Fred Hutchinson Cancer Research Center
> >> 1100 Fairview Ave. N, M2-B876
> >> PO Box 19024
> >> Seattle, Washington 98109-1024
> >> 206-667-7700
> >> rgentlem at fhcrc.org
> Jianhua Zhang
> Department of Medical Oncology
> Dana-Farber Cancer Institute
> 44 Binney Street
> Boston, MA 02115-6084