[BioC] annotation in Ensembl using biomart

Jason Lu jasonlu68 at gmail.com
Thu Mar 11 19:32:37 CET 2010


Dear Dr Huber,

Thanks for the reply.

You are right that CDC2L1 is the previous name for CDK11B, and CDC2L2
for CDK11A. I guess I was confused by the output given by BioMart, in
which the pairing of old names with new names looks completely random
(see the previous post). Could this be an error in BioMart (a table join)?
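
To illustrate the table-join suspicion, here is a minimal Python sketch. It assumes that the symbol and the curated name are each stored as a separate one-to-many attribute of the gene and are joined only on the gene ID. The dictionaries and the function `join_on_gene` are hypothetical, built from the rows in the previous post; they say nothing about how BioMart is actually implemented.

```python
from itertools import product

# Hypothetical per-gene attribute lists, copied from the rows in the previous post.
symbols = {"ENSG00000008128": ["CDK11B", "CDK11A"]}
curated_names = {"ENSG00000008128": ["CDC2L2", "CDC2L1"]}


def join_on_gene(left, right):
    """Join two one-to-many attribute tables on the shared gene ID only."""
    rows = []
    for gene_id, left_values in left.items():
        # Without a key linking a symbol to its own old name,
        # every left value is paired with every right value.
        for a, b in product(left_values, right.get(gene_id, [])):
            rows.append((gene_id, a, b))
    return rows


rows = join_on_gene(symbols, curated_names)
for row in rows:
    print(*row)
```

Under that assumption, each transcript would show every symbol next to every old name, so the pairing within a row carries no information.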

Thanks again,
Jason


On Thu, Mar 11, 2010 at 12:42 PM, Wolfgang Huber <whuber at embl.de> wrote:
>
> Dear Jason
>
> a quick look at the HGNC website (http://www.genenames.org) will tell you
> that CDC2L1 is the previous name for CDK11B (the currently approved gene
> symbol), and similarly CDC2L2 for CDK11A. Furthermore, Ensembl as well as
> the UCSC genome browser in the meantime map them to the same place in the
> reference genome and consider them isoforms of the same gene:
> http://www.genenames.org/data/hgnc_data.php?hgnc_id=1729
> http://www.genenames.org/data/hgnc_data.php?hgnc_id=1730
>
> OTOH, Entrez and UniProt consider them separate genes ("Duplicated gene.
> CDK11A and CDK11B encode almost identical protein kinases of 110 kDa that
> ..."): http://www.uniprot.org/uniprot/Q9UQ88
>
> Biology, and the history of biological discovery, can be messy...
> Other people might have more insight, but I bet it is a long story :)
>
>        Wolfgang
>
>
> Jason Lu scripsit 11/03/10 17:06:
>>
>> Hi all,
>>
>> I wonder whether I could get help from this list. Sorry if this is a
>> duplicate question.
>>
>> I am confused by the following mapping (obtained from the BioMart
>> website). All rows share the same ENSG. My purpose is to match an ENSG
>> to a gene symbol.
>> Do you have any suggestions as to which one I should use?
>> Thanks,
>>
>>
>> Ensembl Gene ID  Ensembl Transcript ID  HGNC symbol  HGNC curated gene name
>> ENSG00000008128  ENST00000401097        CDK11B       CDC2L2
>> ENSG00000008128  ENST00000401097        CDK11A       CDC2L2
>> ENSG00000008128  ENST00000401097        CDK11B       CDC2L1
>> ENSG00000008128  ENST00000401097        CDK11A       CDC2L1
>> ENSG00000008128  ENST00000341832        CDK11B       CDC2L2
>> ENSG00000008128  ENST00000341832        CDK11A       CDC2L2
>> ENSG00000008128  ENST00000341832        CDK11B       CDC2L1
>> ENSG00000008128  ENST00000341832        CDK11A       CDC2L1
>> ENSG00000008128  ENST00000407249        CDK11B       CDC2L2
>> ENSG00000008128  ENST00000407249        CDK11A       CDC2L2
>>
>> Jason
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
>
> Best wishes
>     Wolfgang
>
>
> --
> Wolfgang Huber
> EMBL
> http://www.embl.de/research/units/genome_biology/huber/contact
>
>
>


