[BioC] homology package question

James W. MacDonald jmacdon at med.umich.edu
Wed Apr 18 22:55:12 CEST 2007


Hi Nianhua,

Nianhua Li wrote:
> Hi, James,
> 
> The source file of mmuhomology is
> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/hmlg.ftp (downloaded on 02/28/2007)
> and the description is
> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/README-old
> 
> According to the description, the 4th and 7th columns of hmlg.ftp are Entrez Gene
> IDs, and the 5th and 8th columns are internal HomoloGene IDs. If you look at the
> hmlg.ftp file, even the current one, you will find that the internal HomoloGene
> ID is the same as the Entrez Gene ID in most cases. That's why
> mmuhomologyHGID2LL and mmuhomologyLL2HGID look identical.

Odd. I wonder if they no longer even check to see if the data are 
correct. I checked several of the IDs, and AFAIK they really are Entrez 
Gene IDs, and they really are not HomoloGene IDs.
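
For anyone who wants to repeat the column comparison described above on a
local copy of hmlg.ftp, here is a rough sketch in Python. The column
positions (4/5 and 7/8, counted from 1) come from the description quoted
above; the "|" field separator and the sample rows are assumptions made
purely for illustration, so check them against the actual file first. The
function name is hypothetical.

```python
from typing import Iterable


def fraction_identical(lines: Iterable[str], col_a: int, col_b: int,
                       sep: str = "|") -> float:
    """Return the share of rows whose two columns (1-based) hold the same value.

    `sep` is an assumed field separator; adjust it to match the real file.
    Rows with too few fields are skipped.
    """
    total = 0
    same = 0
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split(sep)]
        if len(fields) < max(col_a, col_b):
            continue  # malformed or empty row
        total += 1
        if fields[col_a - 1] == fields[col_b - 1]:
            same += 1
    return same / total if total else 0.0


# Made-up rows for illustration only, not real HomoloGene records.
sample = [
    "9606|10090|B|1234|1234|0.9|5678|5678",
    "9606|10090|B|1111|2222|0.8|3333|3333",
]

print(fraction_identical(sample, 4, 5))  # first gene ID vs. first internal ID
print(fraction_identical(sample, 7, 8))  # second gene ID vs. second internal ID
```

To run it on the real file, pass an open file handle instead of `sample`,
e.g. `fraction_identical(open("hmlg.ftp"), 4, 5)`. A value close to 1.0
would match the observation that the two columns are nearly always the same.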

Anyway, it's really easy to get the mappings from biomaRt so that might 
be the direction to point people until we start using an updated source 
of these data.

Best,

Jim


> 
> I think we should update the homology packages in the near future to use a different
> data source, because the README file on this site says:
> 
> "The old HomoloGene FTP file formats (hmlg.ftp and hmlg.trip.ftp) are now
> deprecated.  They will be produced for the time being, to make the
> transition to the new file formats smoother, but will be discontinued 
> as of Jan. 1, 2007."
> 
> But we don't have time to make the changes for this release. Sorry...
> 
> best
> 
> nianhua
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

