[BioC] Source files for Agilent/Illumina Array packages?

Marc Carlson mcarlson at fhcrc.org
Fri Feb 4 23:22:00 CET 2011


Hi Denise,

The file used to map probe IDs to Entrez Gene IDs always comes from the
manufacturer.  I have no way of knowing why this manufacturer reassigned
the probe ID below between the two platforms, but that appears to be
what happened.  Gene assignments for a probe sometimes change as we
learn more about the underlying genome, and the manufacturer's files can
change to reflect that.
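
If you want to list every probe whose gene assignment differs between
two platforms, rather than checking one probe at a time, a self-join on
the probe table is one way to do it.  Here is a minimal Python sketch,
assuming a table laid out like the agilent_probe table in your query
below.  The function name find_mismatches and the in-memory SQLite
database are only illustrative assumptions; they are not part of any
Bioconductor package.

```python
import sqlite3

# Rows copied from the query output quoted in this thread.
ROWS = [
    ("mgug4121a", "A_51_P268529", "AF045741", 13423),
    ("MmAgilentDesign026655", "A_51_P268529", "NM_144942", 246277),
]


def find_mismatches(rows, old_chip, new_chip):
    """Return (probe_id, old_gene, new_gene) for every probe whose
    Entrez Gene ID differs between the two named chips."""
    # Hypothetical stand-in for the real database: an in-memory table
    # with the same columns as the quoted agilent_probe output.
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE agilent_probe ("
        "chip_id TEXT, probe_id TEXT, sequence_id TEXT, "
        "entrez_gene_eid INTEGER)"
    )
    con.executemany("INSERT INTO agilent_probe VALUES (?, ?, ?, ?)", rows)

    # Self-join on probe_id, keeping only pairs where the gene differs.
    query = """
        SELECT a.probe_id, a.entrez_gene_eid, b.entrez_gene_eid
        FROM agilent_probe AS a
        JOIN agilent_probe AS b ON a.probe_id = b.probe_id
        WHERE a.chip_id = ? AND b.chip_id = ?
          AND a.entrez_gene_eid <> b.entrez_gene_eid
        ORDER BY a.probe_id
    """
    result = con.execute(query, (old_chip, new_chip)).fetchall()
    con.close()
    return result


mismatches = find_mismatches(ROWS, "mgug4121a", "MmAgilentDesign026655")
for probe_id, old_gene, new_gene in mismatches:
    print(f"{probe_id}: {old_gene} -> {new_gene}")
```

With the two rows from your message this prints
"A_51_P268529: 13423 -> 246277".  The same SELECT should work against
your own table if the column names match.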

An important note about mgug4121a is that Agilent has not (as far as I
can find) updated the annotations for that platform in quite a while.
That makes me unhappy, because it means that the official "latest"
annotations for it are still pretty old.  Your finding suggests that
they should update these files and make them available, but given the
age of the platform I am not holding my breath.  So if I want to keep
some of these really old Agilent packages around, I may have to put a
warning label on them, since their annotations appear to be slowly
falling further out of date.  :(

Of these two platforms, the newer one (MmAgilentDesign026655) should be
the obvious one to prefer.  You can get the most recent files for it
from the eArray service at Agilent:

https://earray.chem.agilent.com/earray/

The microarray name that comes up when you search for this design at
Agilent is "Whole Mouse Genome Microarray 4x44K v2".

The matching GEO platform ID appears to be GPL10333.


I hope that this helps you,



  Marc





On 02/03/2011 11:26 AM, Denise Mauldin wrote:
> Hello all,
>
> As part of my attempt to connect the Agilent/Illumina array packages with
> their GPL IDs, I was wondering if someone could tell me how to get the
> information on what the source file is for a package?   For example, I'd
> like to know what the source file is for the package
> 'MmAgilentDesign026655.db' so that I can figure out if that corresponds to
> the GEO GPL ID 11202 or not.
>
> In addition, I have a mismatch in my data for a particular probe that one of
> my researchers is interested in:
>
> select * from agilent_probe where probe_id = 'A_51_P268529';
> +-----------------------+--------------+-------------+-----------------+
> | chip_id               | probe_id     | sequence_id | entrez_gene_eid |
> +-----------------------+--------------+-------------+-----------------+
> | mgug4121a             | A_51_P268529 | AF045741    |          13423  |
> | MmAgilentDesign026655 | A_51_P268529 | NM_144942   |         246277  |
> +-----------------------+--------------+-------------+-----------------+
>
> I'm presuming that this is in the source file used to generate the
> Bioconductor package, but I'd like to check that is the case.  What do
> people usually do about mismatches like this?
>
> Thanks,
> Denise