[BioC] Unique probes on all human affy chips

Robert Gentleman rgentlem at jimmy.harvard.edu
Sun Sep 19 19:15:03 CEST 2004


On Sun, Sep 19, 2004 at 05:49:08PM +0100, Adaikalavan Ramasamy wrote:
> You have not told us what type of data you are looking for. Do you want
> the probe sequences, GenBank identifiers or merely Affymetrix IDs? Do
> you know what a Chip Description File (CDF) is? For a brief explanation,
> see http://www.bioconductor.org/data/cdfenvs/desc.html
> 
> I do not know how to do this in Bioconductor. But if I need the
> annotation information, I get it directly from the Affymetrix website.
> 

  Yes, but I think the question was not about a single chip but about
  all chips. I don't think NetAffx helps with that; you need to do some
  computation.
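  For the probe set ID part of the question, the computation can be
  small. A minimal sketch, assuming the IDs for each chip have already
  been read into memory; the function name, chip names and IDs below are
  made up for illustration and are not real annotation:

```python
from typing import Dict, Iterable, List, Set


def unique_probe_set_ids(ids_by_chip: Dict[str, Iterable[str]]) -> List[str]:
    """Return every probe set ID that occurs on at least one chip, once each."""
    seen: Set[str] = set()
    for ids in ids_by_chip.values():
        # Adding to a set is O(1) on average, so the whole pass is linear in
        # the total number of IDs instead of comparing every ID to every other.
        seen.update(ids)
    return sorted(seen)


# Placeholder input: hypothetical chip names and IDs, not real data.
example = {
    "chipA": ["1000_at", "1001_at", "64474_g_at"],
    "chipB": ["64474_g_at", "200000_s_at"],
}
unique_ids = unique_probe_set_ids(example)
print(unique_ids)
```

  A hash-based set like this avoids the pairwise comparisons that can
  make a naive duplicate check run for hours.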

  I believe the question is about the 25-mers. In that case, dumping
  the CDF files (either from BioC or NetAffx) and loading them into a
  database is one step; from there I would rely on the merge
  capabilities of the database.
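  As a rough sketch of the database route, assuming the dumped files have
  been reduced to (chip, probe set, 25-mer) rows: the table layout, column
  names and rows below are assumptions for illustration only, and SQLite
  stands in for whatever database is actually used.

```python
import sqlite3

# Made-up rows: (chip, probe_set_id, 25-mer sequence). Real rows would come
# from the dumped files; these values are placeholders only.
rows = [
    ("chipA", "ps1_at", "ACGTACGTACGTACGTACGTACGTA"),
    ("chipA", "ps2_at", "TTTTTGGGGGCCCCCAAAAATTTTT"),
    ("chipB", "ps1_at", "ACGTACGTACGTACGTACGTACGTA"),
    ("chipB", "ps3_at", "GATTACAGATTACAGATTACAGATT"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE probes (chip TEXT, probe_set TEXT, sequence TEXT)")
conn.executemany("INSERT INTO probes VALUES (?, ?, ?)", rows)

# One row per distinct 25-mer, with the number of chips it appears on.
per_sequence = conn.execute(
    "SELECT sequence, COUNT(DISTINCT chip) AS n_chips "
    "FROM probes GROUP BY sequence ORDER BY sequence"
).fetchall()

# Self-join (the "merge") to find 25-mers shared between two different chips.
shared = conn.execute(
    "SELECT DISTINCT a.sequence FROM probes a "
    "JOIN probes b ON a.sequence = b.sequence AND a.chip < b.chip"
).fetchall()
conn.close()

print(per_sequence)
print(shared)
```

  With real data an index on the sequence column would likely be worth
  adding before running the join.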

  Robert



> 1) Select the human chip you want from 
> http://www.affymetrix.com/support/technical/byproduct.affx?cat=arrays&Human
> 2) Find the section called "NetAffx Annotation Files" or "Sequence
> Files" and select the format/file you want
> 3) At this stage you will be asked to log in. Registration is free.
> 
> Suppose you have downloaded the information of interest for all human
> arrays; you can then remove redundancies by using the Affymetrix ID, which
> is the unique identifier of a probe set.
> 
> 
> 
> On Sun, 2004-09-19 at 15:59, S Peri wrote:
> > Dear group,
> >  Is there any place where I can get all the unique
> > probe IDs for all the Affy human chips (~13 chips)?
> > I am trying to get the unique probes (no duplicates).
> > It turned out to be a very compute-intensive problem.
> > I took all the probes from all 13 chips and wrote a
> > program that writes each unique ID once. If there are
> > duplicates (e.g. 64474_g_at is on HG-U95C,
> > HGU133, HGU133A2, and 133_plus2), my
> > program will write 64474_g_at once in my output.
> > My C++ code has been running for the last 20 hrs. I
> > made sure there are no bad loops that would put me in
> > an infinite loop. It would be nice to have all
> > the unique IDs in some place where I can use them
> > directly for my annotation purposes.
> > 
> > Thanks
> > Peri.
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >
> 

-- 
+---------------------------------------------------------------------------+
| Robert Gentleman                 phone : (617) 632-5250                   |
| Associate Professor              fax:   (617)  632-2444                   |
| Department of Biostatistics      office: M1B20                            |
| Harvard School of Public Health  email: rgentlem at jimmy.harvard.edu        |
+---------------------------------------------------------------------------+


