[BioC] ath1121501probe_1.0 error (was GCRMA missing value error on ATH1 chip)

Matthew Hannah Hannah at mpimp-golm.mpg.de
Fri Feb 13 12:59:48 MET 2004


I've investigated this some more and found that the ATH1-121501_probe_tab.zip
file from the Affymetrix website contains 251,121 sequences, whilst the CEL files
and the ATH1-121501_probe_fasta.zip file contain only 251,078 probes, a
difference of 43. It therefore seems that the errors were already in the tab
file before the BioC ath1121501probe package was made. I've emailed Affymetrix
about it but don't expect a quick response, judging from past queries.

So does anyone know how to find the extra entries in the tab file? It doesn't
look like they were simply added at the start or the end. Does anyone
familiar with R know how to obtain a list of Affy ID vs. number of probes, either
from the ath1121501probe package or by reading in the ATH1-121501_probe_tab file?
This would be easy to cross-reference with the Affy ID vs. probe number list that
you get from the CEL file during MAS5 analysis.
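
For what it's worth, the comparison I have in mind could be sketched roughly as
below. This is only an untested outline, not a confirmed fix: the helper names
count_probes and compare_counts are made up for illustration, and I am assuming
the probe package data frame has a Probe.Set.Name column and that indexProbes()
returns a list named by probe set. The toy vectors just stand in for the real
data.

```r
# Hypothetical helper: tabulate the number of probes per probe set ID.
count_probes <- function(probe_set_names) {
  tab <- table(probe_set_names)
  data.frame(id = names(tab), n = as.integer(tab), stringsAsFactors = FALSE)
}

# Hypothetical helper: return the probe sets whose counts differ between the
# probe table and a reference, or that appear in only one of the two.
compare_counts <- function(probe_counts, ref_counts) {
  m <- merge(probe_counts, ref_counts, by = "id", all = TRUE,
             suffixes = c(".probe", ".ref"))
  m[is.na(m$n.probe) | is.na(m$n.ref) | m$n.probe != m$n.ref, ]
}

# Toy data standing in for ath1121501probe$Probe.Set.Name and for the
# per-probe-set PM counts taken from the CEL/CDF side.
toy_names <- c("A_at", "A_at", "B_at", "B_at", "B_at", "C_at")
toy_ref <- data.frame(id = c("A_at", "B_at"), n = c(2L, 2L),
                      stringsAsFactors = FALSE)
mismatch <- compare_counts(count_probes(toy_names), toy_ref)
print(mismatch)

# With the real data (assumed, not run here) the inputs might be built as:
#   probe_counts <- count_probes(ath1121501probe$Probe.Set.Name)
#   pm <- sapply(indexProbes(object, "pm"), length)
#   ref_counts <- data.frame(id = names(pm), n = as.integer(pm),
#                            stringsAsFactors = FALSE)
```

Any rows that come back would be the probe sets where the tab file and the chip
disagree, which should narrow down where the 43 extra sequences sit.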

Has this been an issue for any other chips? Are we just trusting Affymetrix to
provide the correct sequence data? I've seen some data showing that ~700 ATH1
probesets don't match their intended target when an independent BLAST was done.


> There seems to be a disagreement on how many PM probes there are on the
> chip. This is causing problems in matching the PM intensities with
> sequences. I am not sure if this is true for all ATH1 chips...
>  After reading in your CEL file into "object",
>  pmIndex <- unlist(indexProbes(object, "pm"))
>  length(pmIndex)
>  # [1] 251078
>  # however the probe package gives 251121 PM probe sequences.
>  length(get("ath1121501probe")$sequence)
>  # [1] 251121
>  Right now I am not sure which should be fixed: whether the probe
> package has some redundant sequences that are not PM probes, or whether
> indexProbes missed some PM probes.
