[BioC] Does the X_laevis_2 cdf show X_laevis gen 1 probes?

Philip Cheung philip.p.cheung at gmail.com
Mon Nov 12 21:42:16 CET 2012


Hello bioconductor-help members,

My apologies if this email has arrived in the wrong group.  I read
over the posting guide and I wasn't sure if this was the correct forum
for this question -- if it is not, please let me know and I will
remove this posting and resubmit to the correct group.

Finally, apologies for the long post, but reproducing this problem
requires several steps.

Step 1) Download a sample X_laevis_2 CEL file from GEO

I chose the sample GSM579981

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM579981

Following the platform link shows that it is an X_laevis_2 experiment:
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL10756

For the sample file, I chose GSM579981.CEL.gz

You can download this file at this URL:
http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSM579981&file=GSM579981%2ECEL%2Egz

Unzip the file prior to reading it in.
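
For anyone reproducing this from a shell, the unzip step might look like the sketch below. The helper name unpack_cel is made up for illustration; it assumes only the standard gzip tool and leaves the downloaded .gz file in place.

```shell
# Decompress a gzipped CEL file next to the original, keeping the .gz.
# Prints the name of the unpacked file on success.
unpack_cel() {
    gz_file=$1
    # Refuse names without a .gz suffix so the input is never overwritten.
    case $gz_file in
        *.gz) ;;
        *) echo "expected a .gz file: $gz_file" >&2; return 1 ;;
    esac
    out_file=${gz_file%.gz}               # strip the trailing .gz
    gzip -t "$gz_file" || return 1        # fail early on a truncated download
    gzip -dc "$gz_file" > "$out_file" || return 1
    printf '%s\n' "$out_file"
}
```

After saving the file from the URL above, unpack_cel GSM579981.CEL.gz should leave GSM579981.CEL in the same directory, ready for ReadAffy.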


Step 2) Read in the CEL file and verify that it is an X_laevis_2 chip

> library(affy)
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

> chipData <- ReadAffy("GSM579981.CEL")
> chipData
AffyBatch object
size of arrays=984x984 features (16 kb)
cdf=X_laevis_2 (32635 affyids)
number of samples=1
number of genes=32635
annotation=xlaevis2
notes=


Step 3) Examine the Probes for the sample -- I jump directly into the
middle of the list to examine some probe names

> probes <- indexProbes(chipData,which="pm")

> probes[10000:10003]
$Xl.22397.1.S1_at
 [1]  28653 395253 596069 345939  37062 672483 385713 355738 658991 845464
[11] 215644 808173 660480 355294

$Xl.22404.1.S1_at
 [1] 751168 802621 258978 278901 770602 722241 468139  41006 622884 434793
[11] 552950 654955 717774 464009

$Xl.22404.2.A1_at
 [1] 403052 562419 290807 200686  31036 302164 198607 877595 238016 872893
[11] 151357 662822 465642  50403

$Xl.22413.2.S1_a_at
 [1] 776460 314189 678641 883234 210521  74592 808408  54975 857096 600001
[11] 298898 560070 544638 517362


Step 4) Map the probes back to the Affy Reference files

I compared these names against the probe data for the GeneChip Xenopus
laevis Genome 2.0 Array:
http://www.affymetrix.com/browse/products.jsp?productId=131525&navMode=34000&navAction=jump&aId=productsNav#1_1

These Xl.* probe set names are not in the Affymetrix probe file:
http://www.affymetrix.com/Auth/analysis/downloads/data/X_laevis_2.probe_tab.zip
(may require registration)

grep -n2 "Xl.22404." X_laevis_2.probe_tab

< NOTHING FOUND >
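
One caveat about the pattern: in grep an unescaped . matches any character, so "Xl.22404." can only match more lines than the literal string, never fewer, and the empty result stands. A stricter check treats the ID as a fixed string and prints a count, which makes it easy to run the same query against more than one probe file. A minimal sketch, where the function name count_probeset_lines is hypothetical and -c and -F are standard grep options:

```shell
# Count the lines in a file that contain a probe set ID as a literal string.
# Usage: count_probeset_lines ID FILE
count_probeset_lines() {
    id=$1
    file=$2
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # -F: fixed string, no regex; -c: print the number of matching lines.
    # grep still prints 0 when nothing matches but exits with status 1,
    # so that status is masked for callers running under "set -e".
    grep -cF -- "$id" "$file" || true
}
```

For example, count_probeset_lines 'Xl.22404.' X_laevis_2.probe_tab would print 0 if the ID is really absent from that file.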



Instead, these probe set names are in the probe file for the
generation 1 array, the GeneChip Xenopus laevis Genome Array:
http://www.affymetrix.com/esearch/search.jsp?pd=131526&N=4294967292


https://www.affymetrix.com/Auth/analysis/downloads/data/Xenopus_laevis.probe_fasta.zip
(May require Registration)


dsanphil01-2:Downloads pcheung$ grep -n2 "Xl.22404." Xenopus_laevis.probe_fasta
239203->probe:Xenopus_laevis:Xl.22388.1.S1_at:328:541; Interrogation_Position=756; Antisense;
239204-GGCCATGACTAGAGTCGTTCTTCCC
239205:>probe:Xenopus_laevis:Xl.22404.1.S1_at:212:39; Interrogation_Position=150; Antisense;
239206-ATCTGGCCCAAACCTTTTGAAACAG
239207:>probe:Xenopus_laevis:Xl.22404.1.S1_at:177:541; Interrogation_Position=153; Antisense;
239208-TGGCCCAAACCTTTTGAAACAGCAC
239209:>probe:Xenopus_laevis:Xl.22404.1.S1_at:171:151; Interrogation_Position=159; Antisense;
239210-AAACCTTTTGAAACAGCACCCGATG
239211:>probe:Xenopus_laevis:Xl.22404.1.S1_at:616:169; Interrogation_Position=160; Antisense;
239212-AACCTTTTGAAACAGCACCCGATGC
239213:>probe:Xenopus_laevis:Xl.22404.1.S1_at:630:109; Interrogation_Position=161; Antisense;
239214-ACCTTTTGAAACAGCACCCGATGCA
239215:>probe:Xenopus_laevis:Xl.22404.1.S1_at:543:671; Interrogation_Position=164; Antisense;
239216-TTTTGAAACAGCACCCGATGCACTC
239217:>probe:Xenopus_laevis:Xl.22404.1.S1_at:365:659; Interrogation_Position=165; Antisense;
239218-TTTGAAACAGCACCCGATGCACTCC
239219:>probe:Xenopus_laevis:Xl.22404.1.S1_at:141:105; Interrogation_Position=176; Antisense;
239220-ACCCGATGCACTCCAGCTTTGCTGG
239221:>probe:Xenopus_laevis:Xl.22404.1.S1_at:271:273; Interrogation_Position=178; Antisense;
239222-CCGATGCACTCCAGCTTTGCTGGTT
239223:>probe:Xenopus_laevis:Xl.22404.1.S1_at:126:299; Interrogation_Position=43; Antisense;
239224-GCGGAGGGCTGTTCATGTTGCATAT
239225:>probe:Xenopus_laevis:Xl.22404.1.S1_at:105:267; Interrogation_Position=44; Antisense;
239226-CGGAGGGCTGTTCATGTTGCATATT
239227:>probe:Xenopus_laevis:Xl.22404.1.S1_at:395:517; Interrogation_Position=45; Antisense;
239228-GGAGGGCTGTTCATGTTGCATATTG
239229:>probe:Xenopus_laevis:Xl.22404.1.S1_at:128:71; Interrogation_Position=47; Antisense;
239230-AGGGCTGTTCATGTTGCATATTGGA
239231:>probe:Xenopus_laevis:Xl.22404.1.S1_at:233:489; Interrogation_Position=48; Antisense;
239232-GGGCTGTTCATGTTGCATATTGGAA
239233:>probe:Xenopus_laevis:Xl.22404.1.S1_at:239:301; Interrogation_Position=50; Antisense;
239234-GCTGTTCATGTTGCATATTGGAATT
239235:>probe:Xenopus_laevis:Xl.22404.1.S1_at:389:445; Interrogation_Position=85; Antisense;
239236-GTAGGGTGCATAATATGCACAAGTC
239237->probe:Xenopus_laevis:Xl.22710.1.S1_at:596:377; Interrogation_Position=1190; Antisense;
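
The grep output above appears to contain 16 headers for Xl.22404.1.S1_at, while Step 3 listed 14 PM indices for that same ID, so a per-set probe count from the FASTA file may be a useful comparison. The sketch below assumes every header starts with ">probe:" and carries the probe set ID in the third colon-separated field, as in the output above; the function name probes_per_set is hypothetical.

```shell
# Print "probe_set_id<TAB>probe_count" for every probe set in a probe FASTA file.
# Usage: probes_per_set FILE
probes_per_set() {
    # Only header lines are counted; sequence lines are ignored.
    # awk iterates the array in no particular order, so the result is sorted.
    awk -F: '/^>probe:/ { n[$3]++ }
             END { for (id in n) printf "%s\t%d\n", id, n[id] }' "$1" | sort
}
```

For a single ID, the output can be filtered, for example with probes_per_set Xenopus_laevis.probe_fasta | grep -F 'Xl.22404.1.S1_at'.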


