[BioC] Does the X_laevis_2 cdf show X_laevis gen 1 probes?

James W. MacDonald jmacdon at uw.edu
Tue Nov 13 00:12:55 CET 2012


Hi Philip,

On 11/12/2012 3:42 PM, Philip Cheung wrote:
> Hello bioconductor-help members,
>
> My apologies if this email has arrived in the wrong group.  I read
> over the posting guide and I wasn't sure if this was the correct forum
> for this question -- if it is not, please let me know and I will
> remove this posting and resubmit to the correct group.
>
> Finally, apologies for the long post, but to reproduce this problem
> requires several steps.

Actually it doesn't take any of those steps.

library(BiocInstaller)
biocLite(c("xlaevis2cdf","xlaevis2probe"))
library(xlaevis2cdf)
library(xlaevis2probe)
x <- ls(xlaevis2cdf)
y <- unique(as.character(xlaevis2probe$Probe.Set.Name))

sum(x %in% y)
# 160
length(x)
# 32635

So yeah. Only 160 of the 32635 probeset names in xlaevis2cdf also appear 
in xlaevis2probe. It appears the xlaevis2 cdf that I have been using is an 
older version, which contains lots of probesets named Xl.xxx, whereas the 
current version has probesets named Xl2.xxx.
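
One way to see the two naming schemes side by side is to tabulate the 
prefixes of the names that appear in only one package. This is just a 
sketch: the vectors below are small made-up stand-ins (hypothetical IDs), 
and prefix() is a helper I am defining here, not part of any package. With 
the packages loaded you would use the x and y computed above instead.

```r
# Toy stand-ins for the real vectors (hypothetical values, illustration only);
# with the packages loaded, x and y would come from the code above.
x <- c("Xl.22397.1.S1_at", "Xl.22404.1.S1_at", "AFFX-BioB-3_at")
y <- c("Xl2.1.1.S1_at", "Xl2.2.1.S1_at", "AFFX-BioB-3_at")

# Probeset names present in only one of the two packages
only_cdf   <- setdiff(x, y)
only_probe <- setdiff(y, x)

# Hypothetical helper: keep the text before the first "." or "-"
prefix <- function(ids) sub("[.-].*$", "", ids)

# Counts per prefix show which naming scheme each package uses
table(prefix(only_cdf))
table(prefix(only_probe))
```

If the cdf and probe packages were built from the same version of the 
array definition, both setdiff() results should be empty.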

I have sent the corrected version to Marc Carlson, who is in charge of 
getting such things on the website and available to download. I assume 
this will happen in the next few days.

Thanks for pointing that out!

Best,

Jim

>
> Step 1) Download a sample X_laevis_2 CEL file from GEO
>
> I chose the experiment GSM579981
>
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM579981
>
> Clicking on the platform shows that it is an X_laevis_2 experiment
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL10756
>
> For our sample file, I chose GSM579981.CEL.gz
>
> You can download this file at this url:
> http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSM579981&file=GSM579981%2ECEL%2Egz
>
> Unzip the file prior to reading it in.
>
>
> Step 2) Read in the CEL file and verify that it is an X_laevis_2 chip
>
>> library(affy)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>    Vignettes contain introductory material. To view, type
>    'openVignette()'. To cite Bioconductor, see
>    'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> chipData<-ReadAffy("GSM579981.CEL")
>> chipData
> AffyBatch object
> size of arrays=984x984 features (16 kb)
> cdf=X_laevis_2 (32635 affyids)
> number of samples=1
> number of genes=32635
> annotation=xlaevis2
> notes=
>
>
> Step 3) Examine the probes for the sample -- I jump directly into the
> middle of the list to examine some probeset names
>
>> probes<- indexProbes(chipData,which="pm")
>> probes[10000:10003]
> $Xl.22397.1.S1_at
>   [1]  28653 395253 596069 345939  37062 672483 385713 355738 658991 845464
> [11] 215644 808173 660480 355294
>
> $Xl.22404.1.S1_at
>   [1] 751168 802621 258978 278901 770602 722241 468139  41006 622884 434793
> [11] 552950 654955 717774 464009
>
> $Xl.22404.2.A1_at
>   [1] 403052 562419 290807 200686  31036 302164 198607 877595 238016 872893
> [11] 151357 662822 465642  50403
>
> $Xl.22413.2.S1_a_at
>   [1] 776460 314189 678641 883234 210521  74592 808408  54975 857096 600001
> [11] 298898 560070 544638 517362
>
>
> Step 4) Map the probes back to the Affy Reference files
>
> Examining the probe data from GeneChip Xenopus laevis Genome 2.0 Array
> http://www.affymetrix.com/browse/products.jsp?productId=131525&navMode=34000&navAction=jump&aId=productsNav#1_1
>
> These Xl.* probesets are not in the Affymetrix probe file:
> http://www.affymetrix.com/Auth/analysis/downloads/data/X_laevis_2.probe_tab.zip
> (May require Registration)
>
> grep -n2 "Xl.22404." X_laevis_2.probe_tab
>
> <  NOTHING FOUND>
>
>
>
> Instead, these probe names are actually in the Gen 1 -- GeneChip
> Xenopus laevis Genome Array
> http://www.affymetrix.com/esearch/search.jsp?pd=131526&N=4294967292
>
>
> https://www.affymetrix.com/Auth/analysis/downloads/data/Xenopus_laevis.probe_fasta.zip
> (May require Registration)
>
>
> dsanphil01-2:Downloads pcheung$ grep -n2 "Xl.22404." Xenopus_laevis.probe_fasta
> 239203->probe:Xenopus_laevis:Xl.22388.1.S1_at:328:541;
> Interrogation_Position=756; Antisense;
> 239204-GGCCATGACTAGAGTCGTTCTTCCC
> 239205:>probe:Xenopus_laevis:Xl.22404.1.S1_at:212:39;
> Interrogation_Position=150; Antisense;
> 239206-ATCTGGCCCAAACCTTTTGAAACAG
> 239207:>probe:Xenopus_laevis:Xl.22404.1.S1_at:177:541;
> Interrogation_Position=153; Antisense;
> 239208-TGGCCCAAACCTTTTGAAACAGCAC
> 239209:>probe:Xenopus_laevis:Xl.22404.1.S1_at:171:151;
> Interrogation_Position=159; Antisense;
> 239210-AAACCTTTTGAAACAGCACCCGATG
> 239211:>probe:Xenopus_laevis:Xl.22404.1.S1_at:616:169;
> Interrogation_Position=160; Antisense;
> 239212-AACCTTTTGAAACAGCACCCGATGC
> 239213:>probe:Xenopus_laevis:Xl.22404.1.S1_at:630:109;
> Interrogation_Position=161; Antisense;
> 239214-ACCTTTTGAAACAGCACCCGATGCA
> 239215:>probe:Xenopus_laevis:Xl.22404.1.S1_at:543:671;
> Interrogation_Position=164; Antisense;
> 239216-TTTTGAAACAGCACCCGATGCACTC
> 239217:>probe:Xenopus_laevis:Xl.22404.1.S1_at:365:659;
> Interrogation_Position=165; Antisense;
> 239218-TTTGAAACAGCACCCGATGCACTCC
> 239219:>probe:Xenopus_laevis:Xl.22404.1.S1_at:141:105;
> Interrogation_Position=176; Antisense;
> 239220-ACCCGATGCACTCCAGCTTTGCTGG
> 239221:>probe:Xenopus_laevis:Xl.22404.1.S1_at:271:273;
> Interrogation_Position=178; Antisense;
> 239222-CCGATGCACTCCAGCTTTGCTGGTT
> 239223:>probe:Xenopus_laevis:Xl.22404.1.S1_at:126:299;
> Interrogation_Position=43; Antisense;
> 239224-GCGGAGGGCTGTTCATGTTGCATAT
> 239225:>probe:Xenopus_laevis:Xl.22404.1.S1_at:105:267;
> Interrogation_Position=44; Antisense;
> 239226-CGGAGGGCTGTTCATGTTGCATATT
> 239227:>probe:Xenopus_laevis:Xl.22404.1.S1_at:395:517;
> Interrogation_Position=45; Antisense;
> 239228-GGAGGGCTGTTCATGTTGCATATTG
> 239229:>probe:Xenopus_laevis:Xl.22404.1.S1_at:128:71;
> Interrogation_Position=47; Antisense;
> 239230-AGGGCTGTTCATGTTGCATATTGGA
> 239231:>probe:Xenopus_laevis:Xl.22404.1.S1_at:233:489;
> Interrogation_Position=48; Antisense;
> 239232-GGGCTGTTCATGTTGCATATTGGAA
> 239233:>probe:Xenopus_laevis:Xl.22404.1.S1_at:239:301;
> Interrogation_Position=50; Antisense;
> 239234-GCTGTTCATGTTGCATATTGGAATT
> 239235:>probe:Xenopus_laevis:Xl.22404.1.S1_at:389:445;
> Interrogation_Position=85; Antisense;
> 239236-GTAGGGTGCATAATATGCACAAGTC
> 239237->probe:Xenopus_laevis:Xl.22710.1.S1_at:596:377;
> Interrogation_Position=1190; Antisense;
>
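
The probeset name is the third colon-separated field of each FASTA header 
in the output above, so the same check can be done in R instead of grep. 
A minimal sketch, using two header lines copied from that output (with the 
grep line-number prefixes removed and the wrapped lines rejoined) rather 
than the real file. Loading the file with readLines() is an assumption 
about how you would read it.

```r
# Two header lines copied from the grep output above
hdr <- c(
  ">probe:Xenopus_laevis:Xl.22404.1.S1_at:212:39; Interrogation_Position=150; Antisense;",
  ">probe:Xenopus_laevis:Xl.22710.1.S1_at:596:377; Interrogation_Position=1190; Antisense;"
)
# With the real file (assumed usage):
#   lines <- readLines("Xenopus_laevis.probe_fasta")
#   hdr   <- lines[startsWith(lines, ">")]

# Split each header on ":" and keep the third field, the probeset name
ids <- vapply(strsplit(hdr, ":", fixed = TRUE), `[`, character(1), 3)
unique(ids)
```

The resulting vector can then be compared against ls(xlaevis2cdf) with 
%in%, the same way as in the check at the top of this message.
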
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


