[BioC] xps rma() with HuGene-1_0-st-v1 on 64-bit architecture

cstrato cstrato at aon.at
Tue Feb 17 20:05:42 CET 2009


Dear Tim,

First, I am glad to hear that my package works on a 64-bit OS without
problems.

Luckily, the solution to your problem is simple. Please use the 
following pgf and clf files in your code to create xps.scheme:
- HuGene-1_0-st-v1.r3.clf
- HuGene-1_0-st-v1.r3.pgf
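
A minimal sketch of the adjusted call, assuming the r3 files are stored
in the same libdir as in your session and that you keep the same
transcript annotation file. The scheme name 'HuGene-1_0-st-v1-r3' is
only an example, and the guard around the import is my addition so that
the snippet does nothing when xps or a file is missing:

```r
# Sketch only: the r3 file names come from this message; libdir and the
# transcript annotation file are taken from the quoted session.
libdir <- "/Users/tfr23/Documents/resources/HuGene-1_0"

scheme_files <- c(
  layoutfile = file.path(libdir, "HuGene-1_0-st-v1.r3.clf"),
  schemefile = file.path(libdir, "HuGene-1_0-st-v1.r3.pgf"),
  transcript = file.path(libdir, "HuGene-1_0-st-v1.na27.hg18.transcript.csv")
)

# Only build the scheme when xps is installed and all three files exist.
if (requireNamespace("xps", quietly = TRUE) && all(file.exists(scheme_files))) {
  library(xps)
  xps.scheme <- import.genome.scheme(
    filename   = "HuGene-1_0-st-v1-r3",  # example name for the ROOT scheme file
    filedir    = libdir,
    layoutfile = scheme_files[["layoutfile"]],
    schemefile = scheme_files[["schemefile"]],
    transcript = scheme_files[["transcript"]],
    verbose    = TRUE
  )
} else {
  message("Skipping import: xps is not installed or a library file is missing.")
}
```

The rest of your session (import.data, attachInten, rma) should not need
any changes once xps.scheme has been created from the r3 files.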

The reason is as follows:
About two weeks ago Affymetrix updated the pgf file to allow
customers to use HuGene as a cheaper exon array. For this purpose, they
created an additional "HuGene-1_0-st-v1.na27.hg18.probeset.csv"
file and changed the probesets in the *.pgf file. Instead of
"transcript_cluster_id", the probes are now mapped to the "probeset_id"
of the new probeset annotation file. As a result, xps recognizes only
the 57 affx controls when parsing the new *.pgf file, and thus only
these 57 controls are summarized.
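
If you want to check the mismatch yourself, one option is to compare the
top-level ids of a *.pgf file with the ids in the annotation file. The
sketch below is not part of xps: the helper names and the toy input are
made up for illustration, and it assumes the usual pgf layout (header
lines starting with "#", probeset lines without a leading tab, and the
id in the first tab-separated field):

```r
# Hypothetical helper: extract the top-level (probeset) ids from pgf lines.
# Assumes header lines start with "#" and that atom/probe lines are
# indented with one or two tabs, so only unindented lines are probesets.
pgf_probeset_ids <- function(pgf_lines) {
  body <- pgf_lines[nzchar(pgf_lines) & !grepl("^#", pgf_lines)]
  top  <- body[!grepl("^\t", body)]
  vapply(strsplit(top, "\t", fixed = TRUE), `[`, character(1), 1)
}

# Hypothetical helper: count how many pgf ids occur among the annotation
# ids. Pass the annotation ids as character strings.
count_matches <- function(pgf_lines, annotation_ids) {
  ids <- pgf_probeset_ids(pgf_lines)
  c(probesets = length(ids),
    annotated = sum(ids %in% as.character(annotation_ids)))
}

# Toy input, invented for illustration only (not real HuGene ids).
toy_pgf <- c(
  "#%chip_type=Toy",
  "100\tmain\t",
  "\t1",
  "\t\t11\tpm:st\t12\t25\t1\tACGT",
  "200\tmain\t",
  "\t2",
  "\t\t21\tpm:st\t12\t25\t1\tACGT"
)

count_matches(toy_pgf, annotation_ids = c("100", "999"))
```

On real files you would pass readLines() of the *.pgf file and the
"transcript_cluster_id" column of the transcript annotation. If the
explanation above is right, I would expect the overlap to be small for
the new pgf file and close to complete for the r3 file.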

I am currently updating my package to allow using HuGene arrays as exon
arrays, and I will inform you once I have uploaded the new version.
Until then I must ask you to use the older *.r3.pgf file.

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


Tim Rayner wrote:
> Hi,
>
> I'm seeing what appears to be odd behaviour from the xps rma() method
> when trying to summarize a small test dataset from the
> HuGene-1_0-st-v1 array. The oddness is that whatever options I pass to
> rma(), I only ever get summary data for 57 probe sets back (obviously
> I'd expect rather more than that).
>
> I'm using 64-bit Mac OSX, and I believe I've installed everything
> correctly and imported the probe annotation from the latest chip
> library files on Affy's web site. I did have to compile ROOT from
> source to support the 64-bit architecture, but that went pretty
> smoothly. After some hours of poking through the xps code I'm a little
> suspicious about the probe masking, but not much wiser, I'm afraid.
>
> I should just briefly mention that I can run rma over the same data
> set by using the oligo package, so I think the data files are fine.
>
> Attached is a sample session, which I've just run from scratch to
> confirm the problem, and my sessionInfo. I'm wondering if anyone else
> has seen this, or if I've just made some fundamental error.
>
> Many thanks,
>
> Tim Rayner
>
>
>
> #############################################
> ## sessionInfo():
>
> > sessionInfo()
> R version 2.8.1 Patched (2009-01-19 r47650)
> i386-apple-darwin9.6.0
>
> locale:
> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] Biobase_2.2.2 xps_1.2.5
>
> loaded via a namespace (and not attached):
> [1] tcltk_2.8.1
>
>
>
> ##############################################
> ## Session commands:
> library('xps')
> celdir=getwd()
> celfiles=list.files(pattern='.*.CEL')
> libdir <- '/Users/tfr23/Documents/resources/HuGene-1_0/'
> xps.scheme <- import.genome.scheme(filename='HuGene-1_0-st-v1-r4',
>                                    filedir=libdir,
>                                    layoutfile=paste(libdir,
>                                      'HuGene-1_0-st-v1.r4.clf',
>                                      sep=''),
>                                    schemefile=paste(libdir,
>                                      'HuGene-1_0-st-v1.r4.pgf',
>                                      sep=''),
>                                    transcript=paste(libdir,
>                                      'HuGene-1_0-st-v1.na27.hg18.transcript.csv',
>                                      sep=''),
>                                    verbose=TRUE)
>
> xps.cel<-import.data(xps.scheme, 'HuGeneCelData', celdir=celdir,
> celfiles=celfiles)
>
> xps.cel<-attachInten(xps.cel)
>
> xps.rma <- rma(xps.cel,
>                filename='HuGeneMixRMAMetacore',
>                exonlevel='metacore+affx',
>                background='antigenomic',
>                normalize=TRUE)
>
> ######################################
> ## Session output:
>
> Welcome to xps version 1.2.5
>     an R wrapper for XPS - eXpression Profiling System
>     (c) Copyright 2001-2009 by Christian Stratowa
>
> Creating new file
> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>...
> Importing </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.clf>
> as <HuGene-1_0-st-v1.cxy>...
>    <1102500> records imported...Finished
> New dataset <HuGene-1_0-st-v1> is added to Content...
> Importing </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.na27.hg18.transcript.csv>
> as <HuGene-1_0-st-v1.ann>...
>    Number of transcripts is <33297>.
>    <33297> records read...Finished
>    <33297> records imported...Finished
> Importing </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.pgf>
> as <HuGene-1_0-st-v1.scm>...
>    Reading data from input file...
>    Number of probesets is <257430>.
> Note: Number of annotated probesets <33297> is not equal to number of
> probesets <257430>.
>    <257430> records read...Finished
>    Sorting data for probeset_type and position...
>    Total number of controls is <4371>
>    Note: no data for probeset type: control->chip...
>    Filling trees with data for probeset type: normgene, rescue...
>    Filling trees with data for probeset type: control->bgp...
>    Filling trees with data for probeset type: control->affx...
>    <33252> probeset tree entries read...Finished
>    Number of control->affx probesets is <57>.
>    Filling trees with data for probeset type: main...
>    Filling trees with data for non-annotated probesets...
>    <861493> records imported...Finished
>    <257430> total transcript units imported.
>    Genome cell statistics:
>       Number of unit cells: minimum = 1,  maximum = 1189
> Opening file </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
> in <READ> mode...
> Creating new file </Users/tfr23/Documents/affytest/HuGeneCelData_cel.root>...
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020206A CD8 -
> 090213.CEL> as <Affy 0104 - 020206A CD8 - 090213.cel>...
>    hybridization statistics:
>       1 cells with minimal intensity 23
>       1 cells with maximal intensity 35735
> New dataset <DataSet> is added to Content...
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020305 CD8 -
> 090213.CEL> as <Affy 0104 - 020305 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 20
>       1 cells with maximal intensity 24768
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 030804 CD8 -
> 090213.CEL> as <Affy 0104 - 030804 CD8 - 090213.cel>...
>    hybridization statistics:
>       6 cells with minimal intensity 25
>       1 cells with maximal intensity 38526
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 040107 CD8 -
> 090213.CEL> as <Affy 0104 - 040107 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 22
>       1 cells with maximal intensity 20150
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 061004 CD8 -
> 090213.CEL> as <Affy 0104 - 061004 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 20
>       1 cells with maximal intensity 21650
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 070205 CD8 -
> 090213.CEL> as <Affy 0104 - 070205 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 21
>       1 cells with maximal intensity 23005
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 090305 CD8 -
> 090213.CEL> as <Affy 0104 - 090305 CD8 - 090213.cel>...
>    hybridization statistics:
>       22 cells with minimal intensity 21
>       1 cells with maximal intensity 21205
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 110806B CD8 -
> 090213.CEL> as <Affy 0104 - 110806B CD8 - 090213.cel>...
>    hybridization statistics:
>       1 cells with minimal intensity 21
>       1 cells with maximal intensity 22958
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150107 CD8 -
> 090213.CEL> as <Affy 0104 - 150107 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 19
>       1 cells with maximal intensity 23606
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150405 CD8 -
> 090213.CEL> as <Affy 0104 - 150405 CD8 - 090213.cel>...
>    hybridization statistics:
>       4 cells with minimal intensity 24
>       1 cells with maximal intensity 24268
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 190706 CD8 -
> 090213.CEL> as <Affy 0104 - 190706 CD8 - 090213.cel>...
>    hybridization statistics:
>       6 cells with minimal intensity 21
>       1 cells with maximal intensity 22769
> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 300605 CD8 -
> 090213.CEL> as <Affy 0104 - 300605 CD8 - 090213.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 20
>       1 cells with maximal intensity 22309
> Importing </Users/tfr23/Documents/affytest/Affy 0104 -040205 CD8 -
> 090213.CEL> as <Affy 0104 -040205 CD8 - 090213.cel>...
>    hybridization statistics:
>       1 cells with minimal intensity 23
>       1 cells with maximal intensity 22497
> Creating new file </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>...
> Opening file </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
> in <READ> mode...
> Preprocessing data using method <preprocess>...
>    Background correcting raw data...
>       setting selector mask for typepm <8252>
>       calculating background for <Affy 0104 - 020206A CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          1378 cells with maximal intensity 151.284
>       calculating background for <Affy 0104 - 020305 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          2 cells with maximal intensity 75.9992
>       calculating background for <Affy 0104 - 030804 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          28 cells with maximal intensity 122.454
>       calculating background for <Affy 0104 - 040107 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          13 cells with maximal intensity 154.02
>       calculating background for <Affy 0104 - 061004 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          47 cells with maximal intensity 101.165
>       calculating background for <Affy 0104 - 070205 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          25 cells with maximal intensity 94.408
>       calculating background for <Affy 0104 - 090305 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          220 cells with maximal intensity 52.9483
>       calculating background for <Affy 0104 - 110806B CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          97 cells with maximal intensity 136.739
>       calculating background for <Affy 0104 - 150107 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          1055 cells with maximal intensity 105.265
>       calculating background for <Affy 0104 - 150405 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          36 cells with maximal intensity 128.385
>       calculating background for <Affy 0104 - 190706 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          957 cells with maximal intensity 135.396
>       calculating background for <Affy 0104 - 300605 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          865 cells with maximal intensity 49.4309
>       calculating background for <Affy 0104 -040205 CD8 - 090213.cel>...
>       background statistics:
>          1097995 cells with minimal intensity 0
>          650 cells with maximal intensity 140.053
>    Normalizing raw data...
>       normalizing data using method <quantile>...
>       setting selector mask for typepm <8252>
>          finished filling <13> arrays.
>          finished filling <13> trees.
>    Converting raw data to expression levels...
>       summarizing with <medianpolish>...
>       setting selector mask for typepm <8252>
>       setting selector mask for typepm <8252>
>       calculating expression for <57> of <257430> units...Finished.
>       expression statistics:
>          minimal expression level is <19.8498>
>          maximal expression level is <8953.24>
>    preprocessing finished.
> Opening file </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
> in <READ> mode...
> Opening file </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>
> in <READ> mode...
> Exporting data from tree <*> to file
> </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.txt>...
> Reading entries from <HuGene-1_0-st-v1.ann> ...Finished
> <57> of <57> records exported.
>


