[BioC] HuGene as exon array (was: xps rma() with HuGene-1_0-st-v1 on 64-bit architecture)

Tim Rayner tfrayner at gmail.com
Thu Feb 26 10:47:04 CET 2009


Hi Christian,

Thanks very much indeed. I can confirm that xps 1.2.6 certainly works
as advertised, and produces output comparable with what I've been
getting with APT. I'll let you know if I encounter any problems, but
so far I've not found any!

Cheers,

Tim
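
For anyone adapting the scheme-creation call quoted below: a wrong or
missing path is easy to overlook with the r4 library files, so it may help
to check the four input files before importing. This is only a sketch in
base R. check_scheme_files is a hypothetical helper, not part of xps, and
libdir/anndir are assumed to point at directories laid out as in
Christian's example.

```r
# Hypothetical helper (NOT part of xps): build the four input paths used in
# the quoted import.exon.scheme() call and warn about any that do not exist.
# Directory and file names are copied from the quoted example; libdir and
# anndir are placeholders supplied by the caller.
check_scheme_files <- function(libdir, anndir) {
  libsub <- "HuGene-1_0-st-v1.r4.analysis-lib-files"
  annsub <- "Version09Feb"
  files <- c(
    layoutfile = file.path(libdir, libsub, "HuGene-1_0-st-v1.r4.clf"),
    schemefile = file.path(libdir, libsub, "HuGene-1_0-st-v1.r4.pgf"),
    probeset   = file.path(anndir, annsub,
                           "HuGene-1_0-st-v1.na27.2.hg18.probeset.csv"),
    transcript = file.path(anndir, annsub,
                           "HuGene-1_0-st-v1.na27.hg18.transcript.csv")
  )
  # file.exists() is vectorised, so this keeps only the missing paths
  missing <- files[!file.exists(files)]
  if (length(missing) > 0) {
    warning("Missing input files: ", paste(missing, collapse = ", "))
  }
  files
}
```

The returned named vector can then be passed on, e.g.
layoutfile=files["layoutfile"], once no warning is raised.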

2009/2/24 cstrato <cstrato at aon.at>:
> Dear Tim,
>
> I am glad to inform you that a new version of xps is now available from BioC
> (xps_1.2.6 and xps_1.3.6), and I would very much appreciate it if you could
> test the new version.
>
> Please note that release 4 (r4) of the HuGene array converts it to an exon
> array, so you need to create the scheme as follows:
>
> xps.scheme <- import.exon.scheme("Scheme_HuGene10stv1r4_na27_2", filedir=scmdir,
>     layoutfile=paste(libdir, "HuGene-1_0-st-v1.r4.analysis-lib-files/HuGene-1_0-st-v1.r4.clf", sep="/"),
>     schemefile=paste(libdir, "HuGene-1_0-st-v1.r4.analysis-lib-files/HuGene-1_0-st-v1.r4.pgf", sep="/"),
>     probeset=paste(anndir, "Version09Feb/HuGene-1_0-st-v1.na27.2.hg18.probeset.csv", sep="/"),
>     transcript=paste(anndir, "Version09Feb/HuGene-1_0-st-v1.na27.hg18.transcript.csv", sep="/"))
>
>
> If you summarize the data at the transcript level, you should get the same
> results as before:
>
> xps.rma <- rma(xps.cel, "HuGeneMixRMAcore", background="antigenomic",
>              option="transcript", exonlevel="core+affx")
>
>
> In addition, you can now summarize the data at the probeset (exon) level:
>
> xps.rma.ps <- rma(xps.cel, "HuGeneMixRMAcorePS", background="antigenomic",
>                 option="probeset", exonlevel="core+affx")
>
>
> Please let me know if the new version works as expected.
>
> Best regards
> Christian
>
>
> Tim Rayner wrote:
>>
>> Dear Christian,
>>
>> Thank you very much for your help - reverting to the older r3 files
>> does indeed solve the problem. I'll look forward to hearing about the
>> new version of the xps package, and I'd be more than happy to help
>> test it if needed.
>>
>> Best regards,
>>
>> Tim
>>
>> 2009/2/17 cstrato <cstrato at aon.at>:
>>
>>>
>>> Dear Tim,
>>>
>>> First, I am glad to hear that my package works on 64-bit OS w/o problems.
>>>
>>> Luckily, the solution to your problem is simple. Please use the following
>>> pgf and clf files in your code to create xps.scheme:
>>> - HuGene-1_0-st-v1.r3.clf
>>> - HuGene-1_0-st-v1.r3.pgf
>>>
>>> The reason is as follows:
>>> About two weeks ago Affymetrix updated the pgf file to allow customers
>>> to use HuGene as a cheaper exon array. For this purpose, they created
>>> an additional "HuGene-1_0-st-v1.na27.hg18.probeset.csv" file and
>>> changed the probesets in the *.pgf file. Instead of "transcript_cluster_id",
>>> the probes are now mapped to the "probeset_id" of the new probeset
>>> annotation file. For this reason xps recognizes only the 57 affx-controls
>>> when parsing the *.pgf file, and thus only these 57 controls are summarized.
>>>
>>> I am currently updating my package to allow HuGene arrays to be used as
>>> exon arrays, and I will inform you once I have uploaded the new
>>> version. Until then I must ask you to use the older *.r3.pgf file.
>>>
>>> Best regards
>>> Christian
>>> _._._._._._._._._._._._._._._._._._
>>> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
>>> V.i.e.n.n.a           A.u.s.t.r.i.a
>>> e.m.a.i.l:        cstrato at aon.at
>>> _._._._._._._._._._._._._._._._._._
>>>
>>>
>>> Tim Rayner wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I'm seeing what appears to be odd behaviour from the xps rma() method
>>>> when trying to summarize a small test dataset from the
>>>> HuGene-1_0-st-v1 array. The oddness is that whatever options I pass to
>>>> rma(), I only ever get summary data for 57 probe sets back (obviously
>>>> I'd expect rather more than that).
>>>>
>>>> I'm using 64-bit Mac OSX, and I believe I've installed everything
>>>> correctly and imported the probe annotation from the latest chip
>>>> library files on Affy's web site. I did have to compile ROOT from
>>>> source to support the 64-bit architecture, but that went pretty
>>>> smoothly. After some hours of poking through the xps code I'm a little
>>>> suspicious about the probe masking, but not much wiser, I'm afraid.
>>>>
>>>> I should just briefly mention that I can run rma over the same data
>>>> set by using the oligo package, so I think the data files are fine.
>>>>
>>>> Attached is a sample session, which I've just run from scratch to
>>>> confirm the problem, and my sessionInfo. I'm wondering if anyone else
>>>> has seen this, or if I've just made some fundamental error.
>>>>
>>>> Many thanks,
>>>>
>>>> Tim Rayner
>>>>
>>>>
>>>>
>>>> #############################################
>>>> ## sessionInfo():
>>>>
>>>> > sessionInfo()
>>>> R version 2.8.1 Patched (2009-01-19 r47650)
>>>> i386-apple-darwin9.6.0
>>>>
>>>> locale:
>>>> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] tools     stats     graphics  grDevices utils     datasets  methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>> [1] Biobase_2.2.2 xps_1.2.5
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] tcltk_2.8.1
>>>>
>>>>
>>>>
>>>> ##############################################
>>>> ## Session commands:
>>>> library('xps')
>>>> celdir=getwd()
>>>> celfiles=list.files(pattern='\\.CEL$')
>>>> libdir <- '/Users/tfr23/Documents/resources/HuGene-1_0/'
>>>> xps.scheme <- import.genome.scheme(filename='HuGene-1_0-st-v1-r4',
>>>>                                  filedir=libdir,
>>>>                                  layoutfile=paste(libdir,
>>>>                                    'HuGene-1_0-st-v1.r4.clf',
>>>>                                    sep=''),
>>>>                                  schemefile=paste(libdir,
>>>>                                    'HuGene-1_0-st-v1.r4.pgf',
>>>>                                    sep=''),
>>>>                                  transcript=paste(libdir,
>>>>                                    'HuGene-1_0-st-v1.na27.hg18.transcript.csv',
>>>>                                    sep=''),
>>>>                                  verbose=TRUE)
>>>>
>>>> xps.cel<-import.data(xps.scheme, 'HuGeneCelData', celdir=celdir,
>>>> celfiles=celfiles)
>>>>
>>>> xps.cel<-attachInten(xps.cel)
>>>>
>>>> xps.rma <- rma(xps.cel,
>>>>              filename='HuGeneMixRMAMetacore',
>>>>              exonlevel='metacore+affx',
>>>>              background='antigenomic',
>>>>              normalize=TRUE)
>>>>
>>>> ######################################
>>>> ## Session output:
>>>>
>>>> Welcome to xps version 1.2.5
>>>>   an R wrapper for XPS - eXpression Profiling System
>>>>   (c) Copyright 2001-2009 by Christian Stratowa
>>>>
>>>> Creating new file
>>>>
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>...
>>>> Importing
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.clf>
>>>> as <HuGene-1_0-st-v1.cxy>...
>>>>  <1102500> records imported...Finished
>>>> New dataset <HuGene-1_0-st-v1> is added to Content...
>>>> Importing
>>>>
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.na27.hg18.transcript.csv>
>>>> as <HuGene-1_0-st-v1.ann>...
>>>>  Number of transcripts is <33297>.
>>>>  <33297> records read...Finished
>>>>  <33297> records imported...Finished
>>>> Importing
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.pgf>
>>>> as <HuGene-1_0-st-v1.scm>...
>>>>  Reading data from input file...
>>>>  Number of probesets is <257430>.
>>>> Note: Number of annotated probesets <33297> is not equal to number of
>>>> probesets <257430>.
>>>>  <257430> records read...Finished
>>>>  Sorting data for probeset_type and position...
>>>>  Total number of controls is <4371>
>>>>  Note: no data for probeset type: control->chip...
>>>>  Filling trees with data for probeset type: normgene, rescue...
>>>>  Filling trees with data for probeset type: control->bgp...
>>>>  Filling trees with data for probeset type: control->affx...
>>>>  <33252> probeset tree entries read...Finished
>>>>  Number of control->affx probesets is <57>.
>>>>  Filling trees with data for probeset type: main...
>>>>  Filling trees with data for non-annotated probesets...
>>>>  <861493> records imported...Finished
>>>>  <257430> total transcript units imported.
>>>>  Genome cell statistics:
>>>>     Number of unit cells: minimum = 1,  maximum = 1189
>>>> Opening file
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>>> in <READ> mode...
>>>> Creating new file
>>>> </Users/tfr23/Documents/affytest/HuGeneCelData_cel.root>...
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020206A CD8 -
>>>> 090213.CEL> as <Affy 0104 - 020206A CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     1 cells with minimal intensity 23
>>>>     1 cells with maximal intensity 35735
>>>> New dataset <DataSet> is added to Content...
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020305 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 020305 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 20
>>>>     1 cells with maximal intensity 24768
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 030804 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 030804 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     6 cells with minimal intensity 25
>>>>     1 cells with maximal intensity 38526
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 040107 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 040107 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 22
>>>>     1 cells with maximal intensity 20150
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 061004 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 061004 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 20
>>>>     1 cells with maximal intensity 21650
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 070205 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 070205 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 21
>>>>     1 cells with maximal intensity 23005
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 090305 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 090305 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     22 cells with minimal intensity 21
>>>>     1 cells with maximal intensity 21205
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 110806B CD8 -
>>>> 090213.CEL> as <Affy 0104 - 110806B CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     1 cells with minimal intensity 21
>>>>     1 cells with maximal intensity 22958
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150107 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 150107 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 19
>>>>     1 cells with maximal intensity 23606
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150405 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 150405 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     4 cells with minimal intensity 24
>>>>     1 cells with maximal intensity 24268
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 190706 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 190706 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     6 cells with minimal intensity 21
>>>>     1 cells with maximal intensity 22769
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 300605 CD8 -
>>>> 090213.CEL> as <Affy 0104 - 300605 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     2 cells with minimal intensity 20
>>>>     1 cells with maximal intensity 22309
>>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 -040205 CD8 -
>>>> 090213.CEL> as <Affy 0104 -040205 CD8 - 090213.cel>...
>>>>  hybridization statistics:
>>>>     1 cells with minimal intensity 23
>>>>     1 cells with maximal intensity 22497
>>>> Creating new file
>>>> </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>...
>>>> Opening file
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>>> in <READ> mode...
>>>> Preprocessing data using method <preprocess>...
>>>>  Background correcting raw data...
>>>>     setting selector mask for typepm <8252>
>>>>     calculating background for <Affy 0104 - 020206A CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        1378 cells with maximal intensity 151.284
>>>>     calculating background for <Affy 0104 - 020305 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        2 cells with maximal intensity 75.9992
>>>>     calculating background for <Affy 0104 - 030804 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        28 cells with maximal intensity 122.454
>>>>     calculating background for <Affy 0104 - 040107 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        13 cells with maximal intensity 154.02
>>>>     calculating background for <Affy 0104 - 061004 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        47 cells with maximal intensity 101.165
>>>>     calculating background for <Affy 0104 - 070205 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        25 cells with maximal intensity 94.408
>>>>     calculating background for <Affy 0104 - 090305 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        220 cells with maximal intensity 52.9483
>>>>     calculating background for <Affy 0104 - 110806B CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        97 cells with maximal intensity 136.739
>>>>     calculating background for <Affy 0104 - 150107 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        1055 cells with maximal intensity 105.265
>>>>     calculating background for <Affy 0104 - 150405 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        36 cells with maximal intensity 128.385
>>>>     calculating background for <Affy 0104 - 190706 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        957 cells with maximal intensity 135.396
>>>>     calculating background for <Affy 0104 - 300605 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        865 cells with maximal intensity 49.4309
>>>>     calculating background for <Affy 0104 -040205 CD8 - 090213.cel>...
>>>>     background statistics:
>>>>        1097995 cells with minimal intensity 0
>>>>        650 cells with maximal intensity 140.053
>>>>  Normalizing raw data...
>>>>     normalizing data using method <quantile>...
>>>>     setting selector mask for typepm <8252>
>>>>        finished filling <13> arrays.
>>>>        finished filling <13> trees.
>>>>  Converting raw data to expression levels...
>>>>     summarizing with <medianpolish>...
>>>>     setting selector mask for typepm <8252>
>>>>     setting selector mask for typepm <8252>
>>>>     calculating expression for <57> of <257430> units...Finished.
>>>>     expression statistics:
>>>>        minimal expression level is <19.8498>
>>>>        maximal expression level is <8953.24>
>>>>  preprocessing finished.
>>>> Opening file
>>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>>> in <READ> mode...
>>>> Opening file </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>
>>>> in <READ> mode...
>>>> Exporting data from tree <*> to file
>>>> </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.txt>...
>>>> Reading entries from <HuGene-1_0-st-v1.ann> ...Finished
>>>> <57> of <57> records exported.
>>>>
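
A quick way to catch the symptom reported at the start of this thread
(only 57 units summarized) is to count the rows of the exported text file.
The following is a sketch in base R only. check_export_rows and its
min_rows threshold are hypothetical and not part of xps, and it assumes
the export is tab-delimited with a single header row.

```r
# Hypothetical helper (NOT part of xps): count the data rows in an exported
# expression table and warn if there are implausibly few of them.
# Assumption: the file is tab-delimited with one header line.
check_export_rows <- function(txtfile, min_rows) {
  tab <- read.delim(txtfile, stringsAsFactors = FALSE)
  n <- nrow(tab)
  ok <- n >= min_rows
  if (!ok) {
    warning("Only ", n, " rows in ", txtfile,
            "; expected at least ", min_rows,
            ". Check that the pgf/clf release matches the scheme import.")
  }
  invisible(list(rows = n, ok = ok))
}
```

For example, check_export_rows("HuGeneMixRMAMetacore.txt", min_rows = 1000)
would warn for the 57-row export shown in the log above; the threshold of
1000 is arbitrary and should be set to whatever is plausible for the chosen
exonlevel.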


