[BioC] Yeast 2.0 Arrays - alternative ways

Fri Jul 9 14:20:02 CEST 2010

Hi Jim,

I thought of two workarounds to mask the S. pombe probes; the code for both is attached below. However, in both attempts R crashes when I call the rma() function, either with the NAs introduced into the AffyBatch or with the subset option. I guess this would also happen if I used the mask file when generating the CEL files and set rm.mask=T in the ReadAffy() call? Do you (or anyone else) have an explanation for why R crashes? Btw, mas5() also gives an error on option 1 because of the NAs.

Kind regards, 

Mike

Code for 1st Option:

library(affy)
celfile = "X:\\affy\\array\\2010\\E10R027"
filenames = list.files(path=celfile)
# keep only the CEL files; fixed=TRUE matches ".CEL" literally instead of as a regex
filenames = filenames[grep(".CEL", filenames, fixed=TRUE)]
filenames

d1 = ReadAffy(celfile.path=celfile, filenames=filenames)
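# probesets listed in the mask file (first column becomes the row names)
msk = read.table("F:/Auswertung/Array Annotationen/Affy/s_cerevisiae.msk", header=FALSE, skip=2, row.names=1)
# PM and MM probe indices of the masked probesets
pombe = c(as.vector(unlist(pmindex(d1)[rownames(msk)])),
          as.vector(unlist(mmindex(d1)[rownames(msk)])))
# set their intensities to NA on all arrays
exprs(d1)[pombe,] = NA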
apply(pm(d1), 2, function(x) sum(!is.na(x)))
#drops from 120855 to 65552 features
d1n = rma(d1) 

#R crashes completely with following error information:

AppName: rgui.exe     AppVer: 2.100.50208.0     ModName: preprocesscore.dll
ModVer: 0.0.0.0     Offset: 00017253
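
The masking step of option 1 can be tried out on a toy matrix without any CEL files or the affy package. This is only a sketch: the matrix `intens` stands in for exprs(d1), and the lists `pm_idx` and `mm_idx` are hypothetical stand-ins for the named index lists that pmindex() and mmindex() are assumed to return. The probeset names are made up.

```r
# Toy stand-in for exprs(d1): 8 probes (rows) x 2 arrays (columns).
set.seed(1)
intens <- matrix(runif(16, 50, 500), nrow = 8,
                 dimnames = list(NULL, c("chipA", "chipB")))

# Hypothetical probe-index lists keyed by probeset name (assumed to have
# the same shape as the results of pmindex() and mmindex()).
pm_idx <- list(cer_1 = c(1, 2), pom_1 = c(3, 4))
mm_idx <- list(cer_1 = c(5, 6), pom_1 = c(7, 8))

# Probeset names to mask, as they would come from rownames(msk).
mask_ids <- "pom_1"

# Collect the PM and MM row indices of the masked probesets.
masked <- c(unlist(pm_idx[mask_ids], use.names = FALSE),
            unlist(mm_idx[mask_ids], use.names = FALSE))

# Blank those rows on every array.
intens[masked, ] <- NA

# Non-NA probes remaining per array.
remaining <- colSums(!is.na(intens))
print(remaining)
```

The dimensions of `intens` are unchanged after masking; only the count of non-NA rows drops, which is why dim() alone does not show the effect of a mask.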

Code for 2nd Option: 

d2 = ReadAffy(celfile.path=celfile, filenames=filenames)
msk.sp = read.table("F:/Auswertung/Array Annotationen/Affy/s_pombe.msk", header=FALSE, skip=2, row.names=1)
head(msk.sp)

d2n = rma(d2, subset=rownames(msk.sp))

#again R crashes completely!!!

AppName: rgui.exe     AppVer: 2.101.50720.0     ModName: preprocesscore.dll
ModVer: 0.0.0.0     Offset: 00017253
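
Both options rely on read.table(..., skip=2, row.names=1) turning the mask file into a vector of probeset IDs. A minimal sketch of that step with made-up content follows. The two header lines and the tab-separated layout are assumptions for illustration only and may not match a real .msk file. The probeset names and `all_ids` are hypothetical as well.

```r
# Hypothetical mask-file content: two header lines, then one probeset
# per line with a second column. The real .msk layout may differ.
msk_text <- c("header line 1",
              "header line 2",
              "pom_1_at\t1-11",
              "pom_2_at\t1-11",
              "pom_3_at\t1-11")

# Same read.table() arguments as in the code above, reading from text
# instead of a file: skip the two header lines, use column 1 as row names.
msk <- read.table(text = msk_text, header = FALSE, skip = 2,
                  row.names = 1, stringsAsFactors = FALSE)
mask_ids <- rownames(msk)
print(mask_ids)

# Hypothetical full set of probeset IDs on the array.
all_ids <- c("cer_1_at", "pom_1_at", "cer_2_at", "pom_2_at", "pom_3_at")

# IDs that are not listed in the mask file.
keep_ids <- setdiff(all_ids, mask_ids)
print(keep_ids)
```

Printing the IDs (or head(msk.sp) as above) before passing them on is a cheap check that the header lines were skipped correctly and that the names match the probeset names on the array.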

> sessionInfo()
R version 2.10.1 (2009-12-14) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] yeast2cdf_2.5.0 affy_1.24.2     Biobase_2.6.1  

loaded via a namespace (and not attached):
[1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.10.1      

-- 

MFT Services
University of Tübingen
Calwerstr. 7
72076  Tübingen/GERMANY

Tel.: +49 7071 29 83210
Fax. + 49 7071 29 5228
web: www.mft-services.de

-----Original Message-----
From: "James W. MacDonald" <jmacdon at med.umich.edu>
Sent: 08.07.2010 15:43:26
To: Mike Walter <michael_walter at email.de>
Subject: Re: [BioC] Yeast 2.0 Arrays

>Hi Mike,
>
>On 7/8/2010 8:40 AM, Mike Walter wrote:
>> Hi there,
>>
>> I have a question on Affymetrix Yeast 2.0 arrays. These arrays
>> contain about 5000 probesets for budding and fission yeast,
>> respectively. We hybridized budding yeast sample and now I'd like to
>> get rid of all the fission yeast probes before normalization. There
>> is a mask file outlining which probesets belong to which organism.
>> However, when I use ReadAffy(..., rm.mask=T) the number of probes
>> does not change (s. below). Is there a way to specify a mask file
>> during the affybatch import?
>
>I assume you are using the mask file when scanning the chip?
>
>If so, the probes for fission yeast should all be NA, which your test 
>below will not detect (having a bunch of NA probes won't affect the 
>dimension of your pm matrices). Something like
>
>pms <- pm(d1, LISTRUE = TRUE)
>
>length(pms) - sum(sapply(pms, function(x) all(is.na(x))))
>
>or now that I think about it,
>
>apply(pm(d1), 2, function(x) sum(!is.na(x)))
>
>should give you a measure of how many non-NA probes remain on each array.
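
The two checks quoted above count different things, which a toy example makes visible. This is a sketch only: the list `pms` is a hypothetical stand-in for what pm(d1, LISTRUE = TRUE) is assumed to return (one probes-by-arrays matrix per probeset), and the values are made up.

```r
# Hypothetical stand-in for pm(d1, LISTRUE = TRUE): one matrix of PM
# intensities (probes x arrays) per probeset.
pms <- list(
  cer_1 = matrix(c(10, 12, 11, 13), nrow = 2),
  pom_1 = matrix(NA_real_, nrow = 2, ncol = 2),
  cer_2 = matrix(c(20, NA, 21, 22), nrow = 2)
)

# First check: probesets that still have at least one non-NA probe.
n_probesets_left <- length(pms) -
  sum(sapply(pms, function(x) all(is.na(x))))

# Second check: non-NA probes per array on the stacked probe-level matrix.
pm_all <- do.call(rbind, pms)
n_probes_left <- apply(pm_all, 2, function(x) sum(!is.na(x)))

print(n_probesets_left)
print(n_probes_left)
```

The first number is a probeset count for the whole batch, the second a probe count per array, so the two are not expected to agree.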
>
>Best,
>
>Jim
>
>
>>
>> Any hints are welcome.
>>
>> Mike
>>
>>> d1<- ReadAffy(celfile.path=celfile, filenames=filenames, rm.mask=F)
>>> d2<- ReadAffy(celfile.path=celfile, filenames=filenames, rm.mask=T)
>>> dim(pm(d1))
>> [1] 120855      5
>>> dim(pm(d2))
>> [1] 120855      5
>>> sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>>
>> locale:
>> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>> [5] LC_TIME=German_Germany.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] yeast2cdf_2.5.0      affyQCReport_1.24.0  lattice_0.18-3
>>  [4] RColorBrewer_1.0-2   affyPLM_1.22.0       preprocessCore_1.8.0
>>  [7] xtable_1.5-6         simpleaffy_2.22.0    gcrma_2.18.1
>> [10] genefilter_1.28.2    affy_1.24.2          Biobase_2.6.1
>>
>> loaded via a namespace (and not attached):
>>  [1] affyio_1.14.0       annotate_1.24.1     AnnotationDbi_1.8.2
>>  [4] Biostrings_2.14.12  DBI_0.2-5           grid_2.10.1
>>  [7] IRanges_1.4.16      RSQLite_0.9-1       splines_2.10.1
>> [10] survival_2.35-8     tools_2.10.1
>>>
>>
>
>-- 
>James W. MacDonald, M.S.
>Biostatistician
>Douglas Lab
>University of Michigan
>Department of Human Genetics
>5912 Buhl
>1241 E. Catherine St.
>Ann Arbor MI 48109-5618
>734-615-7826