[BioC] Yeast 2.0 Arrays - alternative ways
Mike Walter
michael_walter at email.de
Fri Jul 9 14:20:02 CEST 2010
Hi Jim,
I thought of two workarounds to mask the S. pombe probes; the code is attached below. However, in both attempts R crashes when I call the rma() function, either with the NAs introduced in the AffyBatch or with the subset option. I guess this would also happen if I used the mask file when generating the CEL file and set rm.mask=T in the ReadAffy() call? Do you (or anyone else) have any explanation for why R crashes? Btw, mas5() also gives an error on option 1 due to the NAs.
Kind regards,
Mike
Code for 1st Option:
library(affy)
celfile = "X:\\affy\\array\\2010\\E10R027"
filenames = list.files(path=celfile)
filenames = filenames[grep(".CEL", filenames)]
filenames
d1 = ReadAffy(celfile.path=celfile, filenames=filenames)
msk=read.table("F:/Auswertung/Array Annotationen/Affy/s_cerevisiae.msk", header=F, skip=2, row.names=1)
pombe = c(as.vector(unlist(pmindex(d1)[rownames(msk)])),
as.vector(unlist(mmindex(d1)[rownames(msk)])))
exprs(d1)[pombe,]=NA
apply(pm(d1), 2, function(x) sum(!is.na(x)))
#drops from 120855 to 65552 features
d1n = rma(d1)
#R crashes completely with the following error information:
AppName: rgui.exe AppVer: 2.100.50208.0 ModName: preprocesscore.dll
ModVer: 0.0.0.0 Offset: 00017253
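To show what the masking step in option 1 does without needing the CEL files, here is a minimal base-R sketch. The matrix `toy` and the index vector `masked` are made-up stand-ins for exprs(d1) and the `pombe` index vector; they are assumptions for illustration, not data from the arrays.

```r
# Toy stand-in for exprs(d1): 6 probes (rows) x 2 arrays (columns); values are made up.
set.seed(1)
toy <- matrix(runif(12, 50, 500), nrow = 6,
              dimnames = list(NULL, c("arrayA", "arrayB")))

# Hypothetical row indices of the probes to mask (plays the role of 'pombe').
masked <- c(2, 5, 6)
toy[masked, ] <- NA

# The dimensions are unchanged by the masking ...
dim(toy)

# ... but the number of non-NA probes per array drops.
remaining <- apply(toy, 2, function(x) sum(!is.na(x)))
remaining
```

This is why dim(pm(...)) cannot reveal whether probes were masked, while the per-array count of non-NA values can.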
Code for 2nd Option:
d2 = ReadAffy(celfile.path=celfile, filenames=filenames)
msk.sp=read.table("F:/Auswertung/Array Annotationen/Affy/s_pombe.msk", header=F, skip=2, row.names=1)
head(msk.sp)
d2n = rma(d2, subset=rownames(msk.sp))
#again R crashes completely!!!
AppName: rgui.exe AppVer: 2.101.50720.0 ModName: preprocesscore.dll
ModVer: 0.0.0.0 Offset: 00017253
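One sanity check that might be worth running before passing IDs to the subset option is whether every ID from the mask file actually matches a probeset name. The sketch below uses base R only; `cdf_ids` and `mask_ids` are hypothetical vectors standing in for the probeset names of the AffyBatch and for rownames(msk.sp). Whether unmatched IDs have anything to do with the crash is an open question, not something established here.

```r
# Hypothetical probeset names, standing in for the names known to the AffyBatch.
cdf_ids  <- c("ps_001_at", "ps_002_at", "ps_003_at", "ps_004_at")

# Hypothetical IDs read from a mask file, standing in for rownames(msk.sp).
mask_ids <- c("ps_002_at", "ps_004_at", "ps_999_at")

# IDs from the mask file that have no match among the probeset names.
unmatched <- setdiff(mask_ids, cdf_ids)
unmatched

# Keep only the IDs that really exist before using them as a subset.
safe_ids <- intersect(mask_ids, cdf_ids)
safe_ids
```

With the real objects, the probeset names could presumably be taken from featureNames(d2), assuming that accessor behaves for an AffyBatch as it does for other Biobase objects.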
> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] yeast2cdf_2.5.0 affy_1.24.2 Biobase_2.6.1
loaded via a namespace (and not attached):
[1] affyio_1.14.0 preprocessCore_1.8.0 tools_2.10.1
--
MFT Services
University of Tübingen
Calwerstr. 7
72076 Tübingen/GERMANY
Tel.: +49 7071 29 83210
Fax. + 49 7071 29 5228
web: www.mft-services.de
-----Original Message-----
From: "James W. MacDonald" <jmacdon at med.umich.edu>
Sent: 08.07.2010 15:43:26
To: Mike Walter <michael_walter at email.de>
Subject: Re: [BioC] Yeast 2.0 Arrays
>Hi Mike,
>
>On 7/8/2010 8:40 AM, Mike Walter wrote:
>> Hi there,
>>
>> I have a question on Affymetrix Yeast 2.0 arrays. These arrays
>> contain about 5000 probesets for budding and fission yeast,
>> respectively. We hybridized budding yeast samples and now I'd like to
>> get rid of all the fission yeast probes before normalization. There
>> is a mask file outlining which probesets belong to which organism.
>> However, when I use ReadAffy(..., rm.mask=T) the number of probes
>> does not change (see below). Is there a way to specify a mask file
>> during the affybatch import?
>
>I assume you are using the mask file when scanning the chip?
>
>If so, the probes for fission yeast should all be NA, which your test
>below will not detect (having a bunch of NA probes won't affect the
>dimension of your pm matrices). Something like
>
>pms <- pm(d1, LISTRUE = TRUE)
>
>length(pms) - sum(sapply(pms, function(x) all(is.na(x))))
>
>or now that I think about it,
>
>apply(pm(d1), 2, function(x) sum(!is.na(x)))
>
>should give you a measure of how many non-NA probes remain.
>
>Best,
>
>Jim
>
>
>>
>> Any hints are welcome.
>>
>> Mike
>>
>>> d1<- ReadAffy(celfile.path=celfile, filenames=filenames, rm.mask=F)
>>> d2<- ReadAffy(celfile.path=celfile, filenames=filenames, rm.mask=T)
>>> dim(pm(d1))
>> [1] 120855 5
>>> dim(pm(d2))
>> [1] 120855 5
>>> sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>>
>> locale:
>> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>> [5] LC_TIME=German_Germany.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] yeast2cdf_2.5.0 affyQCReport_1.24.0 lattice_0.18-3
>> [4] RColorBrewer_1.0-2 affyPLM_1.22.0 preprocessCore_1.8.0
>> [7] xtable_1.5-6 simpleaffy_2.22.0 gcrma_2.18.1
>> [10] genefilter_1.28.2 affy_1.24.2 Biobase_2.6.1
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.14.0 annotate_1.24.1 AnnotationDbi_1.8.2
>> [4] Biostrings_2.14.12 DBI_0.2-5 grid_2.10.1
>> [7] IRanges_1.4.16 RSQLite_0.9-1 splines_2.10.1
>> [10] survival_2.35-8 tools_2.10.1
>>>
>>
>
>--
>James W. MacDonald, M.S.
>Biostatistician
>Douglas Lab
>University of Michigan
>Department of Human Genetics
>5912 Buhl
>1241 E. Catherine St.
>Ann Arbor MI 48109-5618
>734-615-7826