[BioC] Yeast 2.0 Arrays - alternative ways

Fri Jul 9 16:43:54 CEST 2010

Hi Mike,

On 7/9/2010 8:20 AM, Mike Walter wrote:
> Hi Jim,
>
> I thought of two workarounds to mask the S. pombe probes, with the code
> attached below. However, in both attempts R crashes when I call the
> rma() function, either with the NAs introduced in the AffyBatch or with
> the subset option. I guess this would also happen if I used the mask
> file to generate the CEL file and set rm.mask=T in the ReadAffy()
> call? Do you (or anyone else) have any explanation for the breakdown
> of R? Btw, mas5() also gives an error on option 1 due to NAs.
>
> Kind regards,
>
> Mike
>
> Code for 1st Option:
>
> library(affy)
> celfile = "X:\\affy\\array\\2010\\E10R027"
> filenames = list.files(path=celfile)
> filenames = filenames[grep(".CEL", filenames)]
> filenames
>
> d1 = ReadAffy(celfile.path=celfile, filenames=filenames)
> msk = read.table("F:/Auswertung/Array Annotationen/Affy/s_cerevisiae.msk",
>                  header=F, skip=2, row.names=1)
> pombe = c(as.vector(unlist(pmindex(d1)[rownames(msk)])),
>           as.vector(unlist(mmindex(d1)[rownames(msk)])))
> exprs(d1)[pombe,] = NA
> apply(pm(d1), 2, function(x) sum(!is.na(x))) # drops from 120855 to 65552 features
> d1n = rma(d1)
>
> #R crashes completely with following error information:
>
> AppName: rgui.exe     AppVer: 2.100.50208.0     ModName:
> preprocesscore.dll ModVer: 0.0.0.0     Offset: 00017253

Yeah, that won't work. You can't have NA values, which is why the 
rm.mask argument exists for ReadAffy, to remove the NA values prior to 
running rma().
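
A quick guard along these lines can confirm whether any NAs are left before the data go anywhere near rma(). This is only a sketch on a plain matrix standing in for exprs(d1); the intensities and column names are made up, and the names intens, na_per_array and has_na are just for illustration:

```r
# Toy intensity matrix standing in for exprs(d1); values are made up.
intens <- matrix(c(120, 95, NA, 210, 180, 160), nrow = 3,
                 dimnames = list(NULL, c("a.CEL", "b.CEL")))

# Count missing intensities per array (one column per CEL file).
na_per_array <- colSums(is.na(intens))

# TRUE if at least one array still contains an NA.
has_na <- any(na_per_array > 0)
if (has_na) message("NA intensities present; do not pass this on to rma()")
```

With a real AffyBatch the same check would be run on exprs(d1) instead of the toy matrix.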

>
> Code for 2nd Option:
>
> d2 = ReadAffy(celfile.path=celfile, filenames=filenames)
> msk.sp = read.table("F:/Auswertung/Array Annotationen/Affy/s_pombe.msk",
>                     header=F, skip=2, row.names=1)
> head(msk.sp)
>
> d2n = rma(d2, subset=rownames(msk.sp))

I wonder if some of the things in the mask file don't actually exist on 
the chip? If I try this using the Dilution data, there are no problems:

 > rma(Dilution, subset = featureNames(Dilution)[1:250])
Background correcting
Normalizing
Calculating Expression
ExpressionSet (storageMode: lockedEnvironment)
assayData: 250 features, 4 samples
   element names: exprs
protocolData: none
phenoData
   sampleNames: 20A, 20B, 10A, 10B
   varLabels and varMetadata description:
     liver: amount of liver RNA hybridized to array in micrograms
     sn19: amount of central nervous system RNA hybridized to array in micrograms
     scanner: ID number of scanner used
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu95av2

However, if I add some random fake probeset ID, I get a segfault as well:

 > rma(Dilution, subset = c(featureNames(Dilution)[1:250], "10432_s_at"))
Background correcting

Process C:/R-patched/bin/rterm.exe exited abnormally with code 5 at Fri Jul 09 10:40:12 2010

So I would first check that all the row names of your msk.sp data.frame 
are actually featureNames of your AffyBatch:

all(rownames(msk.sp) %in% featureNames(d2))

and if not, drop the offending row names before passing the rest to the 
subset argument.
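
The check and the filtering step could look roughly like this. To keep the sketch runnable without affy or your CEL files, it uses two made-up character vectors (chip_ids and mask_ids, both hypothetical) in place of featureNames(d2) and rownames(msk.sp):

```r
# Stand-ins so the sketch runs on its own; the IDs are made up.
# With real data: chip_ids <- featureNames(d2); mask_ids <- rownames(msk.sp)
chip_ids <- c("probe_A_at", "probe_B_at", "probe_C_at")
mask_ids <- c("probe_A_at", "probe_C_at", "10432_s_at")

# Which mask IDs actually exist on the chip?
on_chip <- mask_ids %in% chip_ids

# IDs that are not on the chip; these are the ones to drop.
missing_ids <- mask_ids[!on_chip]

# IDs that are safe to hand to the subset argument.
valid_ids <- mask_ids[on_chip]

# With the real objects, the call would then be:
# d2n <- rma(d2, subset = valid_ids)
```

Printing missing_ids first is worthwhile, since it shows exactly which entries of the mask file do not match the CDF.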

Best,

Jim

>
> #again R crashes completely!!!
>
> AppName: rgui.exe     AppVer: 2.101.50720.0     ModName:
> preprocesscore.dll ModVer: 0.0.0.0     Offset: 00017253
>
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
> [5] LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] yeast2cdf_2.5.0 affy_1.24.2     Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.10.1
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826