[BioC] Updated affy package? ReadAffy/ AffyBatch has changed and breaks my pipeline!

Mon Feb 6 13:43:43 CET 2012

Hi

I have made a pipeline for automating analysis of custom affymetrix
chips for a colleague.  After an R-update, the pipeline no longer
runs.  Working backwards from where the error occurs, I have found
that the structure produced by ReadAffy() is no longer the same.  I am
using exactly the same data set.  (The error itself is a failed assign()
due to a locked binding, but I don't think that's related.)
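
For reference, this kind of assign() failure can be reproduced in base R
without any of the packages involved.  The sketch below is only an
illustration: the environment 'e' and the variable 'x' are hypothetical
names, and it says nothing about where the binding in my pipeline gets
locked.

```r
# Minimal base-R illustration; 'e' and 'x' are hypothetical names.
e <- new.env()
assign("x", 1, envir = e)
lockBinding("x", e)

# assign() on a locked binding signals an error; capture its message.
msg <- tryCatch({
  assign("x", 2, envir = e)
  "no error"
}, error = function(err) conditionMessage(err))
print(msg)

# Check the lock and, for an environment you control, release it.
print(bindingIsLocked("x", e))
unlockBinding("x", e)
assign("x", 2, envir = e)
print(get("x", envir = e))
```

bindingIsLocked() can be pointed at the environment the pipeline
assigns into, to see whether the lock is already there before the
failing call.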

Below are the lists of packages installed before and after the update,
with version numbers.  The 'old' R version is 2.13.2, the 'new' one is
2.14.1.

OldPackVers <- structure(c("1.30.0", "1.20.0", "1.28.5", "1.14.1", "1.30.0",
"0.0.1", "0.0.1", "2.13.2", "2.12.2", "2.20.4", "1.0-4.1", "1.3-2",
"1.12", "7.3-3", "1.14.1", "0.2-8", "2.13.2", "1.6.2", "2.13.2",
"0.2-5", "1.30.0", "1.6", "1.2.8", "0.8-48", "2.24.1", "2.8.2",
"1.2.5", "2.10.1", "2.13.2", "2.13.2", "2.13.2", "2.6.2", "1.10.6",
"2.23-6", "0.19-33", "3.8.3", "1.1.7", "1.30.0", "1.30.0", "7.3-14",
"1.0-3", "2.13.2", "2.10.0", "1.7-11", "3.1-103", "7.3-1", "1.14.0",
"3.1-51", "0.11.1", "7.3-3", "2.13.2", "2.13.2", "2.13.2", "2.36-10",
"2.13.2", "1.24.0", "1.30.0", "2.13.2", "2.13.2", "1.30.0"), .Names = c("affy",
"affyio", "affyPLM", "AnnotationDbi", "asprgdtua520520fcdf",
"asprgdtua520520fprobe", "bac01a520746fprobe", "base", "Biobase",
"Biostrings", "bitops", "boot", "caTools", "class", "cluster",
"codetools", "compiler", "corpcor", "datasets", "DBI", "DynDoc",
"e1071", "fdrtool", "foreign", "gcrma", "gdata", "GeneNet", "gplots",
"graphics", "grDevices", "grid", "gtools", "IRanges", "KernSmooth",
"lattice", "limma", "longitudinal", "makecdfenv", "marray", "MASS",
"Matrix", "methods", "Mfuzz", "mgcv", "nlme", "nnet", "preprocessCore",
"rpart", "RSQLite", "spatial", "splines", "stats", "stats4",
"survival", "tcltk", "timecourse", "tkWidgets", "tools", "utils",
"widgetTools"))

NewPackVers <- structure(c("1.32.1", "1.22.0", "1.30.0", "1.16.11", "1.32.0",
"0.0.1", "2.14.1", "2.14.0", "1.2.1", "2.22.0", "1.0-4.1", "1.3-4",
"1.12", "7.3-3", "1.14.1", "0.2-8", "2.14.1", "1.6.2", "2.14.1",
"0.2-5", "1.32.0", "1.6", "1.2.8", "0.8-48", "2.26.0", "2.8.2",
"1.2.5", "2.10.1", "2.14.1", "2.14.1", "2.14.1", "2.6.2", "1.12.5",
"2.23-7", "0.20-0", "3.10.2", "1.1.7", "1.32.0", "1.32.0", "7.3-16",
"1.0-3", "2.14.1", "2.12.0", "1.7-13", "3.1-103", "7.3-1", "2.14.1",
"1.16.0", "3.1-51", "0.11.1", "7.3-3", "2.14.1", "2.14.1", "2.14.1",
"2.36-10", "2.14.1", "1.26.0", "1.32.0", "2.14.1", "2.14.1",
"1.32.0", "1.0.0"), .Names = c("affy", "affyio", "affyPLM", "AnnotationDbi",
"asprgdtua520520fcdf", "asprgdtua520520fprobe", "base", "Biobase",
"BiocInstaller", "Biostrings", "bitops", "boot", "caTools", "class",
"cluster", "codetools", "compiler", "corpcor", "datasets", "DBI",
"DynDoc", "e1071", "fdrtool", "foreign", "gcrma", "gdata", "GeneNet",
"gplots", "graphics", "grDevices", "grid", "gtools", "IRanges",
"KernSmooth", "lattice", "limma", "longitudinal", "makecdfenv",
"marray", "MASS", "Matrix", "methods", "Mfuzz", "mgcv", "nlme",
"nnet", "parallel", "preprocessCore", "rpart", "RSQLite", "spatial",
"splines", "stats", "stats4", "survival", "tcltk", "timecourse",
"tkWidgets", "tools", "utils", "widgetTools", "zlibbioc"))

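The two vectors are easier to compare programmatically than by eye.
Below is a minimal sketch using only base R; compare_versions() is a
hypothetical helper written for this post, and 'old' and 'new' hold
just a handful of entries copied from the vectors above.

```r
# Hypothetical helper: compare two named character vectors of versions.
compare_versions <- function(old, new) {
  common  <- intersect(names(old), names(new))
  changed <- common[old[common] != new[common]]
  list(
    only_old = setdiff(names(old), names(new)),   # no longer installed
    only_new = setdiff(names(new), names(old)),   # newly installed
    changed  = data.frame(package = changed,
                          old = unname(old[changed]),
                          new = unname(new[changed]),
                          stringsAsFactors = FALSE)
  )
}

# A few entries copied from OldPackVers and NewPackVers, for illustration.
old <- c(affy = "1.30.0", affyio = "1.20.0", AnnotationDbi = "1.14.1",
         bac01a520746fprobe = "0.0.1", Biobase = "2.12.2",
         bitops = "1.0-4.1")
new <- c(affy = "1.32.1", affyio = "1.22.0", AnnotationDbi = "1.16.11",
         Biobase = "2.14.0", BiocInstaller = "1.2.1",
         bitops = "1.0-4.1")

res <- compare_versions(old, new)
print(res)
```

With the full vectors this becomes compare_versions(OldPackVers,
NewPackVers).
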
The differences in structure are as follows:

datOld <- ReadAffy(celfile.path=subdircel)
datOld
> AffyBatch object
> size of arrays=716x716 features (24 kb)
> cdf=AsprgDTUa520520F (11186 affyids)
> number of samples=27
> number of genes=11186
> annotation=asprgdtua520520f
> notes=

datNew <- ReadAffy(celfile.path=subdircel)
datNew
> AffyBatch object
> size of arrays=716x716 features (15 kb)
> cdf=AsprgDTUa520520F (11186 affyids)
> number of samples=27
> number of genes=11186
> annotation=asprgdtua520520f
> notes=

The particular slot that causes trouble is phenoData:
datOld@phenoData
> An object of class "AnnotatedDataFrame"
>  sampleNames: 1a_N6h_ t1.CEL, 1b_N6h_t2.CEL, ..., 9c_N192h_t3.CEL  (27 total)
>  varLabels and varMetadata description:
>    sample: arbitrary numbering

datNew@phenoData
> An object of class "AnnotatedDataFrame"
>  sampleNames: 1a_N6h_ t1.CEL 1b_N6h_t2.CEL ... 9c_N192h_t3.CEL (27 total)
>  varLabels: sample
>  varMetadata: labelDescription

Is this a likely culprit?  Another possibility is how get() behaves.
My function calls get() on the probe package name (the script
follows Gillespie's doi:10.1186/1756-0500-3-81).  In the first case
(the old installation), I get

get(probepackagename)
> Object of class probetable data.frame with 121507 rows and 6 columns.

And in the second case, I get:
> Object of class probetable data.frame with 468384 rows and 6 columns.
> First 3 rows are:
>                    sequence x y    Probe.Set.Name Probe.Interrogation.Position
> 1                    <seq1> 5 1           <name1>                           13
> 2                    <seq2> 6 1           <name2>                           13
> 3                    <seq3> 7 1           <name3>                           13
>   Target.Strandedness
> 1           Antisense
> 2           Antisense
> 3           Antisense

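Since get() takes a string and returns whatever object that name
resolves to on the search path, it helps to check both the name and
where it was found.  A minimal base-R sketch follows; "demoprobe" and
its small data frame are hypothetical stand-ins for the real probe
package name and probe table.

```r
# Hypothetical stand-in: in the pipeline the name is built earlier on.
probepackagename <- "demoprobe"
assign(probepackagename,
       data.frame(sequence = c("ACGT", "TGCA"),
                  x = c(5L, 6L), y = c(1L, 1L)),
       envir = globalenv())

# get() looks the string up along the search path and returns the object.
tab <- get(probepackagename)

# Report the name looked up, where it was found, and the table size.
where <- utils::find(probepackagename)
cat("name:", probepackagename, "\n")
cat("found in:", where, "\n")
cat("rows x cols:", nrow(tab), "x", ncol(tab), "\n")
```

Running the same three checks under both installations would show
whether the name, the providing package, or only the table contents
differ.
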
Basically, I would like to know what has changed between these
versions!  It could also be that 'probepackagename' (which is
generated earlier in the script) has changed.  That would be the
result of an update to makeProbePackage() in AnnotationDbi.

I'll keep looking for ways to get round it, but thanks in advance!

Regards,