[BioC] affy code archeology: expresso behavior for liwong/invariantset

David Eby [guest] guest at bioconductor.org
Thu Jan 24 22:41:22 CET 2013


Hello,

I am in the midst of updating some inherited legacy code from R 2.8 and affy_1.20.2 to R 2.15.2 and affy_1.36.0, trying to fix some longstanding bugs.

Everything is going well except for some very different results coming out of an expresso call in one of my regression tests, specifically from:
	eset <- expresso(afbatch, normalize.method="invariantset", bg.correct=FALSE, pmcorrect.method="pmonly", summary.method="liwong", verbose=TRUE)

It's reporting expression values approx. 0.4-0.7x the values in the earlier setup (examples below).  I take for granted that many things have changed both at the R level and at the package level given the large leap in versions, but the difference seemed a bit odd since the results for other methods (RMA, GCRMA, MAS5) remained reasonably consistent in the update.

It's quite possible that there is a package conflict or that I've accidentally broken something in the code base, though the legacy code itself is largely unchanged and the other regression tests seem to hold up pretty well.  I have looked at a number of things already, but before heading further down that path I wanted to check whether the implementation itself might have changed in a way where these differences would be expected.  Has there been a major change to liwong/invariantset somewhere along the way between 1.20.2 and 1.36.0?

Checking with other members of the team, we're actually completely OK with the differences if they are in line with known changes to the affy package and can be explained to our users (this is a GenePattern module).

Here is some example output for reference.  This is for a cut-down data set; I've seen similar results with 20 samples.  I can provide more if it would be helpful.
From the original setup (using write.table(exprs(eset))):
"CL20030502207AA.CEL" "CL20030502208AA.CEL" "CL20030502307AA.CEL" "CL20030502308AA.CEL"
"1007_s_at" 228.013212425507 214.877883873475 287.677963272274 306.997621485651
"1053_at" 193.206766296132 168.017787218035 169.430151901596 157.819950438341
<...snip...>

From the new setup:
"CL20030502207AA.CEL" "CL20030502208AA.CEL" "CL20030502307AA.CEL" "CL20030502308AA.CEL"
"1007_s_at" 500.169414439674 461.734001304198 704.700664579873 735.43106514776
"1053_at" 321.921661422307 251.464504650157 261.491260842992 228.793486504634
<...snip...>
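
In case it helps anyone reproduce the comparison, here is a minimal sketch of how the two excerpts above can be compared per probe set and per array. It uses only base R and the eight values shown in each excerpt; the variable names (listed_original, listed_new, ratio) are just for illustration and are not part of the module. On the full data the two matrices would instead come from exprs(eset) in each setup.

```r
# Values copied from the two write.table() excerpts above (first two probe sets only).
samples <- c("CL20030502207AA.CEL", "CL20030502208AA.CEL",
             "CL20030502307AA.CEL", "CL20030502308AA.CEL")
probes <- c("1007_s_at", "1053_at")

# Block labelled "From the original setup"
listed_original <- matrix(
  c(228.013212425507, 214.877883873475, 287.677963272274, 306.997621485651,
    193.206766296132, 168.017787218035, 169.430151901596, 157.819950438341),
  nrow = 2, byrow = TRUE, dimnames = list(probes, samples))

# Block labelled "From the new setup"
listed_new <- matrix(
  c(500.169414439674, 461.734001304198, 704.700664579873, 735.43106514776,
    321.921661422307, 251.464504650157, 261.491260842992, 228.793486504634),
  nrow = 2, byrow = TRUE, dimnames = list(probes, samples))

# Element-wise ratio between the two listed blocks, per probe set and per array.
ratio <- listed_new / listed_original
print(round(ratio, 2))

# Per-array median of the ratios, to see whether the shift is uniform across probe sets.
print(round(apply(ratio, 2, median), 2))
```

With only two probe sets this says little on its own; the same two lines applied to the full expression matrices would show whether the shift is roughly constant per array or varies by probe set.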

Thanks in advance!
Regards,
David

-- 
David Eby
Consultant
Cancer Informatics Development
Broad Institute of MIT and Harvard
7 Cambridge Center, Cambridge, MA 02142, USA
http://www.broadinstitute.org/cancer

 -- output of sessionInfo(): 

R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets  base     

other attached packages:
 [1] makecdfenv_1.36.0     gcrma_2.30.0          Biostrings_2.26.2    
 [4] affy_1.36.0           preprocessCore_1.20.0 affyio_1.26.0        
 [7] zlibbioc_1.4.0        AnnotationDbi_1.20.1  Biobase_2.18.0       
[10] IRanges_1.16.2        RSQLite_0.11.2        DBI_0.2-5            
[13] BiocGenerics_0.4.0    spatial_7.3-5         rpart_3.1-55         
[16] nnet_7.3-5            nlme_3.1-105          mgcv_1.7-21          
[19] Matrix_1.0-9          MASS_7.3-22           lattice_0.20-10      
[22] KernSmooth_2.23-8     foreign_0.8-51        cluster_1.14.3       
[25] class_7.3-5           boot_1.3-7           

loaded via a namespace (and not attached):
[1] BiocInstaller_1.8.3 grid_2.15.2         parallel_2.15.2    
[4] splines_2.15.2      stats4_2.15.2      

hgu133acdf_2.10.0 is not shown above but was also loaded at a later point.

--
Sent via the guest posting facility at bioconductor.org.


