[BioC] fRMA- more issues with custom CDFs

Cornwell, Adam Adam_Cornwell at URMC.Rochester.edu
Tue Apr 15 22:20:30 CEST 2014

I'd like to get to the bottom of this, since the issue persists and it would be great to be able to use fRMA with the MBNI CDFs.
I haven't tried it on a Linux system yet, so I can do that next, but do you have any other suggestions for troubleshooting? I certainly hope it's not an issue between the 32-bit and 64-bit versions of R.

Adam Cornwell

-----Original Message-----
From: Matthew McCall [mailto:mccallm at gmail.com] 
Sent: Tuesday, February 18, 2014 6:53 PM
To: Cornwell, Adam
Cc: bioconductor at r-project.org
Subject: Re: [BioC] fRMA- more issues with custom CDFs


It seems to work fine for me (see below). Not sure what the issue is; the only difference I see between our sessionInfo() outputs is 64-bit vs. 32-bit.
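
For comparing two sessionInfo() listings more systematically than by eye, a minimal sketch in base R follows. The helper name diff_versions is hypothetical, and the version strings are copied by hand from the two listings in this thread (only a handful of packages are included). Whether any of the reported differences matters for the error is not established here.

```r
# Hypothetical helper: report packages whose versions differ between two
# named character vectors (names = package, values = version string).
diff_versions <- function(a, b) {
  pkgs <- sort(union(names(a), names(b)))
  va <- unname(a[pkgs])   # NA where the package is absent from `a`
  vb <- unname(b[pkgs])   # NA where the package is absent from `b`
  differs <- is.na(va) | is.na(vb) | va != vb
  out <- data.frame(package = pkgs, first = va, second = vb,
                    stringsAsFactors = FALSE)[differs, , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Versions transcribed from the two sessions quoted in this thread.
linux <- c(affy = "1.40.0", frma = "1.14.0",
           hgu133plus2hsentrezgcdf = "16.0.0", oligo = "1.26.0")
windows <- c(affy = "1.40.0", frma = "1.14.0",
             hgu133plus2hsentrezgcdf = "16.0.0", oligo = "1.26.2",
             affyPLM = "1.38.0")

delta <- diff_versions(linux, windows)
print(delta)
```

In a live session, the input vectors could be built from the sessionInfo() object itself, for example with sapply(sessionInfo()$otherPkgs, function(p) p$Version).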

I am a few versions behind on the alternative CDFs -- currently all the frmavecs use entrezg v16. But you seem to have figured this out based on your sessionInfo().

As to your bigger question, using the alternative CDFs doesn't violate any assumptions in fRMA, and a fair number of other people use them together. I do have to make a separate set of frozen parameter vectors for these CDFs, and since they are updated fairly regularly, I am often a version or two behind. But that's no reason not to use them (and do bug me when I fall too far behind).

> library(affy)
> library(hgu133plus2hsentrezgcdf)
Loading required package: AnnotationDbi
> tst=ReadAffy(filenames=dir()[5:7],cdfname="hgu133plus2hsentrezgcdf")
> library(frma)
> tst2=frma(tst)
> tst3=frma(tst,summarize="random_effect")
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i686-pc-linux-gnu (32-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] hgu133plus2frmavecs_1.3.0      frma_1.14.0
[3] hgu133plus2hsentrezgcdf_16.0.0 AnnotationDbi_1.24.0
[5] affy_1.40.0                    Biobase_2.22.0
[7] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] affxparser_1.34.0     affyio_1.30.0         BiocInstaller_1.12.0
 [4] Biostrings_2.30.1     bit_1.1-11            codetools_0.2-8
 [7] DBI_0.2-7             ff_2.2-12             foreach_1.4.1
[10] GenomicRanges_1.14.4  IRanges_1.20.6        iterators_1.0.6
[13] MASS_7.3-29           oligo_1.26.0          oligoClasses_1.24.0
[16] preprocessCore_1.24.0 RSQLite_0.11.4        splines_3.0.2
[19] stats4_3.0.2          tools_3.0.2           XVector_0.2.0
[22] zlibbioc_1.8.0
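
For reference, the "row.effects should sum to zero" message quoted below can be illustrated with a toy example in base R. This is only a sketch under stated assumptions: it assumes the frozen probe effects are centered within each probeset, and it uses made-up numbers and made-up probeset labels rather than anything read from the frmavecs package. It shows one way the zero-sum condition could fail (a probe-to-probeset mapping that differs from the one the effects were estimated under); it is not a diagnosis of the problem in this thread.

```r
# Toy probe effects for 6 probes (made-up numbers, not real fRMA vectors).
raw <- c(0.4, -1.1, 0.9, 0.3, -0.2, 0.5)

# Hypothetical probe-to-probeset mapping under one CDF version.
map_a <- c("g1", "g1", "g1", "g2", "g2", "g2")

# Center within each probeset so that the effects sum to zero per probeset
# (assumed here to mirror how frozen probe effects are constrained).
effects <- raw - ave(raw, map_a)

# Hypothetical mapping under another CDF version: probe 3 moves to g2.
map_b <- c("g1", "g1", "g2", "g2", "g2", "g2")

tol <- 1e-8
sums_a <- tapply(effects, map_a, sum)  # per-probeset sums, matching mapping
sums_b <- tapply(effects, map_b, sum)  # per-probeset sums, mismatched mapping

ok_a <- all(abs(sums_a) < tol)  # condition holds under the matching mapping
ok_b <- all(abs(sums_b) < tol)  # condition fails once a probe is regrouped
print(c(matching = ok_a, mismatched = ok_b))
```

The same tapply() check could be applied to real probe effects and probeset labels if both are available as parallel vectors; how to extract them from a given frmavecs package is not covered here.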

On Tue, Feb 18, 2014 at 3:52 PM, Cornwell, Adam <Adam_Cornwell at urmc.rochester.edu> wrote:
> Hello,
> I've previously used fRMA and the MBNI BrainArray CDFs with some success, but the last time I tried was a year ago. I just updated to R 3.0.2 and the newest version of BioC. I'm trying to use fRMA with some U133+ 2.0 arrays, and I'm getting errors with custom CDFs. It works fine with the stock CDFs. I've tried both robust_weighted_average and random_effect for summarization. I've also tried BrainArray CDF versions from 15 to 18 (all EntrezGene). The error differs between random_effect and robust_weighted_average.
> With robust_weighted_average I get "Error in rcModelWPLM(y = x1, w = w.tmp, row.effects = pe.tmp, input.scale = x4) :
>   row.effects should sum to zero".
> Aside from the convenience of summarizing straight to the gene level, would you actually recommend using the BrainArray CDFs with fRMA? It seems like the combination of data-driven probe weighting and probe selection based on re-annotation would produce some nice results, provided that using the custom CDFs doesn't break any assumptions made in fRMA.
> Thanks for the help again.
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> attached base packages:
> [1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] hgu133plus2cdf_2.13.0          hgu133plus2frmavecs_1.3.0      hgu133plus2hsentrezgcdf_16.0.0 heatmap.plus_1.3               frma_1.14.0
>  [6] GGally_0.4.5                   reshape_0.8.4                  plyr_1.8                       RColorBrewer_1.0-5             ggplot2_0.9.3.1
> [11] xlsx_0.5.5                     xlsxjars_0.5.0                 rJava_0.9-6                    annotate_1.40.0                org.Hs.eg.db_2.10.1
> [16] puma_3.4.0                     mclust_4.2                     VennDiagram_1.6.5              scatterplot3d_0.3-35           annaffy_1.34.0
> [21] KEGG.db_2.10.1                 GO.db_2.10.1                   RSQLite_0.11.4                 DBI_0.2-7                      AnnotationDbi_1.24.0
> [26] gplots_2.12.1                  MASS_7.3-29                    affyPLM_1.38.0                 preprocessCore_1.24.0          simpleaffy_2.38.0
> [31] gcrma_2.34.0                   genefilter_1.44.0              marray_1.40.0                  limma_3.18.12                  BiocInstaller_1.12.0
> [36] affy_1.40.0                    Biobase_2.22.0                 BiocGenerics_0.8.0
> loaded via a namespace (and not attached):
> [1] affxparser_1.34.0    affyio_1.30.0        Biostrings_2.30.1    bit_1.1-11           bitops_1.0-6         caTools_1.16         codetools_0.2-8
>  [8] colorspace_1.2-4     dichromat_2.0-0      digest_0.6.4         ff_2.2-12            foreach_1.4.1        gdata_2.13.2         GenomicRanges_1.14.4
> [15] gtable_0.1.2         gtools_3.3.0         IRanges_1.20.6       iterators_1.0.6      KernSmooth_2.23-10   labeling_0.2         munsell_0.4.2
> [22] oligo_1.26.2         oligoClasses_1.24.0  proto_0.3-10         reshape2_1.2.2       scales_0.2.3         splines_3.0.2        stats4_3.0.2
> [29] stringr_0.6.2        survival_2.37-7      tools_3.0.2          XML_3.98-1.1         xtable_1.7-1         XVector_0.2.0        zlibbioc_1.8.0
> Adam Cornwell
> Programmer/Analyst

Matthew N McCall, PhD
112 Arvine Heights
Rochester, NY 14611
Cell: 202-222-5880
