[BioC] in-group missing arrayQualityMetrics()

Sat Feb 8 12:53:19 CET 2014

Dear Manjula

Thank you. If you can, please send me the GeneFeatureSet object anyway, so that we can figure out where in ‘arrayQualityMetrics’ the class-dependent dispatching goes wrong and causes the cryptic error that you saw, and so that we can provide a more useful error message instead.

The error with UseMethod("xmlAttrs", node) that you are seeing is most likely an unintended consequence of a version mismatch between the cairo system library on your machine and the SVGAnnotation package. It is a well-known deficiency with a colourful history, see e.g. https://stat.ethz.ch/pipermail/bioconductor/2011-October/041511.html
What do you get from 
   pkg-config --modversion cairo libxml-2.0
on the (system, not R) command line? Ideally an update will help.
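
To collect the same information from within R, together with the versions of the R packages involved, something like the following might do. It is only a sketch: the helper name reportVersions is made up for illustration, and it assumes that pkg-config is on the PATH (otherwise it just says so).

```r
# Hypothetical helper: report versions of the R packages and system
# libraries that are relevant to the SVG problem. Returns a list.
reportVersions <- function(pkgs = c("SVGAnnotation", "XML", "Cairo")) {
  # Only query packages that are actually installed
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  rpkgs <- vapply(pkgs[installed],
                  function(p) as.character(utils::packageVersion(p)),
                  character(1))
  # Ask pkg-config for the system library versions, if it is available
  syslibs <- if (nzchar(Sys.which("pkg-config"))) {
    system2("pkg-config", c("--modversion", "cairo", "libxml-2.0"),
            stdout = TRUE, stderr = TRUE)
  } else {
    "pkg-config not found on PATH"
  }
  list(R = rpkgs, system = syslibs)
}

str(reportVersions())
```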

Regarding the colours, please type
  arrayQualityMetrics:::intgroupColors
(in the R command line). You would have to modify this function (e.g. in the arrayQualityMetrics source package) to use a palette that has at least 12 colours. Any patches or suggestions here are welcome, and I’d be happy to insert them into future versions of the package. As a workaround, group the factor levels together so that there are <=9. With more levels than that, it may be hard to distinguish them in the plots anyway.
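
As an illustration of the workaround, here is a minimal sketch in base R that pools the least frequent levels of a factor into a single level, so that at most 9 remain. The function name collapseLevels, the label "other" and the example factor are all made up for this sketch; how the levels are best grouped depends on the experiment, and merging biologically related groups by hand may be more sensible than pooling by frequency.

```r
# Hypothetical 12-level grouping factor (stand-in for a phenoData column)
grp <- factor(rep(paste0("G", 1:12), times = 4))

# Keep the (maxLevels - 1) most frequent levels and pool all remaining
# levels into one level called `other`, so at most maxLevels remain.
collapseLevels <- function(f, maxLevels = 9, other = "other") {
  f <- factor(f)
  if (nlevels(f) <= maxLevels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(maxLevels - 1)]
  pooled <- ifelse(as.character(f) %in% keep, as.character(f), other)
  factor(pooled, levels = c(keep, other))
}

grp9 <- collapseLevels(grp)
nlevels(grp9)
```

The collapsed factor could then be stored as a new phenoData column (say, pData(x)$Group9 for an ExpressionSet x; the column name is again hypothetical) and its name passed as the intgroup argument.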

	Wolfgang

On 7 Feb 2014, at 21:58, Kasoji, Manjula (NIH/NCI) [C] <manjula.kasoji at nih.gov> wrote:

> 
> Dear Wolfgang,
> 
> I have figured out a workaround.
> My dat seems to be a GeneFeatureSet object:
> 
>> dat
> GeneFeatureSet (storageMode: lockedEnvironment)
> assayData: 1102500 features, 48 samples
>  element names: exprs
> protocolData
>  rowNames: 01_4T1_mouse1.CEL 02_4T1_mouse2.CEL ... 48_R3T_mouse4.CEL
>    (48 total)
>  varLabels: exprs dates
>  varMetadata: labelDescription channel
> phenoData
>  rowNames: 01_4T1_mouse1.CEL 02_4T1_mouse2.CEL ... 48_R3T_mouse4.CEL
>    (48 total)
>  varLabels: ChipType SampleID ... DatePrepared (13 total)
>  varMetadata: labelDescription channel
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: pd.mogene.1.0.st.v1
> 
> I think the arrayQualityMetrics() function takes in objects such as
> ExpressionSet and NChannelSet, etc.
> 
> 
> So I used the Biobase package to convert to an ExpressionSet like below:
> 
>> raw.exprs=exprs(dat)
>> minimalSet <- ExpressionSet(assayData=raw.exprs)
> 
> This seems to work now.
> 
> I do have another question. I have a factor (grouping) that has 12 levels.
> However only the first 9 are used. Is there a way to set the maximum to a
> number higher than 9. For example this is the error and warning messages I
> receive. Also, if you can explain what the error message means and how to
> fix it that would be helpful.
> 
> The directory 'QC_Normalized_SampleID' has been created.
> Error in UseMethod("xmlAttrs", node) :
>  no applicable method for 'xmlAttrs' applied to an object of class "NULL"
> In addition: Warning messages:
> 1: In maximumLevels(fac, n = length(colors)) :
>  A factor was provided with 12 levels, but only the first 9 were used for
> coloring.
> 2: In maximumLevels(fac, n = length(colors)) :
>  A factor was provided with 12 levels, but only the first 9 were used for
> coloring.
> KernSmooth 2.23 loaded
> Copyright M. P. Wand 1997-2009
> 
> 
> 
> Thank you so much!
> 
> 
> 
> On 2/6/14 6:27PM, "Wolfgang Huber" <whuber at embl.de> wrote:
> 
>> Dear Guest
>> 
>> Thank you for the report. The vignette of the package contains examples
>> where the 'intgroup' argument is used, and the package builds
>> (http://bioconductor.org/checkResults/release/bioc-LATEST/#A). Since the
>> only difference to your example is the dataset 'dat', we would need to
>> investigate that to understand how it could possibly create the problem
>> you report. Would it be possible for you to send me your object and the
>> code needed to reproduce the problem, starting from a fresh R session?
>> 
>> 	Best wishes
>> 	Wolfgang
>> 
>> 
>> 
>> On 5 Feb 2014, at 23:18, guest [guest] <guest at bioconductor.org> wrote:
>> 
>>> 
>>> Hi I'm trying to run arrayQualityMetrics and I keep getting an error
>>> message saying that the intgroup is missing. However, I am very clearly
>>> specifying the intgroup and it definitely exists in my phenodata. I'm
>>> working with mogene.1.0.st.v1 arrays. Please see my error message and a
>>> few lines of my phenodata below:
>>> 
>>> Error message:
>>> 
>>>> arrayQualityMetrics(expressionset = dat, intgroup="SampleID",outdir =
>>>> "QC_test", force =TRUE, do.logtransform = TRUE);
>>> The directory 'QC_test' has been created.
>>> Error in match(x, table, nomatch = 0L) :
>>> argument "intgroup" is missing, with no default
>>> 
>>> phenodata:
>>> 
>>>> pData(dat)
>>>                             ChipType SampleID MouseStrain TumorOrigin
>>> 01_4T1_mouse1.CEL     MoGene-1_0-st-v1    4T1_1      Balb/c Spontaneous
>>> 02_4T1_mouse2.CEL     MoGene-1_0-st-v1    4T1_2      Balb/c Spontaneous
>>> 03_4T1_mouse3.CEL     MoGene-1_0-st-v1    4T1_3      Balb/c Spontaneous
>>> 04_4T1_mouse4.CEL     MoGene-1_0-st-v1    4T1_4      Balb/c Spontaneous
>>> 05_EMT-6_mouse1.CEL   MoGene-1_0-st-v1   EMT6_1      Balb/c Spontaneous
>>> 06_EMT-6_mouse2.CEL   MoGene-1_0-st-v1   EMT6_2      Balb/c Spontaneous
>>> 
>>> 
>>> Any suggestions on how to fix this will be appreciated.
>>> 
>>> -- output of sessionInfo():
>>> 
>>>> sessionInfo()
>>> R version 3.0.2 (2013-09-25)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>> 
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>> 
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> 
>>> [8] base     
>>> 
>>> other attached packages:
>>> [1] affyPLM_1.38.0             preprocessCore_1.24.0
>>> [3] gcrma_2.34.0               affy_1.40.0
>>> [5] arrayQualityMetrics_3.18.0 pd.mogene.1.0.st.v1_3.8.0
>>> [7] RSQLite_0.11.4             DBI_0.2-7
>>> [9] oligo_1.26.0               Biobase_2.22.0
>>> [11] oligoClasses_1.24.0        BiocGenerics_0.8.0
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] affxparser_1.34.0    affyio_1.30.0        annotate_1.40.0
>>> [4] AnnotationDbi_1.24.0 beadarray_2.12.0     BeadDataPackR_1.14.0
>>> [7] BiocInstaller_1.12.0 Biostrings_2.30.1    bit_1.1-11
>>> [10] Cairo_1.5-5          cluster_1.14.4       codetools_0.2-8
>>> [13] colorspace_1.2-4     ff_2.2-12            foreach_1.4.1
>>> [16] Formula_1.1-1        genefilter_1.44.0    GenomicRanges_1.14.4
>>> [19] grid_3.0.2           Hmisc_3.13-0         hwriter_1.3
>>> [22] IRanges_1.20.6       iterators_1.0.6      lattice_0.20-24
>>> [25] latticeExtra_0.6-26  limma_3.18.7         plyr_1.8
>>> [28] RColorBrewer_1.0-5   reshape2_1.2.2       setRNG_2011.11-2
>>> [31] splines_3.0.2        stats4_3.0.2         stringr_0.6.2
>>> [34] survival_2.37-4      SVGAnnotation_0.93-1 vsn_3.30.0
>>> [37] XML_3.98-1.1         xtable_1.7-1         XVector_0.2.0
>>> [40] zlibbioc_1.8.0
>>>> 
>>> 
>>> 
>>> --
>>> Sent via the guest posting facility at bioconductor.org.
>>> 
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> 
>