[R] XML segfault on some architectures

Janet Young jayoung at fhcrc.org
Wed Jun 8 23:27:55 CEST 2011


Hi,

Our sysadmin updated libxml2 to version 2.7.8, and now xmlTreeParse works fine with no segfault.
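
In case it helps anyone else who hits this, here is a minimal sketch of how one might check from within R whether the libxml2 in use predates 2.7.7, the release Prof Ripley mentions below. The helper libxml2_older_than() is a hypothetical name of my own, not part of package XML, and the 2.7.7 cutoff is taken from his note rather than anything I have verified. I am assuming libxmlVersion() in package XML returns major, minor and patch components, which is how it behaves in the versions I know of.

```r
# Hypothetical helper (not part of package XML): TRUE when a libxml2
# version string is older than the given minimum.
# The default of 2.7.7 comes from the note quoted below in this thread.
libxml2_older_than <- function(version, minimum = "2.7.7") {
  utils::compareVersion(version, minimum) < 0
}

libxml2_older_than("2.6.26")  # version originally on the cluster -> TRUE
libxml2_older_than("2.7.8")   # version after the update          -> FALSE

# If package XML is installed, report the libxml2 version it was built
# against. Assumption: libxmlVersion() returns a list with elements
# major, minor and patch (possibly as zero-padded strings).
if (requireNamespace("XML", quietly = TRUE)) {
  v <- XML::libxmlVersion()
  installed <- paste(as.integer(v$major), as.integer(v$minor),
                     as.integer(v$patch), sep = ".")
  message("libxml2 ", installed,
          if (libxml2_older_than(installed)) " (older than 2.7.7)"
          else " (2.7.7 or newer)")
}
```

This only reports a version number; it does not test for the zlib mismatch itself, so an old version is a hint to update rather than proof of the cause.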

Thank you very much - that was very helpful,

Janet



On Jun 8, 2011, at 11:59 AM, Janet Young wrote:

> Dear Prof Ripley,
> 
> Apologies - I've re-sent that to Duncan Temple Lang, along with your note about lib versions. 
> 
> Version info was included in my original post - I gave full sessionInfo(). It's XML_3.4-0.
> 
> I only have a very sketchy understanding of libraries and systems administration, but it looks like our libxml2 is version 2.6.26.  I'll ask my sysadmin people whether they can update that, and try again.
> 
> Janet
> 
> 
> 
> On Jun 7, 2011, at 10:54 PM, Prof Brian Ripley wrote:
> 
>> On Tue, 7 Jun 2011, Janet Young wrote:
>> 
>>> Hi,
>>> 
>>> I found an architecture-specific segfault problem with the XML package. I originally found the problem using the parseKGML2Graph function in the Bioconductor KEGGgraph package, but as far as I can tell the underlying issue is in xmlTreeParse, which parseKGML2Graph calls.
>>> 
>>> I'm trying this piece of code, from the xmlTreeParse help page:
>>> 
>>> library(XML)
>>> fileName <- system.file("exampleData", "test.xml", package="XML")
>>> x <- xmlTreeParse(fileName)
>>> 
>>> On my Mac and on nodes of one of the linux clusters I have access to, this works fine. But on another of the linux clusters I use, I get a segfault every time, on both 32-bit and 64-bit nodes of the cluster.  The unames for those nodes are here:
>>> 
>>> Linux kong053 2.6.18-194.17.1.el5xen #1 SMP Wed Sep 29 13:30:21 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>> Linux king049 2.6.18-194.26.1.el5xen #1 SMP Tue Nov 9 14:13:46 EST 2010 i686 i686 i386 GNU/Linux
>>> 
>>> I think I've included all the relevant info below, but please let me know if there's anything else you'd like to see.
>> 
>> As the posting guide says, report problems in contributed packages first to the maintainer, giving the 'at a minimum' information required (which includes the package version number).
>> 
>> But note that package XML relies on libxml2, and it is entirely possible the fault is in the latter.  Your kernel looks like RHEL 5 (and is an old version): that is well known for having very old versions of system software.  One known issue with libxml2 is a mismatch between it and zlib 1.2.[45] prior to libxml2 2.7.7 (2.7.8 is current): from experience, that causes segfaults in package XML's examples.
>> 
>>> 
>>> thanks,
>>> 
>>> Janet
>>> 
>>> -------------------------------------------------------------------
>>> 
>>> Dr. Janet Young
>>> 
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Avenue N., C3-168,
>>> P.O. Box 19024, Seattle, WA 98109-1024, USA.
>>> 
>>> tel: (206) 667 1471 fax: (206) 667 6524
>>> email: jayoung  ...at...  fhcrc.org
>>> 
>>> 
>>> -------------------------------------------------------------------
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ######## on 64-bit node
>>> 
>>>> library(XML)
>>> 
>>>> fileName <- system.file("exampleData", "test.xml", package="XML")
>>> 
>>>> fileName
>>> [1] "/home/btrask/traskdata/lib_linux_64/R/library/XML/exampleData/test.xml"
>>> 
>>>> sessionInfo()
>>> R version 2.13.0 (2011-04-13)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>> 
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>> 
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> 
>>> other attached packages:
>>> [1] XML_3.4-0
>>> 
>>> 
>>>> system("uname -a")
>>> Linux kong053 2.6.18-194.17.1.el5xen #1 SMP Wed Sep 29 13:30:21 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>> 
>>>> x <- xmlTreeParse(fileName)
>>> 
>>> *** caught segfault ***
>>> address 0x500001c4f, cause 'memory not mapped'
>>> 
>>> Traceback:
>>> 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks),     as.logical(replaceEntities), as.logical(asText), as.logical(trim),     as.logical(validate), as.logical(getDTD), as.logical(isURL),     as.logical(addAttributeNamespaces), as.logical(useInternalNodes),     FALSE, as.logical(isSchema), as.logical(fullNamespaceInfo),     as.character(encoding), as.logical(useDotNames), xinclude,     error, addFinalizer, PACKAGE = "XML")
>>> 2: xmlTreeParse(fileName)
>>> 
>>> Possible actions:
>>> 1: abort (with core dump, if enabled)
>>> 2: normal R exit
>>> 3: exit R without saving workspace
>>> 4: exit R saving workspace
>>> Selection:
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> -- 
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 


