[BioC] segfault during rma using oligo on Nimblegen data

Mon Oct 29 15:13:19 CET 2012

Honestly, I don't remember any restriction on the name ('RANDOM' vs 'NOLOC').

Data from 2 different designs? This means 2 different NDFs and, given
that, I wouldn't be surprised if one runs into a problem. From what
you describe, it seems that the internal controls (the NAs in the XYS
files) are in different positions when you compare Design1 to Design2.
This will cause NAs to appear in your PM matrix and break RMA.
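
One way to test this hypothesis is to count the NAs in the PM matrix per array. The sketch below is only an illustration: it uses a toy matrix in place of the real one (which could be extracted with pm(data_raw) from oligo), and the probe and array names are made up.

```r
# Toy stand-in for a PM intensity matrix (rows = probes, columns = arrays).
# Assumption: with oligo, the real matrix would come from pm(data_raw).
pm_mat <- matrix(
  c(10, 12, 11, 13,
    20, NA, 21, 22,
    30, 31, 32, NA,
    40, 41, 42, 43),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe", 1:4), paste0("array", 1:4))
)

# Number of NA values in each array (column).
na_per_array <- colSums(is.na(pm_mat))

# Names of the probes (rows) that have an NA in at least one array.
na_probes <- rownames(pm_mat)[rowSums(is.na(pm_mat)) > 0]

print(na_per_array)
print(na_probes)
```

If only the arrays from one design show NAs among the PM probes, that would be consistent with the controls sitting at different positions in the two designs.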

benilton

On 29 October 2012 12:55, Piet Jones <pietjones at gmail.com> wrote:
> Hi Benilton,
>
> Thanks for the response. As you predicted, updating my R version did
> not remove the problem.
>
> The 'fixed' comes from me editing the ndf file to change the word
> 'RANDOM' to 'NOLOC', as some people have told me that this generates a
> problem when it comes to RMA (RANDOM being a keyword).
>
> I have also generated my own '.xys' files, as the experimental data on
> NCBI GEO only has ".pair" files. To do this I did the following:
>
> compare to prior 090918 Nimblegen data set to get co-ordinates of controls
> set co-ordinates of controls to NA
> parse pair files to xys files by looking at the appropriate column
> keep header line the same
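
The conversion steps listed above could be sketched roughly as follows. This is not the script that was actually used: it assumes the pair table has already been parsed into a data frame with X, Y and PM columns, that the XYS columns are X, Y, SIGNAL and COUNT, and that a placeholder COUNT of 1 is acceptable. The function name pair_to_xys and the toy values are hypothetical.

```r
# Convert a parsed pair table into the four assumed XYS columns.
# Assumptions: 'pair' has columns X, Y and PM; 'controls' is a data frame
# with the X and Y coordinates of the control features.
pair_to_xys <- function(pair, controls) {
  # COUNT is not available in the pair table, so 1 is used as a placeholder.
  xys <- data.frame(X = pair$X, Y = pair$Y, SIGNAL = pair$PM, COUNT = 1L)

  # Flag rows whose (X, Y) coordinates match a control feature.
  is_control <- paste(xys$X, xys$Y) %in% paste(controls$X, controls$Y)

  # Controls get NA for both the signal and the count.
  xys$SIGNAL[is_control] <- NA
  xys$COUNT[is_control] <- NA
  xys
}

# Toy example: three features, the middle one is a control.
pair <- data.frame(X = c(1, 2, 3), Y = c(1, 1, 1), PM = c(100, 200, 300))
controls <- data.frame(X = 2, Y = 1)
xys <- pair_to_xys(pair, controls)
print(xys)
```

Note that the control coordinates here are taken from a single reference design, which only gives correct results if every array shares that layout.
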
>
> A bit of info on the experiment that might help, when looking at the
> header line. The "pair" files seem to be from two different designs:
> 090918 and 090470 respectively. I have contacted the authors of the
> experiment, and they used Nimblescan to normalize without a problem,
> using the ndf file that came with the data.
>
> When I split the data into the 090918 and 090470 files, then normalize
> each individually, there is no problem and everything happens
> smoothly. As soon as I include one file that is from the 'alternative'
> design, then it crashes.
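
Splitting the files by design, as described above, can be automated by reading the first line of each file. The sketch below assumes the XYS header is a single line of tab-separated key=value pairs that includes a designname field. The helper names and the demo header values are hypothetical.

```r
# Extract the value of one key=value field from an XYS header line.
# Assumption: the header is a single line of tab-separated key=value pairs.
header_field <- function(header, key) {
  fields <- strsplit(sub("^#\\s*", "", header), "\t", fixed = TRUE)[[1]]
  hit <- fields[startsWith(fields, paste0(key, "="))]
  if (length(hit) == 0) return(NA_character_)
  sub(paste0("^", key, "="), "", hit[1])
}

# Read only the first line of each file and group the files by design name.
split_by_design <- function(files, key = "designname") {
  designs <- vapply(files, function(f) header_field(readLines(f, n = 1), key),
                    character(1))
  split(files, designs)
}

# Demo with three temporary files carrying made-up headers.
tmp <- tempfile(c("a", "b", "c"), fileext = ".xys")
writeLines(c("# software=demo\tdesignname=090918_demo", "X\tY\tSIGNAL\tCOUNT"), tmp[1])
writeLines(c("# software=demo\tdesignname=090470_demo", "X\tY\tSIGNAL\tCOUNT"), tmp[2])
writeLines(c("# software=demo\tdesignname=090918_demo", "X\tY\tSIGNAL\tCOUNT"), tmp[3])

groups <- split_by_design(tmp)
print(lengths(groups))
unlink(tmp)
```

Each group could then be read and normalized on its own, ideally with an annotation package built from the NDF of the matching design.
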
>
> Is there some way around this?
>
> Kind Regards,
> Piet Jones
>
>
> On Thu, Oct 25, 2012 at 5:48 PM, Benilton Carvalho
> <beniltoncarvalho at gmail.com> wrote:
>> The first recommendation is to update R. The current version is
>> 2.15.x. The reason I ask you this is because it is really hard to
>> offer support for versions that are long gone (I always work with the
>> current version and the devel one).
>>
>> After updating, try again. (I guess you'll have the same problem)
>>
>> If you do (run into problems again), my suggestion is for you to
>> explain how you got the "fixed" part into the NDF name. I guess
>> someone manipulated the file and possibly not as expected... Improper
>> handling of the NDF causes NA values to appear where they're not
>> expected.
>>
>> benilton
>>
>> On 24 October 2012 13:37, Piet Jones <pietjones at gmail.com> wrote:
>>> Dear Bioconductors,
>>>
>>> I have a problem when it comes to trying to normalize a Nimblegen data
>>> set using the oligo package. I have generated the appropriate package
>>> using the ndf file and the package 'pdInfoBuilder' (which generated
>>> 'pd.fixed.gpl13936.090918.vitus.exp'). I have installed it using:
>>>
>>> R CMD INSTALL pd.fixed.gpl13936.090918.vitus.exp
>>>
>>> My problem comes in when I try to run the actual normalization
>>> using 'rma': when it starts the background correction, R dies
>>> with a segfault. I have no idea how to debug this. Below I have
>>> provided my R session with the sessionInfo() output, and I have
>>> also provided the information that I could glean from the core dump.
>>>
>>> Does anybody have any suggestion on what may be the problem, or how I
>>> should proceed to solve this (not that proficient with debugging R)?
>>>
>>> ----------------------------------------------------------------------------------------------------
>>>
>>>>library(oligo)
>>>>library('pd.fixed.gpl13936.090918.vitus.exp')
>>>
>>>> sessionInfo()
>>> R version 2.13.0 (2011-04-13)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] pd.fixed.gpl13936.090918.vitus.exp_0.0.1
>>> [2] RSQLite_0.11.1
>>> [3] DBI_0.2-5
>>> [4] oligo_1.16.2
>>> [5] preprocessCore_1.14.0
>>> [6] oligoClasses_1.14.0
>>> [7] Biobase_2.12.2
>>>
>>> loaded via a namespace (and not attached):
>>> [1] affxparser_1.24.0 affyio_1.20.0     Biostrings_2.20.4 bit_1.1-8
>>> [5] ff_2.2-7          IRanges_1.10.6    splines_2.13.0
>>>
>>>
>>>>xys <- list.files(pattern=".xys",full.names=T)
>>>> data_raw <- read.xysfiles(xys,pkgname='pd.fixed.gpl13936.090918.vitus.exp')
>>> Platform design info loaded.
>>> Checking designs for each XYS file... Done.
>>> Allocating memory... Done.
>>> Reading ./fixed_GSM881517_anther_a_431086A01.xys.
>>> Reading ./fixed_GSM881518_anther_b_431086A03.xys.
>>> Reading ./fixed_GSM881519_anther_m_431086A02.xys.
>>> Reading ./fixed_GSM881520_berry_06_10_a_364214A08.xys.
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> I truncated the output above; there are a total of 162 xys files.
>>> ----------------------------------------------------------------------------------------------------
>>>> data_norm <- rma(data_raw)
>>> Background correcting
>>>
>>>  *** caught segfault ***
>>> address 0xa2ca000, cause 'memory not mapped'
>>>
>>> Traceback:
>>>  1: .Call("rma_c_complete_copy", pmMat, pnVec, nPn, normalize,
>>> background,     bgversion, verbose, PACKAGE = "oligo")
>>>  2: basicRMA(pms, pnVec, normalize, background)
>>>  3: .local(object, ...)
>>>  4: rma(data_raw)
>>>  5: rma(data_raw)
>>>
>>> Possible actions:
>>> 1: abort (with core dump, if enabled)
>>> 2: normal R exit
>>> 3: exit R without saving workspace
>>> 4: exit R saving workspace
>>> Selection: 1
>>> aborting ...
>>> Segmentation fault
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> Here is the information that is in the core dump:
>>> ----------------------------------------------------------------------------------------------------
>>> 14657880 at head002:~/Work/ilse/data/paper/GSE36128/test_all/fixed> gdb --core=core
>>> GNU gdb (GDB) SUSE (7.2-3.3)
>>> Copyright (C) 2010 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>> and "show warranty" for details.
>>> This GDB was configured as "x86_64-suse-linux".
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>.
>>> BFD: Warning: /export/home/14657880/Work/ilse/data/paper/GSE36128/test_all/fixed/core
>>> is truncated: expected core file size >= 757334016, found: 5304320.
>>> Missing separate debuginfo for the main executable file
>>> Try: zypper install -C
>>> "debuginfo(build-id)=81f375e003a6b0d50c77cf51d91f35f3526920fb"
>>> [New Thread 5677]
>>> [New Thread 5488]
>>> Failed to read a valid object file image from memory.
>>> Core was generated by `/usr/lib64/R/bin/exec/R'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  0x00007fcf9d3fda13 in ?? ()
>>>
>>> ----------------------------------------------------------------------------------------------------
>>>
>>> Kind Regards,
>>> Piet Jones
>>>