[BioC] CNTools error message

Martin Morgan mtmorgan at fhcrc.org
Thu Jan 26 20:10:54 CET 2012


On 01/26/2012 09:56 AM, nathalie wrote:
> Hi,
> I am analysing CGH data using DNAcopy to create segmented data and
> CNTools package to reduce the segments by genes.
> No issue there.
>
> #I created the segmentList object using segment function of DNAcopy
> segmentList<-segment(smooth.CNA_order1.5,undo.splits="sdundo",undo.SD=1.5,verbose=1)
>
> mySeglist1.5<-segmentList[["output"]]
> #then I create a CNSeg object
> require(CNTools)
> cnseg1.5=CNSeg(mySeglist1.5)
> # use a geneInfoMouse list I have created, this is the format: chrom
> start end geneid genename
> head(geneInfoMouse)
> chrom start end geneid genename
> 1 14 53989971 53990250 ENSMUSG00000076824 X76971
> 2 14 53047603 53048549 ENSMUSG00000076758 Gm13987
> 3 14 53344974 53346592 ENSMUSG00000076829 Gm13955
> 4 14 53363924 53364514 ENSMUSG00000076767 Gm13956
> 5 14 53373042 53373519 ENSMUSG00000076768 Gm701
> 6 14 53389428 53390065 ENSMUSG00000076769 Gm13959
> convertedData1.5G<-getRS(cnseg1.5, by ="gene", imput=FALSE, XY=TRUE,
> geneMap = geneinfo, what="mean")
> #success!
>
> #Then I want to reduce the segment using the probe ID and location from
> agilent (AgilentProbesMouse), I have created the same type of
> geneInfoMouse file using the same header:chrom start end geneid genename
> head(AgilentProbesMouse)
> chrom start end geneid genename
> 1 1 3002738 3002797 A_67_P04000012 Unknown
> 2 1 3017822 3017881 A_67_P04000019 Unknown
> 3 1 3027731 3027790 A_67_P04000027 Unknown
> 4 1 3036221 3036280 A_67_P04000032 Unknown
> 5 1 3053985 3054044 A_67_P04000058 Unknown
> 6 1 3063893 3063952 A_67_P04000065 gb|AK016604
>
> #the only difference is this file is much bigger 468k lines vs 36k for
> the geneInfoMouse

I guess that the values are different too, and this is a more likely 
source of the problem than the size.

> #If I then use the same function as before on my segment object with the
> probe file
> convertedData1.5P<-getRS(cnseg1.5, by ="gene", imput=FALSE, XY=TRUE,
> geneMap = AgilentProbesMouse, what="mean")
> #I got this error
> Error in FUN(X[[1L]], ...) : NA/NaN/Inf in foreign function call (arg 2)

I don't have a specific answer, but

(a) check that the arguments to the function are consistent with those 
described on the function help page.

(b) use traceback() (and report the result to the mailing list) to 
identify where the actual error is occurring, and the functions that 
were in effect when the error occurred.

(c) To further debug an error, say

   options(error=recover)

before you call a function that causes the error (use 
options(error=NULL) to return to the normal behavior). Then call your 
function. When the error occurs, you will be presented with a 'call 
stack' describing the functions in effect when the error occurred. 
Select the level closest to the error. You will enter the R environment 
at the time the error occurred. Look at the definition of the function 
being called, at the variables that are defined, etc, to identify the 
problem. Example (of course the real code might be more difficult)

f = function(a) {
    x = log(a)
    g(x)
}

g = function(x) {
     if (any(x < 0))
         stop("'x' must be >= 0")
}

 > options(error=recover)
 > f(.1)
Error in g(x) : 'x' must be >= 0

Enter a frame number, or 0 to exit

1: f(0.1)
2: g(x)

Selection: 2
Called from: f(0.1)
Browse[1]> ls()
[1] "x"
Browse[1]> x
[1] -2.302585
Browse[1]> g
function (x)
{
     if (any(x < 0))
         stop("'x' must be >= 0")
}
Browse[1]> c

Enter a frame number, or 0 to exit

1: f(0.1)
2: g(x)

Selection: 1
Called from: top level
Browse[1]> f
function (a)
{
     x = log(a)
     g(x)
}
Browse[1]> ls()
[1] "a" "x"
Browse[1]> f
function (a)
{
     x = log(a)
     g(x)
}
Browse[1]> a
[1] 0.1
Browse[1]> x
[1] -2.302585

This will at the least help to produce a reduced data set that 
illustrates the problem.

From

 > Error in FUN(X[[1L]], ...) : NA/NaN/Inf in foreign function call (arg 2)

I would guess this is actually in an lapply or sapply (hence 'FUN'), 
applied to the first element of a list (hence 'X[[1L]]'), and that FUN 
(to be determined by inspection of the functions in the call stack) has 
a line that says .C(<...>). One of the arguments to .C contains NA / NaN 
/ Inf, and these were either present in the initial data or introduced 
by some operation leading up to the .C statement.
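
To illustrate why the message names FUN rather than a package function, here is a minimal sketch with a hypothetical function `worker` (not part of CNTools) applied through lapply. It assumes nothing beyond base R. Recent R versions report the call as FUN(X[[i]], ...) rather than FUN(X[[1L]], ...), but the inference is the same.

```r
# Hypothetical worker that fails on non-finite input, standing in for
# whatever function the package hands to lapply()/sapply().
worker <- function(v) {
    if (!is.finite(v))
        stop("non-finite input")
    v
}

# Capture the error condition instead of letting it stop the script.
err <- tryCatch(
    lapply(list(NA_real_, 2), worker),
    error = function(e) e
)

# The call stored in the condition is lapply's internal call to FUN;
# it does not mention 'worker' by name.
failing_call <- deparse(conditionCall(err))
print(failing_call)
print(conditionMessage(err))
```

So a message starting with "Error in FUN(" says little about which function failed; the call stack is needed to find out.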

(d) Taking a different tack, I looked at

 > getRS
standardGeneric for "getRS" defined from package "CNTools"

function (object, by = c("region", "gene", "pair"), imput = TRUE,
     XY = FALSE, geneMap, what = c("mean", "max", "mini", "median"),
     mapChrom = "chrom", mapStart = "start", mapEnd = "end")
standardGeneric("getRS")
<environment: 0x2b056b8>
Methods may be defined for arguments: object, by, imput, XY, geneMap, 
what, mapChrom, mapStart, mapEnd
Use  showMethods("getRS")  for currently available ones.

Since this is an S4 generic, I asked to see the methods with their 
definitions

 > showMethods(getRS, includeDefs=TRUE)
Function: getRS (package CNTools)
object="CNSeg"
function (object, by = c("region", "gene", "pair"), imput = TRUE,
     XY = FALSE, geneMap, what = c("mean", "max", "mini", "median"),
     mapChrom = "chrom", mapStart = "start", mapEnd = "end")
seg2RS(object, by, imput, XY, geneMap, what = what, mapChrom = mapChrom,
     mapStart = mapStart, mapEnd = mapEnd)

Only 1 method, calls seg2RS

 > seg2RS
Error: object 'seg2RS' not found

It looks like seg2RS is not an exported function; it likely lives in the 
CNTools namespace. Looking at the function definition, the most 
promising line is in the 'switch' statement

 > CNTools:::seg2RS
function (segData, by = c("region", "gene", "pair"), imput = TRUE,
     XY = FALSE, geneMap, what = c("mean", "median", "max", "min"),
     mapChrom = "chrom", mapStart = "start", mapEnd = "end")
{
[...]

     rs <- switch(by,
[...]
         gene = getReducedSeg(segList(segData), geneMap, what = what,
             segID = id(segData), segChrom = chromosome(segData),
             segStart = start(segData), segEnd = end(segData),
             segMean = segMean(segData), mapChrom = mapChrom,
             mapStart = mapStart, mapEnd = mapEnd), pair = getPairwise(segData,
             imput = imput, XY = XY, what = what))
[...]

so again

 > CNTools:::getReducedSeg
function (seglist, map, what = c("mean", "median", "max", "min"),
     segID = "ID", segChrom = "chrom", segStart = "loc.start",
     segEnd = "loc.end", segMean = "seg.mean", mapChrom = "chrom",
     mapStart = "start", mapEnd = "end")
{
     what <- match.arg(what)
     getGeneSegMean <- function(segData) {
         segged <- rep(0, nrow(map))
         return(.C("getratios", as.character(map[, mapChrom]),
             as.double(map[, mapStart]), as.double(map[, mapEnd]),
             as.integer(nrow(map)), as.character(segData[, segChrom]),
             as.double(segData[, segStart]), as.double(segData[,
                 segEnd]), as.integer(nrow(segData)), as.double(segData[,
                 segMean]), as.character(what), as.double(segged),
             PACKAGE = "CNTools")[[11]])
     }
     splited <- split.data.frame(seglist, factor(seglist[, segID]))
[...]

And I would bet that the error occurs when getGeneSegMean is called on 
the first element of splited, i.e., the rows for the first level of 
factor(seglist[, segID]). This could be confirmed with the output of 
traceback()
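
In an interactive session, calling traceback() right after the error is enough. As a rough sketch of the same idea in a script, the call stack can be recorded at the moment the error is signalled. The functions `f` and `g` are the toy functions from the recover example, not CNTools code, and `call_stack` is a name chosen here for illustration.

```r
# Toy functions from the recover() example; not CNTools code.
f <- function(a) {
    x <- log(a)
    g(x)
}
g <- function(x) {
    if (any(x < 0))
        stop("'x' must be >= 0")
}

# Record the call stack when the error is signalled, then let tryCatch()
# absorb the error so the script keeps running.
call_stack <- NULL
msg <- tryCatch(
    withCallingHandlers(
        f(0.1),
        error = function(e) call_stack <<- sys.calls()
    ),
    error = function(e) conditionMessage(e)
)

# Deparse each recorded call to a single string for printing.
frames <- vapply(as.list(call_stack), function(cl) deparse(cl)[1],
                 character(1))
print(msg)
print(frames)
```

The recorded frames include the calls to f and g alongside the error-handling machinery, which is the same information traceback() reports.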

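Looking back over the function calls, seglist is the result of 
segList(cnseg1.5). If this is correct you can reproduce the error with 
just the following at the command line (the column names are the 
defaults shown in getReducedSeg, assuming cnseg1.5 uses them)

     seglist <- segList(cnseg1.5)
     # arguments that getReducedSeg would otherwise supply
     map <- AgilentProbesMouse
     what <- "mean"
     mapChrom <- "chrom"; mapStart <- "start"; mapEnd <- "end"
     segID <- "ID"; segChrom <- "chrom"; segMean <- "seg.mean"
     segStart <- "loc.start"; segEnd <- "loc.end"
     getGeneSegMean <- function(segData) {
         segged <- rep(0, nrow(map))
         return(.C("getratios", as.character(map[, mapChrom]),
             as.double(map[, mapStart]), as.double(map[, mapEnd]),
             as.integer(nrow(map)), as.character(segData[, segChrom]),
             as.double(segData[, segStart]), as.double(segData[,
                 segEnd]), as.integer(nrow(segData)), as.double(segData[,
                 segMean]), as.character(what), as.double(segged),
             PACKAGE = "CNTools")[[11]])
     }
     splited <- split.data.frame(seglist, factor(seglist[, segID]))
     # call the helper on the first sample only
     getGeneSegMean(splited[[1]])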

If so, you can

    debug(getGeneSegMean)
    getGeneSegMean(splited[[1]])

step into the function, and look at each of the arguments in .C in turn, 
until you spot the NA / NaN / Infinite value (e.g., using 
all(is.finite(map[, mapStart]))). You could also have arrived at this 
location in the code using the options(error=recover) case.
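
The per-argument check can be sketched as follows. The `map` data frame here is a made-up stand-in for AgilentProbesMouse with invented values, and `map_cols`, `bad_rows` and `clean_map` are names chosen for illustration. The sketch also shows one way such values can be introduced before the .C call: as.double() turns a non-numeric string into NA. If the '(arg 2)' in the error message counts the arguments after the routine name, it would point at as.double(map[, mapStart]), but that is an assumption worth confirming in the debugger.

```r
# Made-up stand-in for the probe map; values are invented for illustration.
map <- data.frame(
    chrom = c("1", "1", "X"),
    start = c("3002738", "3017822", "n/a"),  # character column, one bad entry
    end   = c(3002797, 3017881, Inf),
    stringsAsFactors = FALSE
)
map_cols <- c(mapStart = "start", mapEnd = "end")

# For each coordinate column, coerce with as.double() as getReducedSeg does
# and record the rows whose coerced value is NA, NaN, or infinite.
bad_rows <- lapply(map_cols, function(col) {
    values <- suppressWarnings(as.double(map[, col]))
    which(!is.finite(values))
})
print(bad_rows)

# Keep only the rows where every coordinate column is finite.
ok <- !(seq_len(nrow(map)) %in% unlist(bad_rows))
clean_map <- map[ok, ]
print(nrow(clean_map))
```

Running the same check on the real probe file would show whether the offending rows can simply be dropped or whether a whole column was read in with the wrong type.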

Hope that helps.

Martin

>
> #If I reduce my geneInfoMouse to 36K lines, it works, Is there a size
> limit for this getRS function and can I change this?
>
> I am sorry I can't attach any files as they are too large.
> thanks for any advice, I guess I will start by splitting my probe file.
> Nat
>  > sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=C
> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] CNTools_1.6.0 genefilter_1.30.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.26.1 AnnotationDbi_1.10.1 Biobase_2.8.0
> [4] DBI_0.2-5 RSQLite_0.9-1 splines_2.13.0
> [7] survival_2.35-8 xtable_1.6-0
> reduced_bygene1.5=rs(convertedData1.5G)
> write.table(reduced_bygene1.5, file="reduced_bygene1.5.xls", sep="\t",
> row.names=F)
>
>
>
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


