[R] Obscure bug....?

Kjetil Kjernsmo kjetil.kjernsmo at astro.uio.no
Tue Apr 4 16:05:18 CEST 2000


Dear all,

I've been struggling for days now with a piece of code that I have posted
here before and that has a really obscure bug. I think I may have isolated
it, but I have no idea what causes it. It might even be a bug in R, I
guess, since it seems that one or more list elements are occasionally not
passed along when a function is called, though this happens quite rarely.
I have been hacking rather wildly on the histogram function in R, which
might have something to do with it, but I don't think so. I hate to ask
people to find my bugs, but I hope the many-eyeballs phenomenon will
strike again...

The really strange thing is that sometimes my run goes all the way without
any problems at all and the output looks beautiful, sometimes I get a few
warnings, and sometimes the whole thing crashes. I don't even have to
change anything between the runs: the parameters stay the same, and all
the stochastic processes have been simulated beforehand with a different
program, so the R code only reads the resulting data files. Still, the
crashes happen at different places, with different error messages. To
start somewhere, I have started with the warnings.
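
To put a number on "sometimes", repeated identical runs could be tallied
along the following lines. This is only a minimal sketch, assuming an R
version that has tryCatch(); run.once is a hypothetical stand-in for the
real call (e.g. to summaryidmaps) and fails at random purely so that the
sketch runs on its own.

```r
# Hypothetical stand-in for the real call; it fails or warns at random
# only to make this sketch self-contained.
run.once <- function() {
  u <- runif(1)
  if (u < 0.2) stop("simulated crash")
  if (u < 0.5) warning("simulated warning")
  "ok"
}

# Run f() n times and classify each run as "ok", "warning" or "error".
tally.runs <- function(f, n = 20) {
  outcome <- character(n)
  for (i in seq_len(n)) {
    outcome[i] <- tryCatch({
      f()
      "ok"
    },
    warning = function(w) "warning",
    error = function(e) "error")
  }
  table(factor(outcome, levels = c("ok", "warning", "error")))
}

set.seed(1)
res <- tally.runs(run.once, n = 20)
print(res)
```

With the real call in place of run.once, anything other than a table
with all runs in one column would confirm that identical input gives
different outcomes.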

The two most relevant functions are the following. All the print() calls
were inserted for debugging:
  
add.histogram <- function(h1, h2)
{
  print(c(test=0,lbs1=length(h1$breaks),lbs2=length(h2$breaks)))
  if(length(h1$breaks) >= length(h2$breaks))
  {
    bl <- h1$breaks
    bs <- h2$breaks
    cl <- h1$counts
    cs <- h2$counts
    mi <- h1$mids

    print(c(test=1, lbs=length(bs), lcs=length(cs),
            lbl=length(bl), lcl=length(cl)))
  } else {
    bs <- h1$breaks
    bl <- h2$breaks
    cs <- h1$counts
    cl <- h2$counts
    mi <- h2$mids

    print(c(test=2, lbs=length(bs), lcs=length(cs),
            lbl=length(bl), lcl=length(cl)))
  }
  ind <- sub.vector(bs, bl)
  if(! is.vector(ind)) stop("Incompatible breaks")
  c0 <- rep(0, length(cl))
  print(c(test=1.1, lc0=length(c0), lcl=length(cl), lcs=length(cs)))
  c0[ind[1:length(cs)]] <- cs
  print(c(test=1.2, lc0=length(c0), lcl=length(cl), lcs=length(cs)))
  ct <- c0 + cl
  print(c(test=1.3, lc0=length(c0), lcl=length(cl), lcs=length(cs)))
  int <- ct/(sum(ct)*diff(bl)) # This works for non-equidistant breaks?

  print(c(test=7, lbs=length(bs), lc=length(cs), lct=length(ct),
          lbl=length(bl), lcl=length(cl)))
  return(structure(list(breaks=bl,
                        counts=ct,
                        intensities=int,
                        mids=mi),
                   class="histogram"))
}

summaryidmaps <- function(path, files, quiet=FALSE, ...)
{
  filearr <- list.files(path, files, full.names=TRUE)
  if(! quiet)
  {
    print(noquote("Files found:"))
    print(noquote(filearr))       
  }
  tmp <- summarymap(filearr[1], ...)
  h <- tmp$histogram
  sm <- tmp$mapmean
  sv <- tmp$mapvar
  mmin <- tmp$mapmin
  mmax <- tmp$mapmax
  seed <- tmp$mapseed
  nfiles <- length(filearr)
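  # Supposedly, the reduced efficiency of using a for loop is negligible
  # compared to file access time. seq_len(nfiles)[-1] is empty when only
  # one file is found, whereas 2:nfiles would then run over c(2, 1).
  for (i in seq_len(nfiles)[-1])
  {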
    print(c(test=9,hb=length(h$breaks), hc=length(h$counts)))
    tmp <- summarymap(filearr[i], ...)
    print(c(test=4,hb=length(h$breaks), hc=length(h$counts)))
    h <- add.histogram(h, tmp$histogram)
    print(c(test=5,hb=length(h$breaks), hc=length(h$counts)))
    sm <- c(sm, tmp$mapmean)
    sv <- c(sv, tmp$mapvar)
    seed <- c(seed, tmp$mapseed)
    mmin <- min(c(mmin, tmp$mapmin))
    mmax <- max(c(mmax, tmp$mapmax))
  }
  return(list(histogram=h,
              nfiles=nfiles,
              mapmin=mmin,
              mapmax=mmax,
              mapvars=sv,
              mapmeans=sm,
              mapseed=seed))
}
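
On the question in the comment on the int line of add.histogram: a quick
check with made-up numbers suggests that the expression does give a
proper density for non-equidistant breaks, since each bar then has area
ct/sum(ct) and the areas sum to 1. The breaks and counts below are
invented purely for illustration.

```r
# Made-up counts on non-equidistant breaks (illustration only).
bl <- c(0, 1, 3, 7, 8)
ct <- c(5, 10, 20, 5)

# Same expression as in add.histogram.
int <- ct / (sum(ct) * diff(bl))

# Area under the histogram: bar height times bar width, summed.
area <- sum(int * diff(bl))
print(all.equal(area, 1))
```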

add.histogram is called by summaryidmaps, and this is where things most
often go wrong. The output from the print statements often looks
something like this:
test  lbs   lc  lct  lbl  lcl 
   7  910  909 5312 5313 5312 
test   hb   hc 
   5 5313    0 

Since bl is passed as breaks and ct as counts immediately after the test=7
print, and the test=5 print comes just after the object has been returned,
hb should be equal to lbl (which it is in this case) and hc should be
equal to lct, but hc is for some strange reason 0. Of course, this
doesn't always happen, just once in a while; e.g. the previous iteration
printed
test  lbs   lc  lct  lbl  lcl 
   7 1864 1863 5312 5313 5312 
test   hb   hc 
   5 5313 5312 
which is perfectly correct. 
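
The relation used above (n breaks go with n - 1 counts) could be turned
into a check that stops as soon as an object is inconsistent, rather
than reading it off the printed lengths. A minimal sketch follows;
check.histogram is a hypothetical helper, not part of R or of my code,
and the two test objects are made up.

```r
# Hypothetical helper: stop if a histogram-like list is inconsistent,
# i.e. if n breaks do not go with n - 1 counts.
check.histogram <- function(h, label = "") {
  nb <- length(h$breaks)
  # counts is mandatory, so a missing (NULL) element counts as length 0;
  # intensities and mids are only checked when present.
  fields <- c("counts", intersect(c("intensities", "mids"), names(h)))
  for (field in fields) {
    n <- length(h[[field]])
    if (n != nb - 1L)
      stop(sprintf("%s: length(%s) = %d, expected %d",
                   label, field, n, nb - 1L))
  }
  invisible(h)
}

good <- list(breaks = 0:4, counts = c(1, 2, 3, 4))
bad  <- list(breaks = 0:4, counts = numeric(0))

check.histogram(good, "good")   # passes silently
msg <- tryCatch(check.histogram(bad, "bad"),
                error = function(e) conditionMessage(e))
print(msg)
```

Called on the return value inside add.histogram and again on h right
after the call in summaryidmaps, such a check would show on which side
of the function return the counts disappear.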

Obviously, this problem affects all subsequent calculations done by the
program, and it might be what eventually leads to the crash. Other runs
crash in a different manner, but I think it is the same problem
occurring, only with breaks being lost while counts are passed.

I have gathered everything I think is needed to run this and put it at
<URL:http://www.astro.uio.no/~kjetikj/tmp/th/kRcrash.tar> (unpacks in
the current directory). That file also contains a (somewhat edited)
transcript of a session where the code runs into trouble at different
places. However, since the code reads from a number of data files, these
files are probably needed to reproduce the bug. I have asked my sysadmin
if I can get access to ftp/pub so that I can make these files available
as well.
 
I cross my fingers in the hope that this bug may be transparent
to somebody on this list. :-)

Best,

Kjetil
-- 
Kjetil Kjernsmo
Graduate astronomy-student                    Problems worthy of attack
University of Oslo, Norway            Prove their worth by hitting back
E-mail: kjetikj at astro.uio.no                                - Piet Hein
Homepage <URL:http://www.astro.uio.no/~kjetikj/>
Webmaster at skepsis.no 