[Rd] strange behaviour when converting from char to POSIX

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Jan 11 21:23:17 MET 2004


On Sun, 11 Jan 2004 ripley at stats.ox.ac.uk wrote:

> On Sun, 11 Jan 2004, Dirk Eddelbuettel wrote:
> > I was just mucking about with that, but R (1.9.0 as of Jan 8, 2004, on
> > Debian unstable) still crashes reliably even when I explicitly set the TZ
> > variable (and it also crashed for TZ=GMT):
> > 
> > > Sys.getenv("TZ")
> > TZ
> > ""
> > > Sys.putenv("TZ"="CDT6CST")
> > > Sys.getenv("TZ")
> >        TZ
> > "CDT6CST"
> > > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S",
> > tz="GMT", usetz=TRUE)
> > [1] "1993-08-07 01:50:00"
> > > format(strptime("199308070150","%Y%m%d%H%M"), "%Y-%m-%d %H:%M:%S %Z",
> > tz="GMT", usetz=TRUE)
> > Segmentation fault
> > 
> > 
> > Not nice.
> 
> And the bug is in glibc, not R (the segfault is in strftime in libc).  
> It works perfectly on Solaris:
> 
> [1] "1993-08-07 01:50:00 CST"
> 
> and on Windows (where that is not a valid time zone so I used mine).
> 
> [1] "1993-08-07 01:50:00 GMT Daylight Time"
> 
> Presumably glibc has some undocumented assumption that we are not 
> fulfilling, but I am by now very tired of the bugs in its date-time code.

I've found the glibc bug.  The gdb output shows the problem:

(gdb) print tm
$1 = {tm_sec = 0, tm_min = 50, tm_hour = 1, tm_mday = 7, tm_mon = 7,
  tm_year = 93, tm_wday = 6, tm_yday = 218, tm_isdst = 1,
  __tm_gmtoff = 138337816, __tm_zone = 0x1 <Address 0x1 out of bounds>}

Now __tm_zone is not a POSIX field, and it is not documented in time.h 
either.  Yet strftime.c in glibc 2.3.2 includes:

  zone = NULL;
#if HAVE_TM_ZONE
  /* The POSIX test suite assumes that setting
     the environment variable TZ to a new value before calling strftime()
     will influence the result (the %Z format) even if the information in
     TP is computed with a totally different time zone.
     This is bogus: though POSIX allows bad behavior like this,
     POSIX does not require it.  Do the right thing instead.  */
  zone = (const char *) tp->tm_zone;
#endif
#if HAVE_TZNAME
  if (ut)
    {
      if (! (zone && *zone))
        zone = "GMT";
    }

so it looks at the field even though neither POSIX nor ISO C gives any 
reason for it to be set.  If one sets it to NULL before calling strftime, 
the code behaves correctly.  Note that strptime in the same version of 
glibc does not set the field, so it is left holding garbage (0x1 in the 
gdb output above), hence the segfault.

The extra field seems to have been around since 1996-09-11, so I think a
test for glibc >= 2.0 is safe here.
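
For illustration only, here is a minimal sketch of that kind of workaround in
plain C.  It is not the change made in R: the helper name parse_and_format is
hypothetical, and the sketch assumes a glibc system where defining _GNU_SOURCE
exposes both strptime and the tm_zone spelling of the field.  The idea is to
zero the whole struct tm before strptime fills it in, and on glibc >= 2.0 to
clear the extra field explicitly, so that strftime never follows a stray
pointer when it handles %Z.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc; exposes strptime and tm_zone */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: parse `input` with `in_fmt`, then format the result
   with `out_fmt` into `buf`.  Returns the number of bytes written (as
   strftime does), or 0 if parsing fails. */
static size_t parse_and_format(const char *input, const char *in_fmt,
                               const char *out_fmt, char *buf, size_t buflen)
{
    struct tm tm;

    assert(buf != NULL && buflen > 0);

    /* Zero every field, including any non-standard ones, so that nothing
       is left uninitialised for strftime to look at. */
    memset(&tm, 0, sizeof tm);

    if (strptime(input, in_fmt, &tm) == NULL)
        return 0;

#if defined(__GLIBC__) && __GLIBC__ >= 2
    /* glibc's strftime reads this field for %Z; make it explicitly NULL. */
    tm.tm_zone = NULL;
#endif

    return strftime(buf, buflen, out_fmt, &tm);
}
```

Called with the input from the report ("199308070150", "%Y%m%d%H%M") and an
output format ending in %Z, this should return a string rather than crash.
The zone name printed depends on TZ and on the C library, so none is shown
here.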

Brian

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


