[Rd] cat cannot write more than 10000 characters? [R 2.8.1]

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Jan 11 15:34:10 CET 2009


Looks like a bug in your iconv.  However, that section of code is 
guarded by

     if(con->outconv) { /* translate the buffer */

and con->outconv is never non-NULL on my systems.  That branch should 
only be reached when you specify an encoding on the output connection, 
so have you set an option (e.g. "encoding") without telling us?
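
One way to check this from within the affected session is sketched below. The variable names are only illustrative; getOption("encoding") is what file() uses as its default encoding argument, so anything other than "native.enc" there would explain a converter being attached to the connection.

```r
# Default encoding for new connections; "native.enc" means no translation.
enc <- getOption("encoding")

# Locale details for the session (multibyte? UTF-8? Latin-1?).
info <- l10n_info()
ctype <- Sys.getlocale("LC_CTYPE")

print(enc)
print(info)
print(ctype)
```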

I was able to reproduce a similar problem by

cat(testChunk, sep = "\n", file = file("output", encoding="latin1"),
     append = TRUE)

in a UTF-8 locale, and I'll add a workaround to the R sources.
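
For anyone wanting to try this, a self-contained variant of the call above might look like the following sketch. The contents of testChunk and the temporary path are made up for illustration (the original testChunk was not posted). The essential ingredients are assumed to be a single element longer than 10000 bytes and a connection created with an explicit encoding. On an affected build the cat() call may hang rather than return.

```r
# Hypothetical stand-in for the poster's data: one string of 20000 bytes,
# longer than the 10000-byte buffer mentioned in the report.
testChunk <- paste(rep("a", 20000), collapse = "")

# Temporary file instead of the literal "output" used above.
path <- tempfile(fileext = ".txt")

# The explicit encoding is what makes R translate the buffer on output.
con <- file(path, open = "w", encoding = "latin1")
cat(testChunk, sep = "\n", file = con)
close(con)

# Read the file back and count the bytes on each line.
written <- nchar(readLines(path, warn = FALSE), type = "bytes")
print(written)
```

If the write went through completely, written should hold a single value equal to the length of testChunk.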

Please do run your tests with R --vanilla and make sure they are 
complete -- see the posting guide.


On Mon, 5 Jan 2009, Daniel Sabanés Bové wrote:

> Dear Prof. Ripley,
>
>>> I have discovered that my cat function cannot write more than 10000
>>> characters to a text file.

I think you meant *bytes*, BTW.

>> You mean on a single line?
> Yes. OOo (OpenOffice.org) tries to save space...
>> No, works for me on Mac OS X and x86_64 Fedora 8 (as does 10x larger).
>> Can you run this under a debugger and find where it is going wrong for
>> you?
> Oh, then this might be distribution- or gcc-version-specific:
> gcc --version
> gcc (SUSE Linux) 4.3.2 [gcc-4_3-branch revision 141291]
>
> glibc is version 2.9-2.3.
>
> Using ddd I found the (relevant part of the) backtrace when interrupting
> the infinite loop:
>
> (gdb) backtrace
> #0  __gconv (cd=0x846cde0, inbuf=0xbfff7738, inbufend=0x84ca589 "",
> outbuf=0xbfff773c, outbufend=0xbfff9e57 "", irreversible=0xbfff76a8) at
> gconv.c:80
>
> The program comes here more than 100 000 times... with outbuf and inbuf
> always being "\0".
>
> #1  0xb7b581e7 in iconv (cd=0x846cde0, inbuf=0xbfff7738,
> inbytesleft=0xbfff7734, outbuf=0xbfff773c, outbytesleft=0xbfff7730) at
> iconv.c:53
> [this is   result = __gconv (gcd, (const unsigned char **) inbuf,
>                        (const unsigned char *)  (*inbuf + *inbytesleft),
>                          (unsigned char **) outbuf,
>                           (unsigned char *) (*outbuf + *outbytesleft),
>                       &irreversible);]
>
> #2  0xb7e44d29 in Riconv (cd=0x846cde0, inbuf=0xbfff7738,
> inbytesleft=0xbfff7734, outbuf=0xbfff773c, outbytesleft=0xbfff7730) at
> sysutils.c:692
> [ this is the only line of Riconv,  return iconv((iconv_t) cd,
> (ICONV_CONST char **) inbuf, inbytesleft, outbuf, outbytesleft);]
>
> #3  0xb7d2c337 in dummy_vfprintf (con=0x8400bb0, format=0xb7ee0c48 "%s",
> ap=0xbfffc604 "\230¾L\b°?\005\b¬h\a\b¬h\a\b°?\005\b°?\005\b\001") at
> connections.c:316
> [this is      ires = Riconv(con->outconv, &ib, &inb, &ob, &onb);]
>
> The infinite loop seems to be inside dummy_vfprintf, as this position is
> the "highest" inside the backtrace which is reached again and again. And
> at line 249 appears the magic number 10000 as BUFSIZE, which is indeed
> selected by the preprocessor in my environment!
>
> #4  0xb7d2c4fa in file_vfprintf (con=0x8400bb0, format=0xb7ee0c48 "%s",
> ap=0xbfffc604 "\230¾L\b°?\005\b¬h\a\b¬h\a\b°?\005\b°?\005\b\001") at
> connections.c:579
> [this is  if(con->outconv) return dummy_vfprintf(con, format, ap);]
>
> This and everything above is only reached once, so this might be OK.
>
> #5  0xb7dfe069 in Rvprintf (format=0xb7ee0c48 "%s", arg=0xbfffc604
> "\230¾L\b°?\005\b¬h\a\b¬h\a\b°?\005\b°?\005\b\001") at printutils.c:785
> [this is   (con->vfprintf)(con, format, argcopy);]
>
> #6  0xb7dfe244 in Rprintf (format=0xb7ee0c48 "%s") at printutils.c:679
> [this is   Rvprintf(format, ap);]
>
> #7  0xb7d0446c in do_cat (call=0x83032a8, op=0x806b7d4, args=<value
> optimized out>, rho=0x830359c) at builtin.c:597
> [this is   Rprintf("%s", p);]
>
> Unfortunately, I'm not experienced in R/C code internals, but if you
> have detailed instructions for me (like "show me the value of this
> variable after 10000 stops") I can provide more debugging info.
>
>>> cat(testChunk, sep = "\n", file = output, append = TRUE)
>> We have writeLines() for that and it is more efficient, especially if
>> you keep a connection open.
> OK, maybe Prof. Leisch wants to improve the Sweave code...?
>
> Thank you very much for your help,
> best regards,
> Daniel Sabanes
>
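
A minimal sketch of the writeLines() suggestion quoted above, keeping one connection open across several writes instead of reopening the file in append mode each time. The chunks list and the temporary path are invented for illustration and are not taken from the Sweave code.

```r
# Hypothetical chunks of output lines.
chunks <- list(c("first chunk", "line 2"), "second chunk")

path <- tempfile(fileext = ".txt")
con <- file(path, open = "w")     # open the connection once
for (chunk in chunks) {
    writeLines(chunk, con)        # one line per element, newline-terminated
}
close(con)                        # close once, when all chunks are written

result <- readLines(path)
print(result)
```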

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

