[R] Sweave & xtable [problem solved/workaround -> bug in xtable or textConnection?]

Hedderik van Rijn hedderik at cmu.edu
Sun Dec 22 21:14:03 CET 2002


> Connections do correctly give you a warning if the internal line limit
> is exceeded.  This is documented in the source code, which is there for
> you to read.

I know; I discovered that a warning is given if the string is sink'ed
directly rather than going through xtable (as illustrated by the
examples in my previous email). After some more exploration, the
problem seems to be caused by cat'ing instead of print'ing a long
string to a textConnection using sink.

This code triggers a warning (the behavior is of course the same if the
paste(...) is explicitly wrapped in a print(...) call):

con <- textConnection("output","w");
sink(file=con);
paste(rep("123456789!",1000),collapse="")
## Warning message:
## line truncated in output text connection
sink()
close(con)

Whereas this code snippet "silently" truncates the string, without 
warning:

con <- textConnection("output","w");
sink(file=con);
cat(paste(rep("123456789!",1000),collapse=""))
sink()
close(con)

I'm not sure which function (if any) to blame, but I definitely think 
that either cat or textConnection should have made sure that a warning 
"came through". As you mentioned, it is naive to assume an arbitrary 
line-length, but if the above code is not incorrect, my opinion is that 
it should warn users of incorrect output, or state it in the help pages.
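
One way to detect such a silent truncation is to compare the number of
characters that arrive in the target vector with the number that were
sent. The following is only a sketch: the names expected, captured, got
and truncated are made up for this illustration, and whether any
truncation occurs at all depends on the R version in use.

```r
# Send a long string through cat() into an output text connection and
# check afterwards whether all of it arrived.
expected <- paste(rep("123456789!", 1000), collapse = "")

con <- textConnection("captured", "w")
sink(file = con)
cat(expected)
sink()
close(con)  # closing writes the incomplete final line into 'captured'

# Compare what arrived with what was sent.
got <- sum(nchar(captured))
truncated <- got < nchar(expected)
if (truncated) {
  message("output was truncated: got ", got, " of ",
          nchar(expected), " characters")
}
```

A check like this has to be done by the caller, which is exactly why a
warning from cat or textConnection would be preferable.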

> Looks to me as if text connections are being used where anonymous file
> connections would be much more appropriate.

Indeed, changing Sweave's RweaveLatexRuncode (line 1596 of tools, R 
1.6.1) to use file/readLines instead of textConnection:

         ## HvR replaced: tmpcon <- textConnection("output", "w")
         tmpcon <- file()
         sink(file=tmpcon)
         err <- NULL
         if(options$eval) err <- RweaveEvalWithOpt(ce, options)
         ## HvR added (make sure the final line is complete, with a final
         ## EOL marker):
         cat("\n")
         sink()
         ## HvR added:
         output <- readLines(tmpcon)
         close(tmpcon)

solves the truncation problem for the Sweave/xtable/cat combination.
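
The same idea can be tried outside Sweave. The following is a minimal
sketch, not the Sweave code itself: capture_via_file is a hypothetical
helper name, and it assumes that readLines on an anonymous file()
connection returns what was written to it earlier, as in the change
above.

```r
# Hypothetical helper: evaluate an expression with its output diverted
# to an anonymous file connection, then read the output back as lines.
capture_via_file <- function(expr) {
  tmpcon <- file()               # anonymous file connection
  on.exit(close(tmpcon))         # always release the connection
  sink(file = tmpcon)
  tryCatch(
    {
      force(expr)                # evaluate while output is diverted
      cat("\n")                  # make sure the final line is complete
    },
    finally = sink()             # always restore normal output
  )
  readLines(tmpcon)
}

long <- paste(rep("123456789!", 1000), collapse = "")
out <- capture_via_file(cat(long))
nchar(out[1])                    # compare with nchar(long)
```

Restoring the sink in a finally clause is a design choice of this
sketch, so that an error in the evaluated code does not leave output
diverted.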

> [...]
>
>> If the truncation of long strings is official/known behavior of
>> textConnection, the following text in textConnection's help page might
>> need some revision, i.e., some more explicit statement that long
>> strings might get truncated. (And, maybe also a definition of what a
>> "completed line of output" is, i.e., ending in a "\n".)
>
> Whatever else could it mean?  I doubt if the end user would know what
> \n means if (s)he is so naive as not to know what an incomplete line is!

While trying to figure out how to use anonymous file connections, I
came across the following passage in the readLines help page:

      If the final line is incomplete (no final EOL marker) the
      behaviour depends on whether the connection is blocking or not.

I guess adding "(no final EOL marker)" to the textConnection help page
would, at least for me, be a useful extension.
After having spent a couple of hours with textConnections and other
redirections, and knowing more about how they work, I certainly see
your point. However, it might have saved me some initial confusion if
this had been there in the first place.
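
For readers who hit the same confusion, the difference between a
completed and an incomplete line can be illustrated with a small
sketch. The names lines_seen, before_close and after_close are made up
for this illustration, and the comments describe the behavior as I
understand it from the textConnection help page.

```r
# An output text connection updates its character vector after each
# completed line, i.e. one ending in "\n".
con <- textConnection("lines_seen", "w")
cat("first line\n", file = con)     # completed line
cat("no newline yet", file = con)   # incomplete: no final EOL marker

before_close <- lines_seen          # should hold only the completed line
close(con)                          # closing writes out the final line too
after_close <- lines_seen

print(before_close)
print(after_close)
```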

At the same time, it might be valuable to add an explicit reference to
file() (in addition to the "See also: connections"), stating something
along the lines of the combination file/readLines being preferred over
textConnection if the purpose is to process large chunks of output. (If
I gathered correctly from your remarks that anonymous file connections
are more appropriate in these situations.)

Thanks again for the valuable comments; I learned a lot.

   - Hedderik.



