[Rd] Line splitting in system() (PR#6624)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Feb 28 17:16:51 MET 2004


mjw at celos.net writes:

> According to the manual, system() splits output lines into
> 8096-char chunks; under UNIX, actually seems to return 8094
> chars, and drop the 8095th.  Spot missing digits in:
> 
>   x2 <- 
>     system("perl -e 'print \"0123456789\"x10000'",
>     intern=T)
> 
> Looks like a bug in the code to remove newlines at
> src/unix/sys-unix.c:218 -- fgets() reads size-1 characters
> and adds null, so strlen(buf)<size always true.  Testing for
> '\n' explicitly is probably better (deals with 8094 chr + \n
> case) -- it turns out the win32 code already does this
> anyway.  (IIRC the read>0 condition in the win32 code would
> be redundant but I copied it anyway to be safe.)
> 
> Anyway, rather trivial diff below.  Both manpages should
> probably say 8095 rather than 8096, I think.

Confirmed for R-devel too. Thanks for the fix, will apply in due
course. Notice that the same fix handles the case where the final line
is not \n-terminated:

> nchar(x2)
 [1] 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 8094 2859
> sum(nchar(x2))
[1] 99987
> length(nchar(x2))
[1] 13

I.e. we're losing a character in every block, including the last,
short, one: 13 blocks, and the total is 13 short of the 100000
characters that perl printed.
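
For the record, a minimal sketch of the two ways of stripping the
newline. The function names are made up for illustration, and the
first function is only my reading of the description quoted above,
not a copy of what is in sys-unix.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reconstruction of the flawed approach: decide by length
 * whether the chunk ended in a newline.  fgets(buf, size, fp) stores at
 * most size-1 characters plus the terminating NUL, so strlen(buf) < size
 * holds for every chunk and the last character is always dropped. */
size_t chomp_by_length(char *buf, size_t size)
{
    size_t len = strlen(buf);
    assert(len < size);          /* always true for fgets() output */
    if (len > 0 && len < size)   /* second test can never fail */
        buf[--len] = '\0';
    return len;
}

/* The approach suggested in the report: look at the last character and
 * remove it only if it really is a newline. */
size_t chomp_newline(char *buf)
{
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';
    return len;
}
```

With an 8-byte buffer holding the 7 characters "abcdefg" and no
newline (a full chunk), chomp_by_length() returns 6 and eats the 'g',
whereas chomp_newline() returns 7 and leaves the chunk alone. A short
unterminated final line behaves the same way.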

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907


