[Rd] puzzled by cat() behaviour when argument '...' is a vector (and argument 'sep' contains "\n")

Steven McKinney smckinney at bccrc.ca
Thu Nov 6 01:45:58 CET 2008


> On 05/11/2008 5:47 PM, Peter Ruckdeschel wrote:
> > Hi r-devels,
> >
> > I am a bit puzzled by the behaviour of  cat() --- any help is
> > appreciated...

It appears to me that the elements of sep are just used as separators 
_between_ each of the objects comprising '...' handed to cat.  

If N objects are handed to cat, cat requires N-1 separator strings.
The default separator string is " " (space character).

Hence for
cat(rep("x",3), sep = ".")
two periods are needed to separate the three input objects:

> cat(rep("x",3), sep = ".")
x.x.x
>
as expected.

For cat(rep("x",3),sep = c(".","\n",".")), the first separator
is a period, the second is a newline, and the third is not needed.
> cat(rep("x",3),sep = c(".","\n","."))
x.x
x
>
as expected.  The inserted line feed is expected: it is the
second element of the sep vector, so it should appear between
the second and third objects, as it does.  The third element
of sep is not needed, so it is ignored.
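
To make the counting explicit, here is a minimal sketch of that rule for
a single vector handed to cat().  The helper name seps_used is made up
for illustration (it is not part of R), and I am assuming a non-empty
sep that is simply recycled or truncated to N-1 elements:

```r
# Hypothetical helper (not part of base R): which elements of 'sep'
# end up between the n elements of a single vector handed to cat()?
# Assumes 'sep' has at least one element.
seps_used <- function(n, sep = " ") {
  if (n < 2L) return(character(0))               # nothing to separate
  rep(as.character(sep), length.out = n - 1L)    # recycle or truncate
}

seps_used(3, ".")                 # two periods
seps_used(3, c(".", "\n", "."))   # "." and "\n"; the third element is unused
```

This only models the single-vector case shown in the examples here.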

Another example:

> cat(letters, sep = c(as.character(1:9), "\n"))
a1b2c3d4e5f6g7h8i9j
k1l2m3n4o5p6q7r8s9t
u1v2w3x4y5z
>
Again, as expected.

A slightly more complex example:

> paste("[", c(as.character(1:9), "\n"), "]", sep = "")
 [1] "[1]"  "[2]"  "[3]"  "[4]"  "[5]"  "[6]"  "[7]"  "[8]"  "[9]"  "[\n]"
> cat(letters, sep = paste("[", c(as.character(1:9), "\n"), "]", sep = ""))
a[1]b[2]c[3]d[4]e[5]f[6]g[7]h[8]i[9]j[
]k[1]l[2]m[3]n[4]o[5]p[6]q[7]r[8]s[9]t[
]u[1]v[2]w[3]x[4]y[5]z
> 

Again, as expected.

I haven't delved into the source to see where the final line feed
is being generated (as I see the next R prompt on a new line) so
I can't comment on whether anything is appended to the end of the
output string generated by cat().  The documentation says no line
feed is appended unless argument 'fill' is TRUE or numeric.
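
One way to check this without reading the C source is to send the
output to a file and look at exactly what was written.  The sketch
below does that; cat_raw is a made-up helper name, and the file is
opened in binary mode on the assumption that this avoids any line-ending
translation.  Whether the result ends in a newline may depend on the R
version (see the nlsep test in the code Peter quotes below), so I make
no claim about it here:

```r
# Hypothetical helper (not part of base R): return exactly what cat()
# writes, as one string, so a trailing newline becomes visible.
cat_raw <- function(..., sep = " ") {
  tmp <- tempfile()
  on.exit(unlink(tmp))             # remove the temporary file afterwards
  con <- file(tmp, open = "wb")    # binary mode: bytes are written as-is
  cat(..., sep = sep, file = con)
  close(con)
  readChar(tmp, nchars = file.info(tmp)$size, useBytes = TRUE)
}

out <- cat_raw(rep("x", 3), sep = c(".", "\n", "."))
print(out)            # print() shows the escapes, so "\n" is visible
grepl("\n$", out)     # TRUE if the output ends with a newline
```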

> >
> > At least AFAICS, cat() for vector-valued '...' argument behaves in
> > contradiction to what I understand from the note in the help to cat()
> > which reads
> >
> > "
> > Despite its name and earlier documentation, 'sep' is a vector of
> >      terminators rather than separators, being output after every
> >      vector element (including the last).  Entries are recycled as
> >      needed.
> > "
> 
> I think you're right that the documentation is incorrect.  I'd prefer a
> patch to the docs, rather than a change to the behaviour:  cat() is so
> fundamental that any changes to it would have wide ranging consequences.
> 
> If you want to study the code and draft a documentation patch, I'll
> review it and possibly commit it.

How about this:

sep    a character vector of strings to insert between each object.  If 
       there are too few elements in sep to separate all the objects, 
       the elements of sep are recycled.  Unused elements of sep are ignored.


then in Details:

Details

cat is useful for producing output in user-defined functions. It
converts its arguments to character vectors, concatenates them to a
single character vector, inserts the given sep= string(s) between each
element and then outputs them.
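
As a rough illustration of what that wording describes, here is a
sketch in R.  cat_model is a made-up name, not anything in base R, and
it only models the proposed text: it assumes atomic arguments and a
non-empty sep, and it ignores fill, labels and the number formatting
that the real cat() applies.  I have only compared it with the
single-vector examples in this message; how the C code indexes a
multi-element sep across several arguments is not something I have
checked.

```r
# Rough model of the proposed Details text (cat_model is hypothetical).
cat_model <- function(..., sep = " ") {
  # convert each argument to character and concatenate into one vector
  x <- unlist(lapply(list(...), as.character))
  n <- length(x)
  if (n == 0L) return("")
  # n - 1 separators, recycled as needed; nothing after the last element
  seps <- c(rep(as.character(sep), length.out = n - 1L), "")
  paste(x, seps, sep = "", collapse = "")
}

cat(cat_model(letters, sep = c(as.character(1:9), "\n")))
cat_model("a", 1:3, TRUE, sep = "-")
```

The first call reproduces the three lines of the letters example
above; the second shows several arguments being flattened into one
vector before the separators go in.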

> 
> Duncan Murdoch
> 
> > ----------------------------------------------------------------------------
> > reproducible example code:
> > ----------------------------------------------------------------------------
> >
> >> cat(rep("x",3), sep = ".")
> > x.x.x
> > ## no "." appended!
> >
> > Things get even worse if "\n" features in the 'sep' vector:
> >
> >> cat(rep("x",3),sep = c(".","\n","."))
> > x.x
> > x
> > ## last separator "."  gets swallowed; a non-intended line feed is
> > inserted
> >
> > ----------------------------------------------------------------------------
> > code causing this behaviour
> > ----------------------------------------------------------------------------
> > ##### "\n"
> >
> > I have looked a bit into the source code
> >         (lines 468-630 in builtin.c in src/main)
> > and found that, as variable nlsep is set to 1 in line 504, i.e.,
> >
> >    if (strstr(CHAR(STRING_ELT(sepr, i)), "\n")) nlsep = 1; /* ASCII */
> >
> > the code in lines 622-23, i.e.;
> >    
> >   if ((pwidth != INT_MAX) || nlsep)
> >        Rprintf("\n");
> >
> > is responsible for the newline. Is this really intended?
> >
> > ##### separators, not terminators
> >
> > Another look shows that, contrary to what is said in the help file,
> > an element of vector 'sep' is /not/ printed out after each element
> > of the vector passed as argument '...' to cat(), "including the last"
> > --- confer the for-loop over the elements of '...' in lines 596-617
> > and the print-out of the separator
> >
> >   cat_printsep(sepr, ntot);
> >
> > in line 600. Once again: Is this intended?
> >
> > A patch fixing my problem would be easy, though might crash
> > other much more important code; would you have any
> > proposals?
> >
> > Best,
> > Peter
> >
> > -------------------------------------------------------------------
> > Version:
> >  platform = i386-pc-mingw32
> >  arch = i386
> >  os = mingw32
> >  system = i386, mingw32
> >  status = Under development (unstable)
> >  major = 2
> >  minor = 9.0
> >  year = 2008
> >  month = 10
> >  day = 01
> >  svn rev = 46589
> >  language = R
> >  version.string = R version 2.9.0 Under development (unstable)
> > (2008-10-01 r46589)
> >
> > Windows XP (build 2600) Service Pack 3
> >
> > Locale:
> > LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
> >
> > Search Path:
> >  .GlobalEnv, package:stats, package:graphics, package:grDevices,
> > package:utils, package:datasets, package:methods, Autoloads, package:base
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada