[Rd] puzzled by cat() behaviour when argument '...' is a vector (and argument 'sep' contains "\n")

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Thu Nov 6 10:44:41 CET 2008


William Dunlap wrote:
>
> For what it is worth, S+ and R differ in how a vector sep argument
> is treated.  R uses the vector sep only if cat() is given a vector of
> strings for the ... argument but S+ treats cat(c(...),sep=sep)
> the same as cat(..., sep=sep).  E.g., the following cat(...,sep=sep)
> uses all of sep in S+ but only sep[1] in R:
>    > cat("One.", "Two words.", "Three more words.",
> sep=c("<end1><start2>", "<end2>\n<start3>", "<end3>"))
>    S+: One.<end1><start2>Two words.<end2>
>    S+: <start3>Three more words.
>    R : One.<end1><start2>Two words.<end1><start2>Three more words.
> but the following cat(c(...),sep=sep) uses all of sep in both
>    > cat(c("One.", "Two words.", "Three more words."),
> sep=c("<end1><start2>", "<end2>\n<start3>", "<end3>"))
>    S+: One.<end1><start2>Two words.<end2>
>    S+: <start3>Three more words.
>    R : One.<end1><start2>Two words.<end2>
>    R : <start3>Three more words.
> If there are several c(...) entries in cat() it gets more complicated.
> E.g.,
>    > cat(c("One.", "Two words.", "Three more words."),
>          c("Fourth entry.", "No. 5."),
>          sep=c("<end1><start2>", "<end2>\n<start3>", "<end3>"))
>    S+: One.<end1><start2>Two words.<end2>
>    S+: <start3>Three more words.<end3>Fourth entry.<end1><start2>No. 5.
>    R : One.<end1><start2>Two words.<end2>
>    R : <start3>Three more words.<end1><start2>Fourth entry.<end1><start2>No. 5.
>
>   

cat uses consecutive elements of sep circularly (see below) to separate
elements *within* each item in ..., and uses the first element of sep as
a separator *between* items in ...

cat(1:5, sep=letters[1:3])
# prints 1a2b3c4a5

cat(1:5, 6, sep=letters[1:3])
# prints 1a2b3c4a5a6

cat(1:5, 6, 7, sep=letters[1:3])
# prints 1a2b3c4a5a6a7

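confusingly, the circulation over sep is *not* restarted for each item
in ..., and it is *not* stopped while separating between items in ...
(each between-item separator still consumes a position in sep, even
though the separator actually printed there is always the first element)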

cat(1:5, 6:10, sep=letters[1:3])
# prints 1a2b3c4a5a6c7a8b9c10,
# not 1a2b3c4a5a6a7b8c9a10 (which would be the case if circulation
# were restarted for each item), and
# not 1a2b3c4a5a6b7c8a9b10 (which would be the case if circulation
# were not restarted, but were stopped for between-item separation)

cat(1:5, 6, 7:10, sep=letters[1:3])
# prints 1a2b3c4a5a6a7a8b9c10 (which is only *incidentally* the
# same as it would be if circulation were restarted for each item), and
# not 1a2b3c4a5a6a7b8c9a10 (which would be the case if circulation
# were not restarted, but were stopped for between-item separation)
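
the rule that these examples suggest can be written down as a small model.
this is only a sketch inferred from the output observed above (R 2.7.0), not
R's actual implementation: cat_model is a hypothetical helper, it returns the
string instead of printing it, and it ignores fill, labels, NULL items and
the number formatting that cat applies.

```r
# hypothetical helper (not part of R): models the separator choice
# described above and returns the string that cat() would print
cat_model <- function(..., sep = " ") {
    items <- list(...)
    sep <- as.character(sep)
    out <- character(0)
    k <- 0  # number of elements emitted so far, across all items
    for (j in seq_along(items)) {
        item <- as.character(items[[j]])
        n <- length(item)
        # between items: always the first element of sep
        if (j > 1) out <- c(out, sep[1])
        for (i in seq_len(n)) {
            out <- c(out, item[i])
            # within an item: sep is indexed by the global element count,
            # so the cycle is neither restarted nor paused between items
            if (i < n) out <- c(out, sep[k %% length(sep) + 1])
            k <- k + 1
        }
    }
    paste(out, collapse = "")
}

cat_model(1:5, 6:10, sep = letters[1:3])
# "1a2b3c4a5a6c7a8b9c10"
cat_model(1:5, 6, 7:10, sep = letters[1:3])
# "1a2b3c4a5a6a7a8b9c10"
```

indexing sep by the running element count (k) rather than by a count of
within-item separators is what makes the cycle jump from a to c after the
first item in cat(1:5, 6:10, ...).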


whether this is or is not reasonable, the man page should be explicit
about this behaviour, and the example section should clearly illustrate
it.  the examples above are a candidate.

i am not sure that there is any good scenario where this behaviour
would be desirable and useful, and perhaps redesigning cat should be an
option to consider.

vQ


ps.  still using R 2.7.0, but i guess cat has not changed since.


