[Rd] strsplit() and final empty values

Henrik Bengtsson
Thu Mar 3 05:13:00 CET 2022


Here's an example clarifying the issue:

> strsplit("a:b:c:d", split = ":", fixed = TRUE)
[[1]]
[1] "a" "b" "c" "d"

> strsplit("a:b:c:", split = ":", fixed = TRUE)
[[1]]
[1] "a" "b" "c"
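
The same effect shows up in the field counts, which is where it tends to bite when parsing delimited records. A minimal sketch using only base R (the variable names `records` and `n_fields` are just for illustration):

```r
# Two records that both contain three ':' separators, i.e. four fields each
records <- c("a:b:c:d", "a:b:c:")

# Count the elements strsplit() returns for each record
n_fields <- lengths(strsplit(records, split = ":", fixed = TRUE))

# The second record comes back one field short, because its trailing
# empty field is dropped
print(n_fields)
```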

I have also run into this a few times, and I agree that it complicates
things when you need to preserve that last empty element.  Instead of
changing the default behavior, which would probably break lots of
existing code relying on it, one could introduce a new,
backward-compatible argument, say `drop = TRUE` (today's behavior
stays the default), e.g.

> strsplit("a:b:c:", split = ":", fixed = TRUE, drop = FALSE)
[[1]]
[1] "a" "b" "c" ""
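
Until something like that exists, the effect can be approximated in user code. Below is a rough sketch, not a proposed implementation: `strsplit_keep()` is a hypothetical helper name, and it assumes a single, non-empty, fixed (non-regex) separator. It appends one extra separator to each string, so that the element strsplit() drops is the artificial one rather than the real trailing field.

```r
# Hypothetical helper: like strsplit(x, split, fixed = TRUE), but keeps a
# trailing empty field. Assumes 'split' is one non-empty fixed string.
strsplit_keep <- function(x, split) {
  stopifnot(is.character(x), is.character(split),
            length(split) == 1L, !is.na(split), nzchar(split))
  padded <- x
  ok <- !is.na(x)
  # Append one extra separator to every non-missing string; strsplit()
  # then discards only the empty piece after that extra separator.
  padded[ok] <- paste0(x[ok], split)
  strsplit(padded, split = split, fixed = TRUE)
}

res_trailing <- strsplit_keep("a:b:c:", ":")
res_plain    <- strsplit_keep("a:b:c:d", ":")
res_double   <- strsplit_keep("value::", ":")
```

With this padding, "a:b:c:" yields four elements (the last one ""), "a:b:c:d" is unchanged, and "value::" yields three. Note that an empty input string gives "" here rather than character(0), which may or may not be what you want.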

My $.02

/Henrik



On Sat, Feb 26, 2022 at 6:39 AM Dzmitry Batrakou <d.batrakou using gmail.com> wrote:
>
> Hello,
>
> I would like to suggest changing the behaviour of the strsplit() function
> with multiple trailing empty values. Currently, `strsplit(x = 'value::',
> split = ':')` produces a list holding a character vector of length 2
> ('value', ''). This behaviour is documented in the manual (penultimate
> example) but, I would argue, is illogical and can lead to unexpected
> parsing results. One example is splitting delimited value strings into
> a table.
>
> Regards,
> Dzmitry