[Rd] (PR#8777) strsplit does [not] return correct value when splitting ""

Peter Ehlers ehlers at math.ucalgary.ca
Mon Apr 17 23:02:43 CEST 2006


Charles,

Can't you achieve your goal by unlist()ing 'substrings'?

  max(nchar(unlist(substrings)))
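
A minimal sketch of that suggestion, reusing the 'x' from the quoted
message below. The name 'pieces' is only illustrative. unlist() flattens
the list, so the character(0) element contributes nothing and the overall
maximum is well defined; the per-element widths are not kept.

```r
x <- c("hello", "bob is\ngreat", "foo", "", "bar")
substrings <- strsplit(x, "\n")

# Flatten the list of splits; the empty character(0) element vanishes here.
pieces <- unlist(substrings)

# Widest line across all elements of x.
overall <- max(nchar(pieces, type = "width"))
```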

Peter Ehlers


Charles Dupont wrote:
> Now using R 2.3.0.
> 
> I have a string that can be "".  I want to find the max screen width of 
> all the lines in the string, so I run the commands
> 
>   > x <- c("hello", "bob is\ngreat", "foo", "", "bar")
>   > substrings <- strsplit(x, "\n")
>   > sapply(substrings, FUN=function(x) max(nchar(x, type="width")))
> which returns
> [1]    5    6    3 -Inf    3
> 
> This happens because of the behavior of strsplit.  For a string that is not ""
>   > strsplit("Hello\nBob", "\n")
> 
> it returns
> [[1]]
> [1] "Hello" "Bob"
> 
> 
> for a string that is ""
>   > strsplit("", "\n")
> 
> it returns
> [[1]]
> character(0)
> 
> 
> I would expect
> [[1]]
> [1] ""
> 
> because "" is a character vector of length 1 containing a string of length 
> 0, not a character vector of length 0.
> 
> For any other string if the split string is not matched in argument x 
> then it returns the original string x.
> 
> The man page states in the value section that strsplit returns:
>       A list of length 'length(x)' the 'i'-th element of which contains
>       the vector of splits of 'x[i]'.
> 
> It mentions no change in behavior if the value of x[i] = "".
> 
> Prof Brian Ripley wrote:
> 
>>Please use a current version of R: we are at 2.3.0RC (and we do ask you 
>>not to report on obsolete versions).
>>
>>What rule are you using, and where did you find it in the R documentation?
>>
>>In fact
>>
>>
>>>strsplit("", " ")
>>
>>[[1]]
>>character(0)
>>
>>which is not as you stated.   This is a feature, as it is distinct from
>>
>>
>>>strsplit(" ", " ")
>>
>>[[1]]
>>[1] ""
>>
>>Consider also
>>
>>
>>>strsplit("", "")
>>
>>[[1]]
>>character(0)
>>
>>
>>>strsplit("a", "")
>>
>>[[1]]
>>[1] "a"
>>
>>
>>>strsplit("ab", "")
>>
>>[[1]]
>>[1] "a" "b"
>>
>>
>>On Mon, 17 Apr 2006, charles.dupont at vanderbilt.edu wrote:
>>
>>
>>>Full_Name: Charles Dupont
>>>Version: 2.2.0
>>>OS: linux
>>>Submission from: (NULL) (160.129.129.136)
>>>
>>>
>>>when
>>>
>>>strsplit("", " ")
>>>
>>>returns character(0)
>>>
>>>whereas
>>>
>>>strsplit("a", " ")
>>>
>>>returns "a".
>>>
>>>these return values are not consistent with each other.
>>>
>>>Charles Dupont
>>>
>>>______________________________________________
>>>R-devel at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>
> 
>
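
If the per-element widths are needed, one possible workaround is to handle
the empty split explicitly. This is only a sketch: 'max_width' is a
hypothetical helper, not part of base R, and treating "" as width 0 is an
assumption about the desired result.

```r
x <- c("hello", "bob is\ngreat", "foo", "", "bar")
substrings <- strsplit(x, "\n")

# Hypothetical helper: width of the widest line in one split result,
# returning 0 for character(0) instead of letting max() return -Inf.
max_width <- function(s) {
  if (length(s) == 0L) return(0L)
  max(nchar(s, type = "width"))
}

# One width per element of x.
widths <- vapply(substrings, max_width, integer(1))
```

With this, the element that was "" yields 0 rather than -Inf, and the
other elements are unchanged.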


