[R] Counting enumerated items in each element of a character vector

Ista Zahn istazahn at gmail.com
Wed Apr 26 05:47:19 CEST 2017


stringr::str_count (and stringi::stri_count that it wraps) interpret
the pattern argument as a regular expression by default.
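
A minimal sketch of what that means in practice, assuming stringr is installed; the string "a.b.c" is a made-up input, not part of the original question. With the default regex interpretation a bare "." matches any character, so a literal period needs either fixed() or an escape:

```r
library(stringr)

x <- "a.b.c"  # hypothetical input, chosen only to show the difference

# Default: the pattern is a regular expression, so "." matches any character.
n_regex <- str_count(x, ".")

# fixed() requests a literal match, so only the two periods are counted.
n_fixed <- str_count(x, fixed("."))

# Escaping the period inside the regex gives the same literal count.
n_escaped <- str_count(x, "\\.")

c(regex = n_regex, fixed = n_fixed, escaped = n_escaped)
```

This matters for the enumerated-item problem because both delimiters, ")" and ".", are regex metacharacters outside a character class.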

Best,
Ista

On Tue, Apr 25, 2017 at 11:40 PM, Michael Hannon
<jmhannon.ucdavis at gmail.com> wrote:
> I like Boris's "Hadley" solution.  For the record, I've appended a
> version that uses regular expressions, the only benefit of which is
> that it could be generalized to find more-complicated patterns.
>
> -- Mike
>
> counts <- sapply(text1, function(next_string) {
>     # gregexpr() returns -1 when nothing matches, so count only real positions
>     matches <- gregexpr("Example", next_string)[[1]]
>     sum(matches > 0)
> }, USE.NAMES=FALSE)
>
>> counts
> [1] 5 5 5 5
>>
>
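
As a rough illustration of how the regex approach might be generalized to the enumeration markers themselves, here is a base-R sketch. The vector `items` and the pattern `marker` are assumptions made up for this example; they are not the poster's text1, and the pattern has not been checked against it.

```r
# Hypothetical sample data, not the poster's text1.
items <- c("1) Example 1\n2) Example 2\n10) Example 10",
           "1. Example 1. 2. Example 2.",
           "No list here.")

# One or more digits, then ")" or ".", then whitespace and the word "Example".
marker <- "[0-9]+[.)][[:space:]]+Example"

# regmatches() returns character(0) for elements with no match,
# so lengths() reports 0 for them rather than 1.
counts <- lengths(regmatches(items, gregexpr(marker, items)))
counts
```

Requiring the word "Example" after the delimiter is one possible way to tell an enumerated item from a sentence that merely ends in a number; real data without such a fixed keyword would need a different rule.
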
> On Tue, Apr 25, 2017 at 5:33 PM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
>> I should add: there's a str_count() function in the stringr package.
>>
>> library(stringr)
>> str_count(text1, "Example")
>> # [1] 5 5 5 5
>>
>> I guess that would be the neater solution.
>>
>> B.
>>
>>
>>
>>> On Apr 25, 2017, at 8:23 PM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
>>>
>>> How about:
>>>
>>> unlist(lapply(strsplit(text1, "Example"), function(x) { length(x) - 1 } ))
>>>
>>>
>>> Splitting each string on its five occurrences of "Example" gives six elements, so length(x) - 1 is the
>>> number of matches. You can use any regex instead of "Example" if you need to tweak what you are looking for.
>>>
>>>
>>> B.
>>>
>>>
>>>
>>>
>>>> On Apr 25, 2017, at 8:14 PM, Dan Abner <dan.abner99 at gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I am looking for a streamlined way of counting the number of enumerated
>>>> items in each element of a character vector. For example:
>>>>
>>>>
>>>> text1<-c("This is an example.
>>>> List 1
>>>> 1) Example 1
>>>> 2) Example 2
>>>> 10) Example 10
>>>> List 2
>>>> 1) Example 1
>>>> 2) Example 2
>>>> These have been examples.","This is another example.
>>>> List 1
>>>> 1. Example 1
>>>> 2. Example 2
>>>> 10. Example 10
>>>> List 2
>>>> 1. Example 1
>>>> 2. Example 2
>>>> These have been examples.","This is a third example. List 1 1) Example 1.
>>>> 2) Example 2. 10) Example 10. List 2 1) Example 1. 2) Example 2. These have
>>>> been examples."
>>>> ,"This is a fourth example. List 1 1. Example 1. 2. Example 2. 10. Example
>>>> 10. List 2 Example 1. 2. Example 2. These have been examples.")
>>>>
>>>> text1
>>>>
>>>> ===
>>>>
>>>> I would like the result to be c(5,5,5,5). Notice that sometimes there are
>>>> leading hard returns, other times not. Sometimes there are separate lists
>>>> and the same numbers are used in the enumerated items multiple times within
>>>> each character string. Sometimes the leading numbers for the enumerated
>>>> items exceed single digits. Notice that the delimiter may be ) or a period
>>>> (.). If the delimiter is a period and there are hard returns (example 2),
>>>> then I expect that will be easy enough to differentiate sentences ending
>>>> with a number from enumerated items. However, I imagine it would be much
>>>> more difficult to differentiate the two for example 4.
>>>>
>>>> Any suggestions are appreciated.
>>>>
>>>> Best,
>>>>
>>>> Dan
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.