[R] Regexp: extract first occurrence of date in string

johannes rara johannesraja at gmail.com
Sat Jan 2 16:57:19 CET 2010


Thanks for the hint, i.e. something like this works in this case:

> txt <- "first date is 05.12.2009. Second date is 06.12.2009."
> txt
[1] "first date is 05.12.2009. Second date is 06.12.2009."
> l <- regexpr("\\d{1,2}\\.\\d{1,2}\\.\\d{4}", txt, perl=T)
> substr(txt, l, l+9)
[1] "05.12.2009"
>

But your examples are more generic. I'll have to look at gsubfn more closely.
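
The substr(txt, l, l+9) call above relies on the match being exactly 10 characters wide, which holds for this string but not for a date such as 5.1.2009. Here is a minimal base R sketch that reads the width from the "match.length" attribute returned by regexpr instead. The variable names (pat, m, first_date and so on) are only illustrative, and regmatches is assumed to be available, which requires R 2.14.0 or later. The last line is one possible way to repair the original sub attempt, by letting the pattern consume the rest of the string as well.

```r
txt <- "first date is 05.12.2009. Second date is 06.12.2009."
pat <- "\\d{1,2}\\.\\d{1,2}\\.\\d{4}"

# regexpr() returns the start of the first match; the width of that match
# is stored in the "match.length" attribute, so nothing is hard-coded.
m <- regexpr(pat, txt, perl = TRUE)
first_date <- substr(txt, m, m + attr(m, "match.length") - 1)

# regmatches() (assumed available, R >= 2.14.0) does the same
# offset/length bookkeeping internally.
first_date2 <- regmatches(txt, m)

# sub() also works once the pattern swallows everything after the first
# date (the trailing ".*$"), so only the captured group is left.
first_date3 <- sub("^.*?(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*$", "\\1", txt, perl = TRUE)
```

Note that regexpr returns -1 when there is no match, so the result should be checked before calling substr on real data.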

-J

2010/1/2 Gabor Grothendieck <ggrothendieck at gmail.com>:
> Use regexpr to get the offset into the string and its length and then
> use substr to extract it.
>
> On Sat, Jan 2, 2010 at 10:43 AM, johannes rara <johannesraja at gmail.com> wrote:
>> Thanks, is the same possible using basic gsub/sub/grep etc. functions?
>>
>> -J
>>
>> 2010/1/2 Gabor Grothendieck <ggrothendieck at gmail.com>:
>>> Try this which uses a slightly simpler regexp:
>>>
>>>> library(gsubfn)
>>>> strapply(txt, "(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*")[[1]]
>>> [1] "05.12.2009"
>>>
>>> or we could convert it to Date class at the same time where we have
>>> assumed month.day.year:
>>>
>>>> strapply(txt, "(\\d{1,2}\\.\\d{1,2}\\.\\d{4}).*", ~ as.Date(x, "%m.%d.%Y"))[[1]]
>>> [1] "2009-05-12"
>>>
>>> or this even simpler regexp extracting all the dates and then picking
>>> off the first:
>>>
>>>> strapply(txt, "\\d{1,2}\\.\\d{1,2}\\.\\d{4}")[[1]][1]
>>> [1] "05.12.2009"
>>>
>>> On Sat, Jan 2, 2010 at 10:08 AM, johannes rara <johannesraja at gmail.com> wrote:
>>>> I would like to extract the first date from a string:
>>>>
>>>>> txt <- "first date is 05.12.2009. Second date is 06.12.2009."
>>>>> txt
>>>> [1] "first date is 05.12.2009. Second date is 06.12.2009."
>>>>
>>>> I tried:
>>>>
>>>>> sub("^.*?\\s(\\d{1,2}\\.\\d{1,2}\\.\\d{4})", "\\1", txt, extended=T, perl=T)
>>>> [1] "05.12.2009. Second date is 06.12.2009."
>>>>>
>>>>
>>>> How to modify this?
>>>>
>>>> -J
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>
