[R] sub/grep question: extract year

john matthew poisson200 at googlemail.com
Thu Aug 9 11:40:12 CEST 2018


So there is probably a command that resets the capture variables as I call
them. No doubt someone will write what it is.
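
For what it is worth, a minimal sketch that checks the behaviour in question 1 directly. It assumes the documented behaviour of sub(), which returns elements it cannot substitute unchanged, so the second call gives back the input "-1980-" rather than a value left over from an earlier call. The helper name extract_or_empty is hypothetical and only shows one way to get an empty string when there is no match.

```r
subtext <- "-1980-"

# The pattern matches the whole string, so the string is replaced by group 1.
matched <- sub(".*(1980).*", "\\1", subtext)

# The pattern does not match, so no substitution happens and the input
# comes back unchanged.
unmatched <- sub(".*(1981).*", "\\1", subtext)

# Hypothetical helper: test for a match first, and return "" when there is none.
extract_or_empty <- function(pattern, x) {
  ifelse(grepl(pattern, x), sub(pattern, "\\1", x), "")
}

print(matched)
print(unmatched)
print(extract_or_empty(".*(1981).*", subtext))
```

Because each call to sub() is independent, running the second expression in a fresh R session should give the same result.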

On 9 Aug 2018 10:36, "john matthew" <poisson200 using googlemail.com> wrote:

> Hi Marc.
> For question 1.
> I know that in Perl, captured groups stay saved until they are
> overwritten. \\1 is the capture variable in your R examples.
>
> So the 2nd regular expression does not match but \\1 still has 1980
> captured from the previous expression, hence the result.
>
> Maybe if you restart R and try your 2nd expression first, \\1 will be
> empty or no match result.
>
> Just speculation :)
>
> John
>
>
> On 9 Aug 2018 08:58, "Marc Girondot via R-help" <r-help using r-project.org>
> wrote:
>
>> Hi everybody,
>>
>> I have some questions about the way that sub is working. I hope that
>> someone has the answer:
>>
>> 1/ Why does the second example not return an empty string? There is no
>> match.
>>
>> subtext <- "-1980-"
>> sub(".*(1980).*", "\\1", subtext) # return 1980
>> sub(".*(1981).*", "\\1", subtext) # return -1980-
>>
>> 2/ Based on the sub documentation, it replaces the first occurrence of a
>> pattern: why does it not return 1980?
>>
>> subtext <- " 1980 1981 "
>> sub(".*(198[01]).*", "\\1", subtext) # return 1981
>>
>> 3/ I want to extract the year from text; I use:
>>
>> subtext <- "bla 1980 bla"
>> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 1980
>> subtext <- "bla 2010 bla"
>> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 2010
>>
>> but
>>
>> subtext <- "bla 1010 bla"
>> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 1010
>>
>> I would like to exclude 1010 and other cases like it.
>>
>> The solution would be:
>>
>> 18[0-9][0-9] or 19[0-9][0-9] or 200[0-9] or 201[0-9]
>>
>> Is there a way to write such a pattern in grep?
>>
>> Thanks a lot
>>
>> Marc
>>
>
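
On question 2, a small sketch of why the last year comes back. It assumes the usual greedy behaviour of ".*": the leading ".*" takes as much of the string as it can, so the group ends up on the last candidate. "First occurrence" refers to the first match of the whole pattern, and here that pattern spans the entire string. A lazy quantifier ".*?" (shown with perl = TRUE) stops at the first candidate instead.

```r
subtext <- " 1980 1981 "

# Greedy: the leading ".*" consumes as much as possible, so the capture
# group lands on the last year in the string.
greedy <- sub(".*(198[01]).*", "\\1", subtext)

# Lazy: ".*?" consumes as little as possible, so the capture group lands
# on the first year in the string.
lazy <- sub(".*?(198[01]).*", "\\1", subtext, perl = TRUE)

print(greedy)
print(lazy)
```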

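On question 3, the "or" in the proposed solution can be written with the alternation operator "|" inside the capture group. The sketch below keeps the delimiter classes from the original pattern and assumes the wanted range is 1800 to 2019. The names year_alt, year_pattern and extract_year are hypothetical, and returning NA when there is no match is a design choice, not something the original post asks for.

```r
# Alternation: 18xx, 19xx, 200x or 201x.
year_alt <- "(18[0-9][0-9]|19[0-9][0-9]|20[01][0-9])"

# Same leading and trailing delimiter classes as in the original pattern.
year_pattern <- paste0(".*[ \\.\\(-]", year_alt, "[ \\.\\)-].*")

# Hypothetical helper: return the year when the pattern matches, NA otherwise.
# The grepl() test is needed because sub() returns its input unchanged when
# there is no match.
extract_year <- function(x) {
  ifelse(grepl(year_pattern, x), sub(year_pattern, "\\1", x), NA_character_)
}

years <- extract_year(c("bla 1980 bla", "bla 2010 bla", "bla 1010 bla"))
print(years)
```

The same alternation works in grep() or grepl() when only a yes/no test for a valid year is needed.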