[R] regexpr: R takes very long with non-existent pattern

Leonard Mada |eo@m@d@ @end|ng |rom @yon|c@eu
Thu May 19 02:39:47 CEST 2022


Dear Bert,


The variable patt does not exist in the R environment.


I was pasting the code for an R function into the R console and had a 
syntax error on one line. The lines that followed were then executed as 
plain top-level R code, so the variable patt had not yet been defined.


However, x was not the vector of strings described in my original 
message, and the long execution time may originate there:

x = the original XML with the Pubmed abstracts
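
A minimal sketch of how the delay could be narrowed down, assuming (this 
is only a guess) that the time is spent converting x to character rather 
than looking up patt. The list below is a hypothetical stand-in for the 
real XML object, so the timings it produces are not the ones I observed:

```r
# Hypothetical stand-in for x: a non-character object (a list of strings).
# The real x was the original XML with the Pubmed abstracts.
x <- as.list(rep("Some abstract text.", 1200))

# patt is deliberately left undefined, as in the original call.
if (exists("patt", inherits = FALSE)) rm(patt)

# Time the failing call; try() keeps the script running after the error.
t_fail <- system.time(
  res <- try(regexpr(patt, x, perl = TRUE), silent = TRUE)
)

# Time the conversion of x to character on its own.
t_conv <- system.time(x_chr <- as.character(x))

# Time the same call with a valid pattern, for comparison.
t_ok <- system.time(npos <- regexpr("abstract", x_chr, perl = TRUE))

# Elapsed seconds for each step.
print(c(fail    = t_fail[["elapsed"]],
        convert = t_conv[["elapsed"]],
        valid   = t_ok[["elapsed"]]))
```

If the "convert" step dominates with the real x, the delay would come 
from x and not from the missing pattern.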


Sincerely,


Leonard



On 5/19/2022 3:31 AM, Bert Gunter wrote:
> Doubt that I can help, but what does "not defined" mean? -- NA, "",
> " "? Something else?
> I would guess that if it's NA, you should get an immediate error.
> If it's "" , that's a legitimate pattern and would result in matches 
> of 0 length for everything, which might trigger an error in other 
> parts of your code.
> All a guess, though.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along 
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, May 18, 2022 at 5:08 PM Leonard Mada via R-help 
> <r-help using r-project.org> wrote:
>
>     Dear R Users,
>
>
>     I have run the following command in R:
>
>     # x = larger vector of strings (1200 Pubmed abstracts);
>     # patt = not defined;
>     npos = regexpr(patt, x, perl=TRUE);
>     # Error in regexpr(patt, x, perl = TRUE) : object 'patt' not found
>
>
>     The problem:
>
>     R becomes unresponsive and it takes 1-2 minutes to return the
>     error. The operation completes almost instantaneously with a
>     valid pattern.
>
>     Is there a reason for this behavior?
>
>     Tested with R 4.2.0 on MS Windows 10.
>
>
>     I have uploaded a set with 1200 Pubmed abstracts on Github, if anyone
>     wants to check:
>
>     - see file: Example_Abstracts_Title_Pubmed.csv;
>
>     https://github.com/discoleo/R/tree/master/TextMining/Pubmed
>
>     The variable patt was not defined due to an error, but it took
>     very long to exit the operation and report the error.
>
>
>     Many thanks,
>
>
>     Leonard
>
>     ______________________________________________
>     R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>     and provide commented, minimal, self-contained, reproducible code.
>