[R] Split String in regex while Keeping Delimiter

David Winsemius dw|n@em|u@ @end|ng |rom comc@@t@net
Thu Apr 13 00:03:29 CEST 2023


I thought that replacing the whitespace following each instance of +++, ++, +, or - with "\n" and then reading the result with scan() should succeed. Like Ivan Krylov, I was fairly sure that you meant the minus sign to be "-" rather than "–", but perhaps you were using MS Word as an editor, which is incompatible with effective use of R. If so, learn to use a proper programming editor, and in any case learn to post to R-help in plain text.

-- 
David

# dat is the character vector of strings to split; the {1} quantifier was redundant
scan(text = gsub("([-+])\\s", "\\1\n", dat), what = "", sep = "\n")
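
Note that scan() pools all the strings into one character vector. If a separate result per input string is wanted, as in the original question, the same idea can be applied to each element in turn. What follows is only a sketch under stated assumptions: the example data have been retyped with an ASCII "-" and plain quotes, and the names dat, split_keep, res and alt are made up for illustration.

```r
# Example strings from the question, retyped in plain ASCII
# (assumption: the en dash in the post was meant to be a hyphen-minus).
dat <- c("leucocyten + gramnegatieve staven +++ grampositieve staven ++",
         "leucocyten - grampositieve coccen +")

# Hypothetical helper: put a newline after each run of "+" or "-" that is
# followed by whitespace, then let scan() split one string on those newlines.
split_keep <- function(s) {
  scan(text = gsub("([-+])\\s", "\\1\n", s), what = "", sep = "\n", quiet = TRUE)
}

# One character vector per input string
res <- lapply(dat, split_keep)

# Alternative sketch: split on whitespace that is preceded by "+" or "-".
# The lookbehind consumes nothing, so the delimiter stays with the left piece.
alt <- strsplit(dat, "(?<=[-+])\\s+", perl = TRUE)

print(res)
```

Both versions split only at whitespace that directly follows a "+" or "-", so a trailing "++" at the end of a string is left attached to the last piece.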



> On Apr 12, 2023, at 2:29 AM, Emily Bakker <emilybakker using outlook.com> wrote:
> 
> Hello List,
>  
> I have a dataset consisting of strings that I want to split while saving the delimiter.
>  
> Some example data:
> “leucocyten + gramnegatieve staven +++ grampositieve staven ++”
> “leucocyten – grampositieve coccen +”
>  
> I want to split the strings such that I get the following result:
> c(“leucocyten +”,  “gramnegatieve staven +++”,  “grampositieve staven ++”)
> c(“leucocyten –“, “grampositieve coccen +”)
>  
> I have tried strsplit with a regular expression with a positive lookahead, but I am not able to achieve the results that I want.
>  
> I have tried:
> as.list(strsplit(x, split = “(?=[\\+-]{1,3}\\s)+, perl=TRUE)
>  
> Which results in:
> c(“leucocyten “, “+”,  “gramnegatieve staven “, “+”, “+”, “+”,  “grampositieve staven ++”)
> c(“leucocyten “, “–“, “grampositieve coccen +”)
>  
>  
> Is there a function or regular expression that will make this possible?
>  
> Kind regards,
> Emily 
>  
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


