[R] Regex Split?

Howard, Tim G (DEC) t|m@how@rd @end|ng |rom dec@ny@gov
Fri May 5 12:50:37 CEST 2023


If you only want the character strings, this seems a little simpler:

> strsplit("a bc,def, adef ,,gh", "[ ,]+", perl=T)
[[1]]
[1] "a"    "bc"   "def"  "adef" "gh"  


If you need the delimiters (the commas), you could add them back in afterwards.
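
An alternative to re-inserting the commas is to extract the tokens directly instead of splitting. This is only a minimal sketch: the pattern "[^ ,]+|," is an assumption about what counts as a token (a run of characters that are neither space nor comma, or a single comma), and it may not match every use case.

```r
x <- "a bc,def, adef ,,gh"

# Assumed token definition: a run of non-space, non-comma characters,
# or one comma on its own. Spaces are simply skipped.
tokens <- regmatches(x, gregexpr("[^ ,]+|,", x))[[1]]
print(tokens)
# [1] "a"    "bc"   ","    "def"  ","    "adef" ","    ","    "gh"
```

Because nothing is split here, no empty strings appear between a comma and the space that follows it.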
Tim

------------------------------

Message: 2
Date: Thu, 4 May 2023 23:59:33 +0300
From: Leonard Mada <leo.mada using syonic.eu>
To: R-help Mailing List <r-help using r-project.org>
Subject: [R] Regex Split?
Message-ID: <7b1cdbe7-0086-24b4-9da6-369296eadfdc using syonic.eu>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Dear R-Users,

I tried the following 3 regular expressions in R 4.3:
strsplit("a bc,def, adef ,,gh", " |(?=,)|(?<=,)(?![ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    "," "gh"

strsplit("a bc,def, adef ,,gh", " |(?<! )(?=,)|(?<=,)(?![ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    "," "gh"

strsplit("a bc,def, adef ,,gh", " |(?<! )(?=,)|(?<=,)(?=[^ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    "," "gh"


Is this correct?


I feel that:
- none should return (after "def"): ",", "";
- the first one could also return "", "," (but probably not; not fully
sure about this);
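
A possible explanation for the ",", "" pair, offered as a sketch rather than a definitive account: the strsplit help page describes an algorithm that repeatedly searches the remaining string, emits whatever lies to the left of the match, and then discards the match and everything before it. The emulation below follows that description. The helper name split_like_strsplit is hypothetical, and the rule that an empty match at the very start emits one character is an assumption inferred from the observed output, not something taken from the help page.

```r
# Hypothetical helper: emulates the loop described in ?strsplit.
split_like_strsplit <- function(x, pattern) {
  tokens <- character(0)
  while (nchar(x) > 0) {
    m <- regexpr(pattern, x, perl = TRUE)
    if (m[1] == -1) {
      # No further match: the remainder is the last token.
      tokens <- c(tokens, x)
      break
    }
    end <- m[1] + attr(m, "match.length") - 1  # last matched position
    if (end > 0) {
      # Match ends past the start: emit the text to its left
      # (possibly ""), then drop that text and the match itself.
      tokens <- c(tokens, substr(x, 1, m[1] - 1))
      x <- substr(x, end + 1, nchar(x))
    } else {
      # Assumption: an empty match at position 1 emits one character.
      tokens <- c(tokens, substr(x, 1, 1))
      x <- substr(x, 2, nchar(x))
    }
  }
  tokens
}

x  <- "a bc,def, adef ,,gh"
p1 <- " |(?=,)|(?<=,)(?![ ])"
res <- split_like_strsplit(x, p1)
print(res)
# Compare with the built-in result.
print(identical(res, strsplit(x, p1, perl = TRUE)[[1]]))
```

Under this reading, once "," has been emitted the remaining string starts with a space. That space is matched as an ordinary separator with nothing to its left, which yields the "". The lookbehinds cannot see the comma because it has already been removed from the string being searched.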


Sincerely,


Leonard




------------------------------



