[R] regex pattern assistance

Marc Schwartz marc_schwartz at me.com
Fri Aug 15 18:41:42 CEST 2014


On Aug 15, 2014, at 11:18 AM, Tom Wright <tom at maladmin.com> wrote:

> Hi,
> Can anyone please assist.
> 
> given the string 
> 
>> x<-"/mnt/AO/AO Data/S01-012/120824/"
> 
> I would like to extract "S01-012"
> 
> require(stringr)
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(.+)\\/+")
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(\\w+)\\/+")
> 
> both nearly work. I expected I would use something like:
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/([\\w -]+)\\/+")
> 
> but I don't seem able to get the square bracket grouping to work
> correctly. Can someone please show me where I am going wrong?
> 
> Thanks,
> Tom


Is the desired substring always in the same relative position in the path?

If so:

> strsplit(x, "/")
[[1]]
[1] ""        "mnt"     "AO"      "AO Data" "S01-012" "120824" 

> unlist(strsplit(x, "/"))[5]
[1] "S01-012"
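
If the number of leading directories might vary but the ID is always the second-to-last component, counting from the end is one option. A minimal sketch in base R, assuming a Unix-style path like yours (the variable name 'parts' is only for illustration):

```r
x <- "/mnt/AO/AO Data/S01-012/120824/"

# Split on a literal "/" and drop the empty string produced by the leading slash
parts <- strsplit(x, "/", fixed = TRUE)[[1]]
parts <- parts[nzchar(parts)]

# Second-to-last path component
id_from_split <- parts[length(parts) - 1]

# The same idea using the path helpers: strip the last component, then take the new last one
id_from_path <- basename(dirname(x))

id_from_split
id_from_path
```

Both should give "S01-012" for the example string. The fixed = TRUE argument just tells strsplit() to treat "/" as plain text rather than as a regex.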



Alternatively, again assuming the same position:

> gsub("/mnt/AO/AO Data/([^/]+)/.+", "\\1", x)
[1] "S01-012"
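
One thing to keep in mind with this approach: when the pattern does not match, sub() and gsub() return the input unchanged, so a path with a different layout would come back whole rather than as an error. A hedged sketch of one way to guard against that (the function name 'extract_id' and the second test path are made up for illustration):

```r
x <- "/mnt/AO/AO Data/S01-012/120824/"

# Anchored version of the pattern; ([^/]+) captures everything up to the next "/"
pattern <- "^/mnt/AO/AO Data/([^/]+)/.*$"

# Return the captured component, or NA when the path does not fit the pattern
extract_id <- function(path) {
  ifelse(grepl(pattern, path), sub(pattern, "\\1", path), NA_character_)
}

ids <- extract_id(c(x, "/tmp/other/"))
ids
```

For the two paths above this should return "S01-012" and NA.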


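You don't need any of the double backslashes in front of the slashes in your regex above. The '/' character has no special meaning in a regular expression, so it can be written as is. The '\' character, on the other hand, is special and does need to be escaped, which is why escapes such as \w have to be typed as "\\w" inside an R string.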
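
To illustrate, here is a sketch of your bracket-group idea written without any backslashes, in base R rather than stringr. It uses the POSIX class [[:alnum:]] in place of \w, since how \w is treated inside square brackets can differ between regex engines. That the ID consists only of letters, digits, '_' and '-' is an assumption on my part:

```r
x <- "/mnt/AO/AO Data/S01-012/120824/"

# "/" needs no escaping. [[:alnum:]_-]+ matches one or more letters, digits,
# underscores or hyphens; the hyphen is placed last so it is taken literally.
pattern <- "/mnt/AO/AO Data/([[:alnum:]_-]+)/"

# regexec() records the full match and each capture group;
# regmatches() then pulls out the corresponding substrings
m <- regmatches(x, regexec(pattern, x))

# Element 1 is the whole match, element 2 is the first capture group
id <- m[[1]][2]
id
```

For the example string this should give "S01-012".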

Regards,

Marc Schwartz


