[R] Split strings based on multiple patterns (plain text)

Joe Ceradini joeceradini at gmail.com
Sat Oct 15 03:55:41 CEST 2016


That should be strsplit(ugly, attributes), not strplit(ugly, attributes).

On Fri, Oct 14, 2016 at 7:53 PM, Joe Ceradini <joeceradini at gmail.com> wrote:
> Hopefully this looks better. I did not realize gmail default was html.
>
> I have a dataframe with a column that has many fields smashed together.
> I need to split the strings in the column into separate columns based
> on patterns.
>
> Example of a string that needs to be split:
>
> ugly <- paste(
>   "Water temp:14: F Waterbody type:Permanent Lake/Pond: Water",
>   "pH:Unkwn: Conductivity:Unkwn: Water color: Clear: Water turbidity:",
>   "clear: Manmade:no  Permanence:permanent:  Max water depth: <3: Primary",
>   "substrate: Silt/Mud: Evidence of cattle grazing: none: Shoreline",
>   "Emergent Veg(%): 1-25: Fish present: yes: Fish species: unkwn: no",
>   "amphibians observed")
> ugly
>
> Far as I can tell, there is not a single pattern that would work for
> splitting. Splitting on ":" is close, but not quite right. Each of the
> below attributes should be in a separate column, and are present in
> the string (above) that needs to be split:
>
> attributes <- c("Water temp", "Waterbody type", "Water pH",
>   "Conductivity", "Water color", "Water turbidity", "Manmade",
>   "Permanence", "Max water depth", "Primary substrate",
>   "Evidence of cattle grazing", "Shoreline Emergent Veg(%)",
>   "Fish present", "Fish species")
>
> Conceptually, I want to use the vector of attributes to split the
> string. However, strsplit only uses the 1st value of the attributes
> object:
>
> strplit(ugly, attributes).
>
> Should I loop through the values of "attributes"?
> Is there an argument in strsplit I'm missing that will do what I want?
> Different approach altogether?
>
> Thanks! Happy Friday.
> Joe
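
One possible approach to the question above, offered as a sketch rather
than a definitive answer: collapse the attribute names into a single
regular expression (an alternation) and split on that, instead of
looping. The helper name split_by_attributes is hypothetical, and the
sketch assumes that every attribute name is followed by a colon and that
perl = TRUE is acceptable, so that \Q...\E can quote the "(%)" literally.

```r
# Sample data from the post; the string is rebuilt with paste() so that
# no literal is broken across lines.
ugly <- paste(
  "Water temp:14: F Waterbody type:Permanent Lake/Pond: Water",
  "pH:Unkwn: Conductivity:Unkwn: Water color: Clear: Water turbidity:",
  "clear: Manmade:no  Permanence:permanent:  Max water depth: <3: Primary",
  "substrate: Silt/Mud: Evidence of cattle grazing: none: Shoreline",
  "Emergent Veg(%): 1-25: Fish present: yes: Fish species: unkwn: no",
  "amphibians observed")

attributes <- c("Water temp", "Waterbody type", "Water pH",
  "Conductivity", "Water color", "Water turbidity", "Manmade",
  "Permanence", "Max water depth", "Primary substrate",
  "Evidence of cattle grazing", "Shoreline Emergent Veg(%)",
  "Fish present", "Fish species")

# Hypothetical helper: split one string on any of the attribute names.
split_by_attributes <- function(x, attrs) {
  # Quote each name with \Q...\E so "(" and "%" are taken literally, then
  # join the names into one alternation followed by the ":" separator.
  alt <- paste0("\\Q", attrs, "\\E", collapse = "|")
  pattern <- paste0("(?:", alt, ")\\s*:\\s*")

  # The matched separators are the keys; the pieces between them are values.
  keys <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  vals <- strsplit(x, pattern, perl = TRUE)[[1]]

  # Drop whatever precedes the first attribute ("" when the string starts
  # with one), and pad with NA if the last attribute has no value.
  vals <- vals[-1]
  length(vals) <- length(keys)

  # Strip the separator from the keys and trailing ":" / spaces from values.
  keys <- sub("\\s*:\\s*$", "", keys, perl = TRUE)
  vals <- sub("[[:space:]:]+$", "", vals)

  # Return values in the order of attrs; attributes not found become NA.
  setNames(vals[match(attrs, keys)], attrs)
}

res <- split_by_attributes(ugly, attributes)
res
```

For a whole column, something like
do.call(rbind, lapply(df$col, split_by_attributes, attrs = attributes))
should give one row per string (df$col is a placeholder name). Note that
trailing free text such as "no amphibians observed" stays attached to
the last value and would need separate handling.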



-- 
Cooperative Fish and Wildlife Research Unit
Zoology and Physiology Dept.
University of Wyoming
JoeCeradini at gmail.com / 914.707.8506
wyocoopunit.org


