[R] splitting a character field in R

Gabor Grothendieck ggrothendieck at gmail.com
Sat Oct 29 16:30:35 CEST 2005


Here is one additional solution:

read.table(textConnection(sub("abc", " ", B)), fill = TRUE)

With gsub in place of sub it also works if there are more
than 2 fields (sub replaces only the first "abc").  If there
can be spaces in the lines then the sub should be modified to
translate "abc" to some unique character not appearing in
the lines and sep= should be added to the read.table call.
Also as.is=TRUE can be added to the read.table call if
it is desired to return character rather than factor columns
and col.names= can be added to the read.table call if it
is desired to control the naming of the returned columns.
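
As a rough sketch of the variant described above, the call below puts
the optional arguments together. The tab separator, the use of gsub
with fixed = TRUE, and the column names B1 and B2 are assumptions made
for illustration; any character that cannot occur in B would serve as
the separator.

```r
# Sample data from the thread
B <- c("dgabcrt", "fgrtabc", "sabcuuu")

# Replace every literal "abc" with a tab (assumed not to occur in B),
# so that spaces inside the lines would not be taken as separators.
tc <- textConnection(gsub("abc", "\t", B, fixed = TRUE))

# sep= matches the stand-in separator, fill = TRUE pads short lines,
# as.is = TRUE keeps character columns, col.names= names the result
# (B1 and B2 are illustrative names taken from the original question).
res <- read.table(tc, sep = "\t", fill = TRUE, as.is = TRUE,
                  col.names = c("B1", "B2"))
close(tc)

# res$B1 is c("dg", "fgrt", "s"); res$B2 is c("rt", "", "uuu")
print(res)
```

The result can be attached to an existing data frame with cbind if
the new columns are wanted alongside A and B.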


On 10/28/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> You could use:
>
> data.frame(First = sub("abc.*", "", B), Second = sub(".*abc", "", B))
>
> or if you want to prevent conversion to factors:
>
> data.frame(First = I(sub("abc.*", "", B)), Second = I(sub(".*abc", "", B)))
>
> On 10/28/05, ManuelPerera-Chang at fmc-ag.com
> <ManuelPerera-Chang at fmc-ag.com> wrote:
> >
> >
> >
> >
> > Hi Jim,
> >
> > Thanks for your post. I was aware of strsplit, but really could not find
> > out how I could use it.
> >
> > I tried like in your example ...
> >
> > A<-c(1,2,3)
> > B<-c("dgabcrt","fgrtabc","sabcuuu")
> > C<-strsplit(B,"abc")
> > > C
> > [[1]]
> > [1] "dg" "rt"
> >
> > [[2]]
> > [1] "fgrt"
> >
> > [[3]]
> > [1] "s"   "uuu"
> >
> > Which looks promising, but here C is a list with three elements. But how
> > to create the two vectors I need from here, that is
> >
> > ("dg","fgrt", "s") and ("rt","","uuu")
> >
> > (or how to get access to the substrings "rt" or "uuu").
> >
> > Greetings
> >
> > Manuel
> >
> >
> >
> >
> > From:     jim holtman <jholtman at gmail.com>
> > Date:     28.10.2005 16:00
> > To:       "ManuelPerera-Chang at fmc-ag.com" <ManuelPerera-Chang at fmc-ag.com>
> > cc:       r-help at stat.math.ethz.ch
> > Subject:  Re: [R] splitting a character field in R
> >
> >
> >
> >
> >
> >
> > > x <- 'dfabcxy'
> > > strsplit(x, 'abc')
> > [[1]]
> > [1] "df" "xy"
> >
> >
> > >
> >
> >
> >
> >
> > On 10/28/05, ManuelPerera-Chang at fmc-ag.com <ManuelPerera-Chang at fmc-ag.com >
> > wrote:
> >
> >
> >
> >
> >      Dear R users,
> >
> >      I have a dataframe with one character field, and I would like to
> >      create two new fields (columns) in my dataset by splitting the
> >      existing character field into two using an existing substring.
> >
> >      ... something that in SAS I could solve e.g. by combining substr
> >      (which I am aware exists in R) and "index" for determining the
> >      position of the pattern within the string.
> >      e.g. if my dataframe is ...
> >      A     B
> >      1     dgabcrt
> >      2     fgrtabc
> >      3     sabcuuu
> >
> >      Then by splitting by substring "abc" I would get ...
> >
> >      A     B           B1    B2
> >      1     dgabcrt     dg    rt
> >      2     fgrtabc     fgrt
> >      3     sabcuuu     s     uuu
> >
> >      Do you know how to do this basic string (dataframe) manipulation in R?
> >
> >      Saludos,
> >
> >      Manuel
> >
> >      ______________________________________________
> >      R-help at stat.math.ethz.ch mailing list
> >      https://stat.ethz.ch/mailman/listinfo/r-help
> >      PLEASE do read the posting guide!
> >      http://www.R-project.org/posting-guide.html
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 247 0281
> >
> > What is the problem you are trying to solve?
> >
> >
>



