[R] applying strsplit to a whole column

Petr PIKAL petr.pikal at precheza.cz
Thu Aug 5 09:16:06 CEST 2010


Hi

r-help-bounces at r-project.org wrote on 04.08.2010 21:03:10:

> I am sorry, someone said that strsplit automatically works on a
> column. How exactly does it work?
> For example, if I want to grab just the first (or the second) part of
> the string in nam1 that should be split based on ".."
> x<-data.frame(nam1=c("bbb..aba","ccc..abb","ddd..abc","eee..abd"),
> stringsAsFactors=FALSE)
> str(x)
> strsplit(x[[1]],split="\\..")
> str(strsplit(x[[1]],split="\\.."))
> 
> I am getting a list - hence, it looks like I have to go in a loop...?

Not necessarily, e.g. 

sapply(strsplit(as.vector(x[,1]),split="\\.\\."), unlist)
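
Since every string here splits into exactly two pieces, the sapply() call above simplifies the list into a character matrix with one column per original string. If only the first (or second) piece is wanted, one possible sketch is to index each list element with "[". The names parts, first, second and m below are only illustrative, and fixed = TRUE is my assumption that ".." is meant as a literal string rather than a regular expression:

```r
# Same example data as in the question
x <- data.frame(nam1 = c("bbb..aba", "ccc..abb", "ddd..abc", "eee..abd"),
                stringsAsFactors = FALSE)

# fixed = TRUE treats ".." as a literal string, not a regular expression
parts <- strsplit(x$nam1, split = "..", fixed = TRUE)

# "[" is the subsetting function; sapply applies it to each list element
x$first  <- sapply(parts, "[", 1)
x$second <- sapply(parts, "[", 2)

# Alternative: bind the pieces into a matrix, one row per original string
# (assumes every string splits into the same number of pieces)
m <- do.call(rbind, parts)
```

Either way no explicit loop is needed; the matrix form is probably handier when all the pieces are wanted at once.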

Regards
Petr

> 
> Thank you!
> Dimitri
> 
> 
> On Wed, Aug 4, 2010 at 2:39 PM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
> > Thank you very much, everyone!
> > Dimitri
> >
> > On Wed, Aug 4, 2010 at 2:10 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> >>
> >> On Aug 4, 2010, at 1:42 PM, Dimitri Liakhovitski wrote:
> >>
> >>> I am sorry, I'd like to split my column ("names") such that all the
> >>> beginning of a string ("X..") is gone and only the rest of the text is
> >>> left.
> >>
> >> I could not tell whether it was the string "X.." or the pattern "X.." that
> >> was your goal for matching and removal.
> >>>
> >>> x<-data.frame(names=c("X..aba","X..abb","X..abc","X..abd"))
> >>> x$names<-as.character(x$names)
> >>
> >> a) Instead of "names", which is a heavily used function name, use something
> >> more specific. Otherwise you get:
> >>> names(x)
> >> "names"  # and thereby avoid list comments about canines.
> >>
> >> b) Instead of coercing a character vector back to a character vector, use
> >> stringsAsFactors = FALSE.
> >>
> >>> x<-data.frame(nam1=c("X..aba","X..abb","X..abc","X..abd"),
> >>> stringsAsFactors=FALSE)
> >> # This is the pattern version:
> >>
> >>> x$nam1 <- gsub("X..",'', x$nam1)
> >>> x
> >>  nam1
> >> 1   aba
> >> 2   abb
> >> 3   abc
> >> 4   abd
> >>
> >> This is the string version:
> >>> x<-data.frame(nam1=c("X......aba","X.y.abb","X..abc","X..abd"),
> >>> stringsAsFactors=FALSE)
> >>>  x$nam1 <- gsub("X\\.+",'', x$nam1)
> >>> x
> >>   nam1
> >> 1   aba
> >> 2 y.abb
> >> 3   abc
> >> 4   abd
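
To make the difference between the pattern and the string reading of "X.." concrete, here is a small sketch on made-up input (the vector nam1 and the result names below are hypothetical, not from the thread). An unescaped dot in a regular expression matches any character, so "X.." also removes things like "XYZ"; escaping the dots or passing fixed = TRUE restricts the match to a literal "X..":

```r
# Hypothetical input chosen to show where the two readings differ
nam1 <- c("X..aba", "XYZaba", "X.y.abb")

# "X.." as a regular expression: X followed by ANY two characters
any_two <- gsub("X..", "", nam1)

# Literal "X..": escape both dots ...
literal_esc <- gsub("X\\.\\.", "", nam1)

# ... or switch regular expressions off entirely
literal_fixed <- gsub("X..", "", nam1, fixed = TRUE)

# Only a leading literal "X.." (anchored with ^, first match only)
leading_only <- sub("^X\\.\\.", "", nam1)
```

Which of these is right depends on what the real data look like; if the prefix is always the literal "X..", the fixed = TRUE form is probably the least surprising.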
> >>
> >>
> >>> (x)
> >>> str(x)
> >>>
> >>> Can't figure out how to apply strsplit in this situation - without
> >>> using a loop. I hope it's possible to do it without a loop - is it?
> >>
> >> --
> >>
> >> David Winsemius, MD
> >> West Hartford, CT
> >>
> >>
> >
> >
> >
> > --
> > Dimitri Liakhovitski
> > Ninah Consulting
> > www.ninah.com
> >
> 
> 
> 
> -- 
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
> 
