[R] extracting characters from a string

arun smartpink111 at yahoo.com
Wed Jan 23 18:58:22 CET 2013

You could try this:
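# Assumes dat1 already holds one author field per column (it is not defined in this message).
dat2 <- as.data.frame(do.call(cbind, lapply(dat1, function(x) gsub(" $", "", gsub("^ |\\w+$", "", x)))), stringsAsFactors = FALSE)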

#       V1            V2       V3       V4
#1   Brown        Santos     Rome Don Juan
#2 Benigni
#3  Arstra Van den Hoops lamarque
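
Since an explanation of the code was also requested, here is a self-contained sketch that walks through the two gsub() calls one step at a time. The read.table() call used to build dat1 is an assumption (one plausible way to split each row on commas), and the names x, step1 and step2 are only illustrative.

```r
# Data copied from the original message.
pub1 <- c('Brown DK, Santos R, Rome DF, Don Juan X')
pub2 <- c('Benigni D')
pub3 <- c('Arstra SD, Van den Hoops DD, lamarque D')
pub <- rbind(pub1, pub2, pub3)

# Assumption: split every row on commas; fill = TRUE pads the shorter
# rows with empty fields so that all rows have four columns.
dat1 <- read.table(text = as.vector(pub), sep = ",", fill = TRUE,
                   stringsAsFactors = FALSE)

# What the nested gsub() calls do to a single field:
x <- " Don Juan X"
# Inner call: "^ " matches one leading space, "\\w+$" matches the last
# run of word characters (the initials); both are replaced by "".
step1 <- gsub("^ |\\w+$", "", x)
# Outer call: " $" removes the space left behind in front of the initials.
step2 <- gsub(" $", "", step1)

# Same two steps applied to every column; lapply() iterates over the
# columns and gsub() is vectorised over the rows, so no explicit loop.
dat2 <- as.data.frame(
  do.call(cbind, lapply(dat1, function(col) gsub(" $", "", gsub("^ |\\w+$", "", col)))),
  stringsAsFactors = FALSE
)
```

One caveat: "\\w+$" removes whatever word comes last, so a field with no initials at all (for example just "Smith") would come back empty. That does not happen with the data shown here, where every name is followed by initials.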

----- Original Message -----
From: Biau David <djmbiau at yahoo.fr>
To: r help list <r-help at r-project.org>
Sent: Wednesday, January 23, 2013 12:38 PM
Subject: [R] extracting characters from a string

Dear All,

I have a data frame of vectors of publication names such as 'pub':

pub1 <- c('Brown DK, Santos R, Rome DF, Don Juan X')
pub2 <- c('Benigni D')
pub3 <- c('Arstra SD, Van den Hoops DD, lamarque D')

pub <- rbind(pub1, pub2, pub3)

I would like to construct a data frame containing only the authors' last names, with each last name in its own column and each publication in its own row. Basically, I want to get rid of the initials (at most two, always before a comma) and the spaces surrounding each last name. I would like to avoid a loop.

PS: Even a short explanation of the code that extracts the values from the character string would be great!


