[R] Grabbing Specific Words from Content (basic text mining)

Gabor Grothendieck ggrothendieck at gmail.com
Mon Jan 14 11:47:16 CET 2013


 On Mon, Jan 14, 2013 at 4:30 AM, Sachinthaka Abeywardana
<sachin.abeywardana at gmail.com> wrote:
> Hi all,
>
> Suppose I have a data frame with mixed content (name age and address).
>
> a<-"Name: John Smith Age: 35 Address: 32, street, sub, something"
> b<-data.frame(a)
>
> 1. I want to extract the name, age and address separately from this
> data frame (which may contain more people).
>
> 2. Also, just in case I have to deal with it, how would the syntax change if I
> had "Name" as opposed to "Name:" (without the colon)?
>

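Try strapplyc from the gsubfn package. It returns the strings matched by
the parenthesized groups in the regular expression. For the second case
you only need to drop the colons from the pattern: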


> library(gsubfn)
>
> a <- "Name: John Smith Age: 35 Address: 32, street, sub, something"
> b <- data.frame(a)
> strapplyc(as.character(b$a), "Name: (.*) Age: (.*) Address: (.*)")
[[1]]
[1] "John Smith"                 "35"
[3] "32, street, sub, something"
>
>
> a. <- "Name John Smith Age 35 Address 32, street, sub, something"
> b. <- data.frame(a.)
> strapplyc(as.character(b.$a.), "Name (.*) Age (.*) Address (.*)")
[[1]]
[1] "John Smith"                 "35"
[3] "32, street, sub, something"
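
If you would rather handle both forms with a single pattern and get a
data frame back, the following is a rough sketch using base R
(regexec/regmatches) instead of gsubfn. The second record, the
variable names (x, pat, m, parts, out) and the output column names are
made up for illustration, and the sketch assumes every row matches the
pattern; a non-matching row would need to be handled separately.

```r
# Hypothetical sample data: one record with colons, one without.
x <- c("Name: John Smith Age: 35 Address: 32, street, sub, something",
       "Name Jane Doe Age 41 Address 7, road, town")

# ":?" makes the colon after each label optional, so one pattern
# covers both "Name:" and "Name".
pat <- "Name:? (.*) Age:? (.*) Address:? (.*)"

# regexec() locates the full match and each parenthesized group;
# regmatches() extracts the corresponding text, one vector per input.
m <- regmatches(x, regexec(pat, x))

# Drop the full match (first element) and stack the groups row-wise.
# Assumption: every element of x matched, so each vector has 4 entries.
parts <- do.call(rbind, lapply(m, function(v) v[-1]))

out <- data.frame(name    = parts[, 1],
                  age     = as.integer(parts[, 2]),
                  address = parts[, 3],
                  stringsAsFactors = FALSE)
out
```

The (.*) groups are greedy, so this relies on the labels "Age" and
"Address" not occurring inside the name or address text itself.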

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


