[R] apply loop - using/providing a data frame to loop over

Gabor Grothendieck ggrothendieck at gmail.com
Mon Dec 28 18:33:31 CET 2009


Try this. It picks out each run of word characters (\w+) followed
by a space and a single word character, i.e. the surname plus the
first initial:

> library(gsubfn)
> strapply(authors, "\\w+ \\w", c)[[1]]
[1] "Schleyer T"      "Spallek H"       "Butler B"        "Subramanian S"
[5] "Weiss D"         "Poythress M"     "Rattanathikun P" "Mueller G"
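
If you would rather not load a package, the same pattern should also work
with base R's gregexpr() and regmatches(). This is only a sketch, assuming
the authors string from the question below; the name first_initials is
made up for illustration.

```r
# The author string from the question, written on a single line
authors <- "Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G"

# gregexpr() locates every match of "word characters, space, one word
# character"; regmatches() then extracts the matched substrings.
first_initials <- regmatches(authors, gregexpr("\\w+ \\w", authors))[[1]]
first_initials
```

Only the first initial survives because the pattern stops after one word
character following the space.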

You might need to adjust the regular expression slightly depending on
what the general case is.  See http://gsubfn.googlecode.com for more.
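
On the apply question itself: as far as I can tell, apply() coerces the
data frame to a matrix and hands the function each row as a plain vector,
so data[,1] has no second dimension to index into. A minimal sketch in
base R, assuming the authors, starts and ends from the question below
(g2, by_row and vectorized are names made up for illustration):

```r
authors <- "Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G"
starts <- c(1, 13, 24, 35, 50, 59, 73, 90)
ends   <- c(10, 21, 31, 47, 56, 69, 87, 98)

# Inside apply() each row arrives as a vector, so index it with [1] and [2]
# rather than [,1] and [,2].
g2 <- function(data, a) substr(a, data[1], data[2])
by_row <- apply(data.frame(starts, ends), 1, g2, a = authors)

# substring() is vectorized over its start and end arguments, so the
# loop can be dropped entirely.
vectorized <- substring(authors, starts, ends)
```

Both by_row and vectorized should hold the same eight names; the
vectorized form is probably the simpler choice when the indices are
already known.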

On Mon, Dec 28, 2009 at 7:46 AM, Daniel Malter <dmalter at gmx.net> wrote:
> Hi,
>
> I want to extract individual names from a single string that contains all
> names. My problem is not the extraction itself, but the looping over the
> extraction start and end points, which I try to realize with apply.
>
> #Say, I have a string with names.
> authors=c("Schleyer T, Spallek H, Butler BS, Subramanian S, Weiss D, Poythress ML, Rattanathikun P, Mueller G")
>
> #Since I only want the surname and the initial of the first name, I create
> #respective indices
> starts=c(1, 13, 24, 35, 50, 59, 73, 90)
> ends=c(10, 21, 31, 47, 56, 69, 87, 98)
>
> #Now I can extract the names, e.g. the third one, with
> substr(authors,start=starts[3],stop=ends[3])
>
> #So far so good, but I want to loop over all indices using apply
> #For that I wrote a function g, that takes "a" as the author string, and
> #"data" as the start and end points for extraction
> g=function(a,data){substr(a,data[,1],data[,2])}
>
> #If provided with a specific row of the data frame, g works
> g(authors,data.frame(starts,ends)[3,])
>
> #If I try to loop g through the rows of the starts/ends data frame, it does
> #not work.
> apply(data.frame(starts,ends),1,g,a=authors)
>
> #Interestingly, if the data frame to loop over is just a vector, it works
> #also (e.g. for extracting just the first initial)
> g=function(e,a){substr(a,e,e)}
> apply(data.frame(ends),1,g,a=authors)
>
> So the problem probably lies in correctly supplying "apply" with the data
> frame. I would greatly appreciate your help.
>
> Daniel
>
> -----------------------------------------------
> "Who has visions should see a doctor,"
> Helmut Schmidt, German Chancellor (1974-1982).
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



