[R] handling the output of strsplit

Gabor Grothendieck ggrothendieck at gmail.com
Sat Jun 21 00:58:13 CEST 2008


We construct a times object by replacing the letter h with a : and then
pasting :00 on the end.  An entry with no minutes, such as "23h", then
contains :: , so we replace any occurrence of :: with :00: .  The result is
in the h:m:s format that times() recognizes, so we can convert it to times
and apply hours() and minutes() to get the components:

> library(chron)
> h2 <- times(sub("::", ":00:", paste(sub("h", ":", h), "00", sep = ":")))
> hours(h2)
[1]  3  6  9 11 14 15 23
> minutes(h2)
[1] 30 30 40 25  0 55  0
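
To see what the nested sub() and paste() calls produce before times() is
applied, the steps can be run one at a time.  This is only a sketch in base
R (no chron needed); h is the vector from the original question and the
step1, step2, step3 names are made up for illustration:

```r
h <- c("3h30", "6h30", "9h40", "11h25", "14h00", "15h55", "23h")

# Step 1: replace the letter h with a colon ("23h" becomes "23:")
step1 <- sub("h", ":", h)

# Step 2: append seconds ("23:" becomes "23::00")
step2 <- paste(step1, "00", sep = ":")

# Step 3: fill in the missing minutes ("23::00" becomes "23:00:00")
step3 <- sub("::", ":00:", step2)

print(step3)
```

Only the last element is changed by step 3, since it is the only one with
no minutes after the h.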

Another possibility is to use gsubfn in package gsubfn.  The pattern
captures the hour and the minutes in two backreferences; backref = -2
passes only those two captures (not the whole match) to the function, which
pastes them together with a :00 at the end and replaces :: with :00: .
The result is then converted to times.  hours() and minutes() could be used,
as before, to get the components.

> library(gsubfn)
> times(gsubfn("([^h]+)h(.*)", ~ sub("::", ":00:", paste(..., "00", sep = ":")), h, backref = -2))
[1] 03:30:00 06:30:00 09:40:00 11:25:00 14:00:00 15:55:00 23:00:00
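
If gsubfn is not available, the same capture-and-paste idea can be sketched
with base R's sub() and backreferences in the replacement string.  This is
an alternative I am assuming is acceptable here, not what gsubfn does
internally; h is the vector from the original question and hms is a
made-up name:

```r
h <- c("3h30", "6h30", "9h40", "11h25", "14h00", "15h55", "23h")

# \\1 is the hour, \\2 the (possibly empty) minutes; append :00 for seconds
hms <- sub("([^h]+)h(.*)", "\\1:\\2:00", h)

# "23h" has no minutes, so it is "23::00" at this point; fill in the 00
hms <- sub("::", ":00:", hms)

print(hms)
```

The resulting character vector can be passed to times() from chron just as
in the first approach.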

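Here is another approach using strapply in the gsubfn package.  We use the
same pattern but this time convert each component to numeric and express
the time as a fraction of a day (hours/24 + minutes/(24*60)).  The
sum(..., na.rm = TRUE) turns the missing minutes of "23h" into 0: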

> times(strapply(h, "([^h]+)h(.*)", ~ as.numeric(x) / 24 + sum(as.numeric(y), na.rm = TRUE)/(24*60), backref = -2, simplify = c))
[1] 03:30:00 06:30:00 09:40:00 11:25:00 14:00:00 15:55:00 23:00:00
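
As for the general question of pulling the first item, then the second
item, out of each element of a list: one common base R idiom is sapply()
with the `[` function.  A minimal sketch, assuming the list comes from
strsplit as in the original message (parts, hrs, mins and frac are made-up
names):

```r
h <- c("3h30", "6h30", "9h40", "11h25", "14h00", "15h55", "23h")
parts <- strsplit(h, "h")

# `[` with index 1 picks the first item of each list element
hrs <- as.numeric(sapply(parts, `[`, 1))

# Index 2 picks the second item; elements that have none give NA
mins <- as.numeric(sapply(parts, `[`, 2))
mins[is.na(mins)] <- 0   # treat missing minutes ("23h") as 0

# Fraction of a day, the numeric form that times() works with
frac <- hrs / 24 + mins / (24 * 60)

print(hrs)
print(mins)
```

The same sapply(x, `[`, i) call works for any list whose elements are
vectors; elements shorter than i simply yield NA.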



On Fri, Jun 20, 2008 at 6:14 PM, Denis Chabot <chabotd at globetrotter.net> wrote:
> Hi,
>
> Simple question, but I did not figure out how to find the answer on my own
> (wrong choice of keywords on my part).
>
> I have a character variable for time of day that has entries looking like
> "6h30", "7h40", "12h25", "23h", etc. For the sake of this message, say
>
> h = c("3h30",      "6h30",      "9h40",      "11h25",     "14h00",
> "15h55",  "23h")
>
> I could not figure out how to use chron to import this into times, so I
> tried to extract the hours and minutes on my own.
>
> I used strsplit and got a list:
>
> h2 = strsplit(h, "h")
> > h2
> [[1]]
> [1] "3"  "30"
>
> [[2]]
> [1] "6"  "30"
>
> [[3]]
> [1] "9"  "40"
>
> [[4]]
> [1] "11" "25"
>
> [[5]]
> [1] "14" "00"
>
> [[6]]
> [1] "15" "55"
>
> [[7]]
> [1] "23"
>
> This is where I am stuck. I would have liked to extract a vector of "hours"
> from this list, and a vector of "minutes", to reconstruct a time of day.
>
> But the only command I know, unlist, makes a long vector of h, min, h, min,
> h, min.
>
> For this in particular, but lists in general, how can one extract the first
> item of each element in the list, then the second item of each element,
> etc.?
>
> Thanks in advance,
>
> Denis
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


