[R] Formatting durations

Gabor Grothendieck ggrothendieck at gmail.com
Wed Oct 27 01:17:44 CEST 2010


On Tue, Oct 26, 2010 at 3:28 PM, Susanta Mohapatra
<mohapatra.susanta at gmail.com> wrote:
> Hi,
>
> I have been working with a dataset for some time and I need some help in
> parsing some data.
>
> There is a column called "Duration" which has data like the following:
>
> 2 minutes => 120
> 2 min => 120
> 10 seconds =>10
> 2 hrs =>7200
>  2-3 minutes => 150 or 120
> 5 minutes (when i arrived => 300
> Flyby approx 20 sec. => 20
> felt like 10 mins but tim => 600
>
> I need to convert them to numerics as given. Any help in this regard will be
> highly appreciated.

Assuming that "convert to numerics as given" means creating a list of
numeric vectors, one per row, strapply from the gsubfn package can
extract every run of digits from each string:

# sample input
x <- c("2 minutes => 120", "2 min => 120", "10 seconds =>10", "2 hrs =>7200",
" 2-3 minutes => 150 or 120", "5 minutes (when i arrived => 300",
"Flyby approx 20 sec. => 20", "felt like 10 mins but tim => 600")

library(gsubfn)
# extract every run of digits from each string and convert it to numeric
out <- strapply(x, "\\d+", as.numeric)
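
If gsubfn is not available, the same extraction can be sketched in base R.
This is not part of the original reply; the name out_base is only an
illustrative assumption, and the result should hold the same values as the
strapply call.

```r
# sample input repeated so that this block runs on its own
x <- c("2 minutes => 120", "2 min => 120", "10 seconds =>10", "2 hrs =>7200",
" 2-3 minutes => 150 or 120", "5 minutes (when i arrived => 300",
"Flyby approx 20 sec. => 20", "felt like 10 mins but tim => 600")

# gregexpr() locates every run of digits in each string,
# regmatches() pulls them out as character vectors,
# and as.numeric converts each vector, giving one numeric vector per row
out_base <- lapply(regmatches(x, gregexpr("[0-9]+", x)), as.numeric)
```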

The result looks like this:

> str(out)
List of 8
 $ : num [1:2] 2 120
 $ : num [1:2] 2 120
 $ : num [1:2] 10 10
 $ : num [1:2] 2 7200
 $ : num [1:4] 2 3 150 120
 $ : num [1:2] 5 300
 $ : num [1:2] 20 20
 $ : num [1:2] 10 600
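
Under a different reading of the question, the column holds only the text
to the left of "=>" and the number on the right is the desired result in
seconds. A rough base R sketch for that case follows. The helper to_seconds,
the vector dur and the unit table are hypothetical, and the rule "first
number followed by a word starting with h, m or s" is an assumption that
would misfire on text such as "3 men".

```r
# assumed input: the Duration column without the "=> value" part
dur <- c("2 minutes", "2 min", "10 seconds", "2 hrs", " 2-3 minutes",
         "5 minutes (when i arrived", "Flyby approx 20 sec.",
         "felt like 10 mins but tim")

# hypothetical helper, not from the original thread
to_seconds <- function(s) {
  # seconds per unit, keyed by the first letter of the unit word
  mult <- c(h = 3600, m = 60, s = 1)
  # a number, an optional "-3" style range tail, optional spaces, unit letter
  pat <- "([0-9]+)[-0-9]* *([hms])"
  low <- tolower(s)
  parts <- regmatches(low, regexec(pat, low))
  vapply(parts, function(p) {
    if (length(p) == 0) return(NA_real_)   # no number/unit pair found
    # p[2] is the first number (lower bound of a range), p[3] the unit letter
    as.numeric(p[2]) * mult[[p[3]]]
  }, numeric(1))
}

secs <- to_seconds(dur)
```

For a range such as "2-3 minutes" this sketch uses the lower bound, which
is one of the two answers the question allows (120); strings with no
recognisable number and unit come back as NA.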


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


