[R] Formatting durations

Gabor Grothendieck ggrothendieck at gmail.com
Wed Oct 27 01:41:37 CEST 2010


On Tue, Oct 26, 2010 at 7:17 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Tue, Oct 26, 2010 at 3:28 PM, Susanta Mohapatra
> <mohapatra.susanta at gmail.com> wrote:
>> Hi,
>>
>> I have been working with a dataset for some time and I need some help
>> parsing some data.
>>
>> There is a column called "Duration" which has data like the following:
>>
>> 2 minutes => 120
>> 2 min => 120
>> 10 seconds => 10
>> 2 hrs => 7200
>>  2-3 minutes => 150 or 120
>> 5 minutes (when i arrived => 300
>> Flyby approx 20 sec. => 20
>> felt like 10 mins but tim => 600
>>
>> I need to convert them to numerics as given. Any help in this regard will be
>> highly appreciated.
>
> Assuming that "convert to numerics as given" means creating a list of
> numeric vectors, one per row.
>

Or, if => was meant to show the desired result, then try this:


# n1: first number; n2: "" or the second number with its leading "-" (e.g. "-3"),
# hence the negated multipliers below; units: the word following the number(s).
f <- function(n1, n2, units) {
	u <- substr(units, 1, 3)  # "sec", "min" or "hrs"
	if (n2 == "" && u == "sec") n1
	else if (n2 == "" && u == "min") paste(60 * as.numeric(n1))
	else if (n2 == "" && u == "hrs") paste(3600 * as.numeric(n1))
	else if (n2 != "" && u == "sec") paste(n1, "or", -as.numeric(n2))
	else if (n2 != "" && u == "min") paste(60 * as.numeric(n1), "or", -60 * as.numeric(n2))
	else if (n2 != "" && u == "hrs") paste(3600 * as.numeric(n1), "or", -3600 * as.numeric(n2))
	else NA
}
	

xx <- c("2 minutes ", "2 min ", "10 seconds ", "2 hrs ", " 2-3 minutes ",
"5 minutes (when i arrived ", "Flyby approx 20 sec. ",
"felt like 10 mins but tim ")

library(gsubfn)
out2 <- strapply(xx, "(\\d+)(-\\d+)? (\\S+)", f)
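
To see what the three parenthesized groups in that pattern pick up, here is a small base-R sketch using regexec/regmatches instead of gsubfn. The names pat, x and m are only for illustration. The point is that the optional second group comes back as "" when there is no range, and as "-3" (leading hyphen included) when there is one, which is why f tests n2 == "" and negates n2.

```r
# Same pattern as in the strapply call, inspected with base R only.
pat <- "(\\d+)(-\\d+)? (\\S+)"
x <- c("2 hrs ", " 2-3 minutes ")

# regexec returns the full match plus one entry per capture group.
m <- regmatches(x, regexec(pat, x))

# Each element is: full match, first number, optional "-number", units.
print(m[[1]])  # "2 hrs"       "2" ""   "hrs"
print(m[[2]])  # "2-3 minutes" "2" "-3" "minutes"
```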

The output looks like this:

> str(out2)
List of 8
 $ : chr "120"
 $ : chr "120"
 $ : chr "10"
 $ : chr "7200"
 $ : chr "120 or 180"
 $ : chr "300"
 $ : chr "20"
 $ : chr "600"
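
If actual numbers are wanted rather than strings, a variant along the following lines might do. This is only a sketch in base R, not the code from the earlier reply (which is not quoted here); to_seconds and mult are made-up names, and it assumes the units always start with "sec", "min" or "hrs" as in the sample data.

```r
# Hypothetical helper (not from the original post): duration text -> seconds.
to_seconds <- function(x) {
  pat <- "(\\d+)(-\\d+)? (\\S+)"
  m <- regmatches(x, regexec(pat, x))
  # Seconds per unit, keyed by the first three letters of the unit word.
  mult <- c(sec = 1, min = 60, hrs = 3600)
  lapply(m, function(p) {
    if (length(p) == 0) return(NA_real_)            # pattern did not match
    k <- unname(mult[substr(p[4], 1, 3)])           # NA for an unknown unit
    n <- as.numeric(p[2])
    if (nzchar(p[3])) n <- c(n, -as.numeric(p[3]))  # "-3" becomes 3
    n * k
  })
}

res <- to_seconds(c("2 min ", " 2-3 minutes ", "Flyby approx 20 sec. "))
str(res)
```

Each row gives a numeric vector of length 1 or 2, so a range such as "2-3 minutes" can be reduced afterwards however is preferred, e.g. mean(res[[2]]) for the midpoint (150) or min(res[[2]]) for the lower end (120).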


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


