[R] Speeding up time conversion

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 27 19:49:43 CEST 2012


Users forget how much of this is an OS service: here it is OS X, not R, that is slow.

On a recent Linux box the same conversion takes about 90s.

But at least you can easily parallelize it: see ?pvec in package 
parallel for one way to do this (and one way not to).
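
As a rough sketch of the pvec route (not a tuned solution): the helper 
name to_seconds, the choice of 2 cores and tz = "UTC" are all assumptions 
made for illustration, and the sample vector is the one from the question. 
pvec splits the vector into chunks, runs the function on each chunk in a 
forked process and concatenates the results.

```r
library(parallel)

# Sample data in the format given in the question
TIMECOL <- rep("2/1/2011 13:25:01", 100)

# Hypothetical helper: convert a chunk of strings to seconds since 1970-01-01.
# tz = "UTC" is an assumption; the original call used the local time zone.
to_seconds <- function(x) {
  as.numeric(as.POSIXct(strptime(x, "%m/%d/%Y %H:%M:%S", tz = "UTC")))
}

# pvec relies on forking, which is not available on Windows,
# so fall back to a single core there. 2 cores is an arbitrary choice.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

secs <- pvec(TIMECOL, to_seconds, mc.cores = cores)
```

Fixing the time zone also spares the OS a lookup of the local rules for 
every record, which may itself matter on a slow platform; drop the tz 
argument if local times are what is wanted.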

If the file contains a high proportion of duplicates, making a factor and 
converting only the levels will help.
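
A minimal sketch of the factor idea, assuming the same format string as in 
the question and tz = "UTC" (an assumption; omit it for local time). The 
names f and level_secs are illustrative only. Each distinct string is 
parsed once, and the integer codes of the factor map the results back to 
the full vector.

```r
# Illustrative data with many duplicates
TIMECOL <- rep(c("2/1/2011 13:25:01", "2/1/2011 13:25:02"), 50)

# One level per distinct date/time string
f <- factor(TIMECOL)

# Parse only the levels (here 2 strings rather than 100)
level_secs <- as.numeric(
  as.POSIXct(strptime(levels(f), "%m/%d/%Y %H:%M:%S", tz = "UTC"))
)

# Expand back to one value per record via the integer codes
secs <- level_secs[as.integer(f)]
```

Building the factor has its own cost on a very long vector, so this only 
pays off when the number of distinct values is small relative to the 
number of records.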

On 27/09/2012 18:24, Fisher Dennis wrote:
> R 2.15.1
> OS X.7.4
>
> Colleagues,
>
> I have a large dataset (27773536 records, the file is several GB) that contains a column of date / time entries in the format:
> 	"2/1/2011 13:25:01"	
> I need to convert these to numeric values (ideally in seconds; the origin [e.g., 1970-01-01] is not important).
>
> I am using:
> 	 as.numeric(strptime(DATA$DATADTM, "%m/%d/%Y %H:%M:%S"))
> It takes 21 minutes to execute this step on a dual quad-core Mac with 12 GB RAM (it is appreciably slower on other Mac's including a new i5 iMac).
>
> Are there other time formatting functions or strategies that would be faster?
>
> Sample data:
> 	TIMECOL	<- rep("2/1/2011 13:25:01", 100)
>
> Any tips would be appreciated.
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



