[R] extract date time from a text file

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jun 18 19:17:23 CEST 2010


On Fri, Jun 18, 2010 at 10:58 AM, Sebastian Kruk
<residuo.solow at gmail.com> wrote:
> I have a text file where every line is like this:
>
> "2007-12-03 13:50:17 Juan Perez"
> ("yy-mm-dd hh:mm:ss First Name Second Name")
>
> I would like to make a data frame with two columns, one for the date and
> the other one for the name.


Suppose this is your data:

Lines <- "2007-12-03 13:50:17 Juan Perez
2008-12-03 13:50:17 Juanita Perez"

# 1. sub
# Turn the first two spaces on each line into commas, then read the
# result into the data frame out

L <- readLines(textConnection(Lines))
L <- sub(" ", ",", L)
L <- sub(" ", ",", L)
out <- read.table(textConnection(L), sep = ",", as.is = TRUE)
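
As a quick check of what the two sub calls do, here is a small sketch on a single line (x, x1 and x2 are names made up for this illustration). sub replaces only the first match, so each call turns the leftmost remaining space into a comma and the space inside the name is left alone:

```r
x  <- "2007-12-03 13:50:17 Juan Perez"
x1 <- sub(" ", ",", x)    # first space becomes a comma
x2 <- sub(" ", ",", x1)   # next remaining space becomes a comma
print(x1)   # "2007-12-03,13:50:17 Juan Perez"
print(x2)   # "2007-12-03,13:50:17,Juan Perez"
```

That is why read.table with sep = "," then sees exactly three fields per line.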

# 1a. strapply
# An alternative is strapply in the gsubfn package, which here returns
# a character matrix with one row per line and one column per field

library(gsubfn)
L <- readLines(textConnection(Lines))
out <- strapply(L, "(\\S+) (\\S+) (.*)", c, simplify = rbind)
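
To see what the pattern passed to strapply above captures without installing gsubfn, here is a rough base R sketch using regexec and regmatches (this is a different mechanism from strapply, and x, m and fields are names made up for the illustration). The two \\S+ groups pick up the date and the time, and (.*) takes the rest of the line, spaces included:

```r
x <- "2007-12-03 13:50:17 Juan Perez"
m <- regmatches(x, regexec("(\\S+) (\\S+) (.*)", x))[[1]]
# m[1] is the whole match; the remaining elements are the capture groups
fields <- m[-1]
print(fields)   # "2007-12-03" "13:50:17" "Juan Perez"
```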


# Convert the result of 1 or 1a to a data frame with chron dates and times
# (the value column holds the name)
library(chron)
data.frame(date = dates(as.chron(out[,1])),
	time = times(out[,2]),
	value = out[,3],
	stringsAsFactors = FALSE)

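# 2. datetime
# The above may not be the best representation. In almost all cases you
# want to keep the date and time together and extract only the portion
# you need at the last minute during processing.

# Using sub as in 1: turn the first two spaces into commas, then turn the
# first comma back into a space, so that only the space before the name
# ends up as a comma.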

L <- readLines(textConnection(Lines))
L <- sub(" ", ",", L)
L <- sub(" ", ",", L)
L <- sub(",", " ", L)
out <- read.table(textConnection(L), sep = ",", as.is = TRUE)
library(chron)
data.frame(datetime = as.chron(out[,1]), value = out[,2],
	stringsAsFactors = FALSE)
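
If you would rather not depend on chron, the same idea can be sketched in base R with POSIXct. This swaps in a different technique from the code above, and it assumes that every line starts with a 19-character yyyy-mm-dd hh:mm:ss stamp and that treating the times as UTC is acceptable for your data. The names stamp, name, dat, day and hhmm are made up for this illustration:

```r
Lines <- "2007-12-03 13:50:17 Juan Perez
2008-12-03 13:50:17 Juanita Perez"
L <- readLines(textConnection(Lines))

stamp <- substr(L, 1, 19)   # characters 1-19: date and time
name  <- substring(L, 21)   # character 21 onward: the name

# keep date and time together in a single POSIXct column
dat <- data.frame(
  datetime = as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
  name = name,
  stringsAsFactors = FALSE)

# pull out the pieces only when they are needed
day  <- as.Date(dat$datetime, tz = "UTC")
hhmm <- format(dat$datetime, "%H:%M")
```

Keeping one datetime column means sorting, differencing and subsetting by time all work on a single value, and the date or the time of day can still be derived on demand.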

# 3. time series
# Is this also a time series?  If so, you would be better off
# representing it as one, e.g. as a zoo object indexed by chron date/times.

L <- readLines(textConnection(Lines))
L <- sub(" ", ",", L)
L <- sub(" ", ",", L)
L <- sub(",", " ", L)
library(zoo)
library(chron)
read.zoo(textConnection(L), sep = ",", FUN = as.chron)


