[R] Strptime/ date time classes

Caroline Keef caroline.keef at jbaconsulting.co.uk
Wed Jul 9 19:18:55 CEST 2008


Thank you, but why does this happen?

a =(1:223960)[is.na(datetimes)]
datetimes[a]
 [1] "1981-03-29 01:20:00" "1990-03-25 01:43:00" "1992-03-29 01:43:00"
 [4] "1996-03-31 01:30:00" "1996-03-31 01:57:00" "1997-03-30 01:02:00"
 [7] "1997-03-30 01:14:00" "1997-03-30 01:27:00" "1997-03-30 01:44:00"
[10] "1997-03-30 01:55:00" "1998-03-29 01:16:00" "1998-03-29 01:41:00"
[13] "1998-03-29 01:56:00" "1999-03-28 01:03:00" "1999-03-28 01:18:00"
[16] "2000-03-26 01:28:00"

Which obviously aren't missing.

I do want POSIXlt as I need to extract the day of the month (I'm
extracting daily maxima from irregularly observed time series).
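
The daily-maxima step described here could be sketched roughly as follows. This is only an illustration, assuming the timestamps are in UTC (as the TZUTC+0 entry in the file header suggests). The names stamp, flow, when, day and daily_max are made up for the example, and the rows are a subset of the sample data quoted further down in this message.

```r
# Subset of the sample rows from the data file quoted later in this message
stamp <- c("19800604062759", "19800604163300", "19800604185100",
           "19800611110900", "19800611111000", "19800611211400",
           "19800612000000", "19800612065000", "19800612133400")
flow  <- c(-777.0, 0.510, 0.526, -777.0, 0.100, 0.096, 0.096, 0.098, 0.100)

# Negative values (the -777 invalid-value code) become NA, as in the original post
flow[flow < 0] <- NA

# Assumption: timestamps are UTC, so no local daylight-saving rules apply
when <- as.POSIXct(strptime(stamp, "%Y%m%d%H%M%S", tz = "UTC"))

# Group by calendar day and take the maximum, ignoring missing values
day <- format(when, "%Y-%m-%d")
daily_max <- tapply(flow, day, max, na.rm = TRUE)
print(daily_max)
```

Grouping on a formatted date string keeps the example independent of the POSIXlt class; as.POSIXlt(when)$mday would give the day of the month alone, if that is what is needed.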

This seems like a bug to me, I just thought I'd check with people who
know more than I do.
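
One possible explanation, not confirmed anywhere in this thread: all sixteen timestamps fall between 01:00 and 01:59 on the last Sunday of March, which is the hour skipped when UK clocks go forward. If the session's time zone follows UK rules, those local times do not exist, and converting them can yield NA. The sketch below assumes a "Europe/London" session time zone and assumes the data really are UTC, as the TZUTC+0 header entry suggests. Only the middle timestamp comes from the data above; the other two are invented for contrast.

```r
# Assumption: the original session used UK local time ("Europe/London").
# "19970330010200" is from the post; the other two stamps are invented.
stamp <- c("19970330005900", "19970330010200", "19970330020000")

# Parsed as UK local time: 01:02 on 1997-03-30 lies in the skipped hour,
# so how it converts is platform-dependent (NA on some systems).
uk <- as.POSIXct(strptime(stamp, "%Y%m%d%H%M%S", tz = "Europe/London"))
print(is.na(uk))

# Parsed as UTC: there is no daylight-saving transition, so every time exists.
utc <- as.POSIXct(strptime(stamp, "%Y%m%d%H%M%S", tz = "UTC"))
print(is.na(utc))

# The day of the month is still available through POSIXlt
mday <- as.POSIXlt(utc)$mday
print(mday)
```

If this is indeed the cause, passing tz = "UTC" (or "GMT") to strptime would be one way to avoid the missing values.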

Caroline


-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: 09 July 2008 17:24
To: Caroline Keef
Cc: r-help at r-project.org
Subject: Re: [R] Strptime/ date time classes


You probably want POSIXct instead of POSIXlt:

> x <- read.table(textConnection("#TZUTC+0|*|SANR08002|*|SNAMENAUL|*|SWATERDELVIN|*|CNR98808|*|
+ #CNAMEQ|*|CTYPEn-min-ip|*|CMW1440|*|RTIMELVLhigh-resolution|*|
+ #CUNITm3/s|*|RINVAL-777|*|RNR-1|*|REXCHANGE98913|*|
+ #RTYPEinstantaneous values|*|
+ 19800604062759 -777.0
+ 19800604062800 0.271
+ 19800604111900 0.286
+ 19800604134300 0.362
+ 19800604144400 0.465
+ 19800604163300 0.510
+ 19800604175400 0.518
+ 19800604185100 0.526
+ 19800611110900 -777.0
+ 19800611110959 -777.0
+ 19800611111000 0.100
+ 19800611211400 0.096
+ 19800612000000 0.096
+ 19800612065000 0.098
+ 19800612133400 0.100"),colClasses=c('character','numeric'))
> closeAllConnections()
> # you probably want POSIXct not POSIXlt
> datetimes <- as.POSIXct(strptime(x[,1], "%Y%m%d%H%M%S"))
> str(datetimes)
 POSIXct[1:15], format: "1980-06-04 06:27:59" "1980-06-04 06:28:00"
"1980-06-04 11:19:00" ...
> length(datetimes)
[1] 15
>
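
The difference between the two classes can be illustrated with a short sketch. POSIXct stores each date-time as a single number, while POSIXlt is a list of components (seconds, minutes, hours, day of month and so on). That appears to be why length() reported 9 in the R 2.7.0 session quoted below; later R releases are understood to report the number of date-times instead. The variable names ct and lt are made up, and tz = "UTC" is an assumption based on the file header.

```r
# Three timestamps taken from the sample data; tz = "UTC" is an assumption
ct <- as.POSIXct(strptime(c("19800604062800", "19800611111000", "19800612133400"),
                          "%Y%m%d%H%M%S", tz = "UTC"))

# POSIXct: one number per date-time (seconds since 1970-01-01 UTC)
print(is.numeric(unclass(ct)))
print(length(ct))

# POSIXlt: a list with one vector per component
lt <- as.POSIXlt(ct)
print(is.list(unclass(lt)))
print(names(unclass(lt)))

# Day of the month, obtained by converting only when it is needed
print(lt$mday)
```

Keeping the data as POSIXct and converting to POSIXlt only to read a component such as mday gives both the compact storage and the day-of-month access.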


On Wed, Jul 9, 2008 at 6:09 AM, Caroline Keef
<caroline.keef at jbaconsulting.co.uk> wrote:
> Dear all,
>
> I've come across a problem using strptime, can anyone explain what's 
> going on?  I'm using version 2.7.0 on Windows XP.
>
> Thank you
>
> Caroline
>
> First read in a data file using read.table
>
> alldata = read.table(file, header=F, skip=4, colClasses =
> c("character","numeric"))
>
> dim(alldata)
> [1] 223960      2
>
> # inefficient, safe way of sorting out missing or dodgy data
>
> alldata[,2][alldata[,2] < 0] = NA
>
> # first ten lines of the data
>
>  alldata[1:10,]
>               V1    V2
> 1  19800604062759    NA
> 2  19800604062800 0.271
> 3  19800604111900 0.286
> 4  19800604134300 0.362
> 5  19800604144400 0.465
> 6  19800604163300 0.510
> 7  19800604175400 0.518
> 8  19800604185100 0.526
> 9  19800611110900    NA
> 10 19800611110959    NA
>
> #Then convert the first column using strptime
>
> datetimes = strptime(alldata[,1],format="%Y%m%d%H%M%S")
>
> #Then I want to get minimum and maximum, but some seem to be missing 
> when they aren't.
>
> length(as.POSIXlt(datetimes))  #also equal to length(datetimes)
>
> [1] 9
>
> # Why isn't this 223960?  Is it something to do with the class?
>
> # This is the really puzzling bit (to me anyway)
>
> a =(1:223960)[is.na(datetimes)]
>
> # which gives
> 1462  14295  18744  50499  50500  92472  92473  92474  92475  92476 
> 137525 137526 137527 171066 171067 192353
>
> # 16 values
>
>  alldata[a,]
>                   V1    V2
> 1462   19810329012000 0.983
> 14295  19900325014300 0.219
> 18744  19920329014300 0.246
> 50499  19960331013000 0.564
> 50500  19960331015700 0.563
> 92472  19970330010200 0.173
> 92473  19970330011400 0.172
> 92474  19970330012700 0.172
> 92475  19970330014400 0.172
> 92476  19970330015500 0.172
> 137525 19980329011600 0.427
> 137526 19980329014100 0.427
> 137527 19980329015600 0.427
> 171066 19990328010300 0.223
> 171067 19990328011800 0.223
> 192353 20000326012800 0.189
>
>  datetimes[a]
>  [1] "1981-03-29 01:20:00" "1990-03-25 01:43:00" "1992-03-29 01:43:00"
>  [4] "1996-03-31 01:30:00" "1996-03-31 01:57:00" "1997-03-30 01:02:00"
>  [7] "1997-03-30 01:14:00" "1997-03-30 01:27:00" "1997-03-30 01:44:00"
> [10] "1997-03-30 01:55:00" "1998-03-29 01:16:00" "1998-03-29 01:41:00"
> [13] "1998-03-29 01:56:00" "1999-03-28 01:03:00" "1999-03-28 01:18:00"
> [16] "2000-03-26 01:28:00"
>
> # They're all around the end of March!  I've looked at the data file 
> and I can't see anything funny in it around these dates.
>
>
>
> The first few lines of the data file look like
>
> #TZUTC+0|*|SANR08002|*|SNAMENAUL|*|SWATERDELVIN|*|CNR98808|*|
> #CNAMEQ|*|CTYPEn-min-ip|*|CMW1440|*|RTIMELVLhigh-resolution|*|
> #CUNITm3/s|*|RINVAL-777|*|RNR-1|*|REXCHANGE98913|*|
> #RTYPEinstantaneous values|*|
> 19800604062759 -777.0
> 19800604062800 0.271
> 19800604111900 0.286
> 19800604134300 0.362
> 19800604144400 0.465
> 19800604163300 0.510
> 19800604175400 0.518
> 19800604185100 0.526
> 19800611110900 -777.0
> 19800611110959 -777.0
> 19800611111000 0.100
> 19800611211400 0.096
> 19800612000000 0.096
> 19800612065000 0.098
> 19800612133400 0.100
>
>
>
>
>
> Caroline Keef
> JBA Consulting
> South Barn, Broughton Hall, Skipton, North Yorkshire, BD23 3AE, UK
> t: +44 (0)1756 799919  f: +44 (0)1756 799449
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


