[R] Date_Time detected as Duplicated (but they are not!)

Agustin Lobo Agustin.Lobo at ictja.csic.es
Wed May 18 07:53:04 CEST 2011


I have a problem with date-time stamps that R reports as duplicated,
although they do not look duplicated to me.

I read a file with observations taken every 30 minutes:

> aur2009=read.csv(paste(datadir,"AUR_ECPP_2009.csv",sep="/"),sep=";",stringsAsFactors=F)
> aur2009[1:3,1:5]
      Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 1/1/2009 0:00        0           NaN      5.86            NaN
2 1/1/2009 0:30        0           NaN      5.05            NaN
3 1/1/2009 1:00        0           NaN      5.56            NaN

> delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M")
> aur2009[,1]=as.POSIXct(delme)
> aur2009[1:3,1:5]
            Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 2009-01-01 00:00:00        0           NaN      5.86            NaN
2 2009-01-01 00:30:00        0           NaN      5.05            NaN
3 2009-01-01 01:00:00        0           NaN      5.56            NaN

> aur2009ts = ts(aur2009)
> row.names(aur2009ts) = as.character(delme)
> aur2009ts[1:3,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-01-01 00:00:00 1230764400        0           NaN      5.86            NaN
2009-01-01 00:30:00 1230766200        0           NaN      5.05            NaN
2009-01-01 01:00:00 1230768000        0           NaN      5.56            NaN

Then:
> aur2009z = zoo(aur2009[,2:12],as.POSIXct(delme))
Warning message:
In zoo(aur2009[, 2:12], as.POSIXct(delme)) :
  some methods for “zoo” objects do not work if the index entries in
‘order.by’ are not unique

So I investigate:
> any(duplicated(aur2009ts[,1]))
[1] TRUE

> aur2009ts[(duplicated(aur2009ts[,1])),1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 02:00:00 1238284800        0           NaN       1.2            NaN
2009-03-29 02:30:00 1238286600        0           NaN       1.2            NaN

But note the surprise:
> aur2009ts[aur2009ts[,1]==1238284800,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:00:00 1238284800        0           NaN     -0.58            NaN
2009-03-29 02:00:00 1238284800        0           NaN      1.20            NaN
> aur2009ts[aur2009ts[,1]==1238286600,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:30:00 1238286600        0           NaN     -0.34            NaN
2009-03-29 02:30:00 1238286600        0           NaN      1.20            NaN
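
One way to look at this is to decode the shared numbers back into date-times. The sketch below assumes the Date.Time values are seconds since 1970-01-01 UTC (the way POSIXct is stored) and that the session runs in a Central European time zone; "Europe/Madrid" is an assumption, not something stated in the session above.

```r
# Epoch values copied from the output above
secs <- c(1238284800, 1238286600)

# Rebuild POSIXct objects from seconds since the epoch
x <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Print the same instants in UTC and in the assumed local zone
utc_txt   <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
local_txt <- format(x, "%Y-%m-%d %H:%M:%S %Z", tz = "Europe/Madrid")
print(data.frame(secs, utc_txt, local_txt))
```

If the local column reads 01:00 and 01:30, then the rows labelled 02:00 and 02:30 were stored as instants one hour earlier than their labels suggest.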

The dates detected as duplicated are actually different times that got
the same value in the ts version of the object!
What am I doing wrong? They are all observations taken every 30 min, so
why are these two encoded as the same time?
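
A possible explanation, offered as an assumption rather than a confirmed diagnosis: clocks in Central Europe jumped from 02:00 to 03:00 on 2009-03-29, so 02:00 and 02:30 do not exist as local times and the conversion to POSIXct has to map them somewhere. If the logger records in a fixed offset without summer time, parsing with an explicit zone that has no DST should keep the stamps distinct. The sketch below uses a few hypothetical stamps in the file's format, with tz = "UTC" standing in for such a fixed-offset zone.

```r
# Hypothetical half-hourly stamps around the 2009-03-29 clock change
stamps <- c("3/29/2009 1:00", "3/29/2009 1:30",
            "3/29/2009 2:00", "3/29/2009 2:30", "3/29/2009 3:00")

# Parse in a zone without DST: every wall-clock time stays distinct
idx_utc <- as.POSIXct(strptime(stamps, "%m/%d/%Y %H:%M", tz = "UTC"))
print(any(duplicated(idx_utc)))
print(diff(idx_utc))

# Parse in a zone with DST (assumed zone); how the non-existent
# 02:00 and 02:30 stamps are handled is platform dependent
idx_local <- as.POSIXct(strptime(stamps, "%m/%d/%Y %H:%M", tz = "Europe/Madrid"))
print(data.frame(stamps,
                 utc   = as.numeric(idx_utc),
                 local = as.numeric(idx_local)))
```

An index built this way could then be passed to zoo() in place of as.POSIXct(delme), provided the fixed-offset assumption actually holds for the data.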

Any help appreciated

Agus


