[R] column of dates into time series

David Winsemius dwinsemius at comcast.net
Sun Nov 29 19:51:14 CET 2009


On Nov 29, 2009, at 12:41 PM, David Winsemius wrote:

>
> On Nov 29, 2009, at 11:59 AM, DispersionMap wrote:
>
>>
>> ? like this do you mean...
>
> Yes. Exactly. Unfortunately that data was passed through the summary
> function which has done some odd things to the data, to wit:
>
> > Weeks[Weeks==189]
> 2007-03-26 2007-07-09 2007-11-05 2008-02-25 2008-09-08 2009-08-10
>       189        189        189        189        189        189

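Actually those were not so odd. I was just interpreting them
incorrectly. Weeks holds the count of events per week, and summary on a
factor returns the same counts as table on a factor (at least while the
factor has no more than maxsum, by default 100, levels; beyond that
summary keeps the most frequent levels in decreasing order and lumps the
rest into "(Other)", which is what happened to your data). So
Weeks[Weeks==189] was just giving me the set of weeks that shared the
same count in the original dataset, and a further table operation gives
counts of counts. Your dataset may not be the best thing to work with as
an example if it has sum(Weeks) = 34377 entries, so here is a smaller
one: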

 > wk2 <- Sys.Date() -round(150*runif(150))

 > options(width=60)
 > summary(wk2)
         Min.      1st Qu.       Median         Mean
"2009-07-02" "2009-08-10" "2009-09-20" "2009-09-18"
      3rd Qu.         Max.
"2009-10-27" "2009-11-28"
 > table(summary(cut(wk2, breaks="weeks")))

  1  2  3  4  5  6  7  9 11 12 14
  1  1  2  1  5  2  1  5  2  1  1
 > table(table(cut(wk2, breaks="weeks")))

  1  2  3  4  5  6  7  9 11 12 14
  1  1  2  1  5  2  1  5  2  1  1
 > summary(cut(wk2, breaks="weeks"))
2009-06-29 2009-07-06 2009-07-13 2009-07-20 2009-07-27
          1          9          5          4         12
2009-08-03 2009-08-10 2009-08-17 2009-08-24 2009-08-31
          6          9          5          5          5
2009-09-07 2009-09-14 2009-09-21 2009-09-28 2009-10-05
          3         14          5          2          6
2009-10-12 2009-10-19 2009-10-26 2009-11-02 2009-11-09
          7         11          9         11          9
2009-11-16 2009-11-23
          9          3
 > table(cut(wk2, breaks="weeks"))

2009-06-29 2009-07-06 2009-07-13 2009-07-20 2009-07-27
          1          9          5          4         12
2009-08-03 2009-08-10 2009-08-17 2009-08-24 2009-08-31
          6          9          5          5          5
2009-09-07 2009-09-14 2009-09-21 2009-09-28 2009-10-05
          3         14          5          2          6
2009-10-12 2009-10-19 2009-10-26 2009-11-02 2009-11-09
          7         11          9         11          9
2009-11-16 2009-11-23
          9          3
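
For the original question (weekly counts in date order, ready to plot),
something along these lines may help. It is only a sketch: the dates
are simulated because data$Raised.Date was never posted, and the names
dates, wk, counts, s, s_all and Weeks_by_date are made up for
illustration. It also shows that Weeks[order(Weeks)] would sort by
count, so a summary-style vector has to be ordered by its names to get
it back into date order.

```r
# Simulated stand-in for data$Raised.Date (hypothetical data, roughly 3.5 years).
set.seed(1)
dates <- as.Date("2006-04-03") + sample(0:1300, 5000, replace = TRUE)

# cut() on a Date vector returns a factor whose levels are week-start dates,
# already in chronological order.
wk <- cut(dates, breaks = "weeks")

# table() keeps every level, in level (i.e. date) order.
counts <- table(wk)

# summary() on a factor with more than maxsum (default 100) levels keeps the
# most frequent maxsum - 1 levels, sorted by decreasing count, plus "(Other)".
s <- summary(wk)

# Raising maxsum makes summary() agree with table() again.
s_all <- summary(wk, maxsum = nlevels(wk))

# If only the summary-style vector is available, drop "(Other)" and order by
# the names (converted to Date), not by the counts themselves.
Weeks <- s[names(s) != "(Other)"]
Weeks_by_date <- Weeks[order(as.Date(names(Weeks)))]

# Plot the full weekly counts as a time series.
week_start <- as.Date(names(counts))
plot(week_start, as.vector(counts), type = "l",
     xlab = "Week starting", ylab = "Events per week")
```

Working from table() (or summary() with a larger maxsum) avoids losing
the less frequent weeks to "(Other)" in the first place.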

>
> Can you give us dput on data$Raised.Date? And also explain what  
> further hints you still need, now that you have been advised that  
> table and order are the functions to do the operations you requested?
>
> -- 
> David.
>
>
>
>>
>>> dput(Weeks)
>> structure(c(370L, 342L, 333L, 317L, 308L, 298L, 289L, 269L, 265L,
>> 257L, 254L, 253L, 252L, 249L, 243L, 243L, 239L, 239L, 236L, 234L,
>> 233L, 232L, 230L, 230L, 229L, 229L, 229L, 228L, 227L, 226L, 225L,
>> 222L, 218L, 217L, 216L, 215L, 215L, 214L, 214L, 214L, 212L, 211L,
>> 209L, 209L, 208L, 207L, 205L, 205L, 204L, 204L, 203L, 202L, 202L,
>> 201L, 200L, 199L, 197L, 197L, 197L, 197L, 195L, 194L, 194L, 194L,
>> 193L, 193L, 193L, 193L, 192L, 191L, 190L, 190L, 190L, 189L, 189L,
>> 189L, 189L, 189L, 189L, 188L, 188L, 187L, 187L, 186L, 183L, 182L,
>> 182L, 181L, 180L, 180L, 180L, 179L, 179L, 179L, 178L, 178L, 177L,
>> 177L, 177L, 13091L), .Names = c("2007-12-17", "2009-01-05",  
>> "2008-06-09",
>> "2008-12-08", "2009-02-09", "2008-12-01", "2008-05-12", "2009-02-16",
>> "2007-01-22", "2008-06-02", "2007-01-29", "2008-05-19", "2007-06-11",
>> "2008-06-16", "2008-05-26", "2008-06-23", "2008-11-03", "2009-01-12",
>> "2008-07-21", "2007-02-05", "2008-02-18", "2008-07-14", "2008-01-14",
>> "2008-10-27", "2007-12-10", "2008-03-17", "2008-08-04", "2008-11-24",
>> "2006-12-18", "2007-11-26", "2007-11-12", "2006-11-06", "2007-06-25",
>> "2006-04-03", "2008-01-07", "2006-04-10", "2008-07-28", "2006-05-08",
>> "2006-06-05", "2009-02-23", "2007-10-22", "2007-02-19", "2008-06-30",
>> "2009-02-02", "2007-06-04", "2007-12-03", "2006-11-13", "2007-09-03",
>> "2006-08-28", "2008-07-07", "2007-05-14", "2006-08-14", "2007-04-16",
>> "2006-07-31", "2008-12-15", "2006-09-11", "2006-06-12", "2008-01-21",
>> "2008-04-07", "2009-01-26", "2008-02-11", "2007-04-02", "2007-04-09",
>> "2008-04-21", "2006-08-07", "2007-11-19", "2008-04-14", "2008-05-05",
>> "2006-07-24", "2007-05-21", "2006-06-19", "2006-10-09", "2007-02-12",
>> "2007-03-26", "2007-07-09", "2007-11-05", "2008-02-25", "2008-09-08",
>> "2009-08-10", "2008-09-15", "2009-06-08", "2006-05-15", "2007-07-02",
>> "2009-06-01", "2008-11-17", "2006-06-26", "2009-06-29", "2007-08-06",
>> "2007-08-13", "2009-01-19", "2009-07-13", "2006-04-17", "2007-03-05",
>> "2007-12-24", "2006-10-16", "2008-08-11", "2006-05-29", "2006-11-27",
>> "2007-10-29", "(Other)"))
>>
>>
>>
>>
>> David Winsemius wrote:
>>>
>>> How about a representation of the data that one could do something
>>> with? By that I mean either by the "dump" method described in the
>>> Posting Guide or by using dput:
>>>> ttt <- c(1,2)
>>>
>>>> dput(ttt)
>>> c(1, 2)
>>>
>>>> dump("ttt", stdout() )
>>> ttt <-
>>> c(1, 2)
>>>
>>> Did you honestly expect anyone in their right mind to reassemble
>>> that data from the console text output???
>>>
>>> On Nov 29, 2009, at 10:46 AM, DispersionMap wrote:
>>>
>>>>
>>>> Thanks again David,
>>>>
>>>> Here's what happened:
>>>>
>>>>> Weeks<-summary(cut(data$Raised.Date, breaks="weeks"))
>>>>> Weeks
>>>> 2007-12-17 2009-01-05 2008-06-09 2008-12-08 2009-02-09 2008-12-01
>>>>     370        342        333        317        308        298
>>>> 2008-05-12 2009-02-16 2007-01-22 2008-06-02 2007-01-29 2008-05-19
>>>>     289        269        265        257        254        253
>>>> 2007-06-11 2008-06-16 2008-05-26 2008-06-23 2008-11-03 2009-01-12
>>>>     252        249        243        243        239        239
>>>> 2008-07-21 2007-02-05 2008-02-18 2008-07-14 2008-01-14 2008-10-27
>>>>     236        234        233        232        230        230
>>>> 2007-12-10 2008-03-17 2008-08-04 2008-11-24 2006-12-18 2007-11-26
>>>>     229        229        229        228        227        226
>>>> 2007-11-12 2006-11-06 2007-06-25 2006-04-03 2008-01-07 2006-04-10
>>>>     225        222        218        217        216        215
>>>> 2008-07-28 2006-05-08 2006-06-05 2009-02-23 2007-10-22 2007-02-19
>>>>     215        214        214        214        212        211
>>>> 2008-06-30 2009-02-02 2007-06-04 2007-12-03 2006-11-13 2007-09-03
>>>>     209        209        208        207        205        205
>>>> 2006-08-28 2008-07-07 2007-05-14 2006-08-14 2007-04-16 2006-07-31
>>>>     204        204        203        202        202        201
>>>> 2008-12-15 2006-09-11 2006-06-12 2008-01-21 2008-04-07 2009-01-26
>>>>     200        199        197        197        197        197
>>>> 2008-02-11 2007-04-02 2007-04-09 2008-04-21 2006-08-07 2007-11-19
>>>>     195        194        194        194        193        193
>>>> 2008-04-14 2008-05-05 2006-07-24 2007-05-21 2006-06-19 2006-10-09
>>>>     193        193        192        191        190        190
>>>> 2007-02-12 2007-03-26 2007-07-09 2007-11-05 2008-02-25 2008-09-08
>>>>     190        189        189        189        189        189
>>>> 2009-08-10 2008-09-15 2009-06-08 2006-05-15 2007-07-02 2009-06-01
>>>>     189        188        188        187        187        186
>>>> 2008-11-17 2006-06-26 2009-06-29 2007-08-06 2007-08-13 2009-01-19
>>>>     183        182        182        181        180        180
>>>> 2009-07-13 2006-04-17 2007-03-05 2007-12-24 2006-10-16 2008-08-11
>>>>     180        179        179        179        178        178
>>>> 2006-05-29 2006-11-27 2007-10-29    (Other)
>>>>     177        177        177      13091
>>>>
>>>>
>>>> As you can see, it's come out a little disorderly though.
>>>> I have to plot the number of events under each date in a time series.
>>>>
>>>> How do I order the dates and their counts?
>>>
>>> Well, you don't really have dates anymore, do you? You have week
>>> numbers with labels that look like dates. So ordering them is a
>>> piece of cake given the laws of mathematics.
>>>
>>> Weeks[order(Weeks)]  #untested... no reproducible data
>>>
>>> Aggregating them into counts can be done with  various functions,
>>> the most basic of which is table:
>>>
>>> table(Weeks)
>>>
>>> -- 
>>> David.
>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> David Winsemius wrote:
>>>>>
>>>>>
>>>>> On Nov 29, 2009, at 7:52 AM, Linlin Yan wrote:
>>>>>
>>>>>> There is no year() function. Maybe you can try format() instead.
>>>>>>
>>>>>> On Sun, Nov 29, 2009 at 8:44 PM, DispersionMap
>>>>>> <frenchcr at btinternet.com> wrote:
>>>>>>>
>>>>>>> i have a column of dates in this format:
>>>>>>>
>>>>>>> data[,"Raised.Date"] <- as.Date(data[,"Raised.Date"], "%d/%m/%Y");
>>>>>>> data[1:10,"Raised.Date"]
>>>>>>> [1] "2006-07-07" "2006-07-07" "2006-04-03" "2006-04-03"
>>>>>>> "2006-04-03"
>>>>>>> "2006-04-03" "2006-04-03" "2006-04-03" "2006-04-03" "2006-04-03"
>>>>>>>
>>>>>>> I can turn them into months like this...
>>>>>>>
>>>>>>> Month<-months(data[,"Raised.Date"])
>>>>>>> Month[1:10]
>>>>>>> [1] "July"  "July"  "April" "April" "April" "April" "April"  
>>>>>>> "April"
>>>>>>> "April"
>>>>>>> "April"
>>>>>>>
>>>>>>>
>>>>>>> But I also want to turn them into years (and also weeks later on),
>>>>>>> so I tried this...
>>>>>
>>>>> library(chron)
>>>>> ?cut.dates
>>>>>
>>>>> The argument breaks has several options including one of
>>>>> c("days", "weeks", "months", "year")
>>>>>
>>>>>> dts <- Sys.Date() - 1:20
>>>>>
>>>>>> cut(dts, breaks="weeks")
>>>>> [1] 2009-11-23 2009-11-23 2009-11-23 2009-11-23 2009-11-23
>>>>> 2009-11-23 2009-11-16 2009-11-16 2009-11-16 2009-11-16
>>>>> [11] 2009-11-16 2009-11-16 2009-11-16 2009-11-09 2009-11-09
>>>>> 2009-11-09
>>>>> 2009-11-09 2009-11-09 2009-11-09 2009-11-09
>>>>> Levels: 2009-11-09 2009-11-16 2009-11-23
>>>>>
>>>>> I was a bit puzzled when I tried calling cut.dates directly as a
>>>>> function, which throws a function-not-found error.
>>>>>
>>>>>
>>>>>>>
>>>>>>> Year<-year(data[,"Raised.Date"])
>>>>>>> Error: could not find function "year"
>>>>>>
>>>>>
>>>>>
>>>>> David Winsemius, MD
>>>>> Heritage Laboratories
>>>>> West Hartford, CT
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>
>>>> -- 
>>>> View this message in context:
>>>> http://n4.nabble.com/column-of-dates-into-time-series-tp930699p930744.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>
>>
>> -- 
>> View this message in context: http://n4.nabble.com/column-of-dates-into-time-series-tp930699p930777.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



