[R] Vectors of years, months, and days to dates?

Roger Bivand Roger.Bivand at nhh.no
Mon Jun 7 21:00:13 CEST 2004


On Mon, 7 Jun 2004, Shin, Daehyok wrote:

> Is what I asked such an exceptional case?
> A lot of data I am dealing with (usually hydrologic data) are recorded with
> three columns of years, months, and days.
> In my opinion, direct conversion from numeric vectors of years, months
> and days into some internal representation of dates
> is widely supported in most matrix-oriented languages (e.g. datenum() in
> MATLAB).
> Am I really asking for an odd operation for date conversion?
> 
> For performance, you are right. The loss of performance is negligible when
> users call it directly.
> But, what happens when it is called in an iterative loop?
> How can we assume some functions are only called directly by users in an
> interactive shell?

Please read the code inside the functions we have been discussing. You 
will see that there is plenty of error checking, which is needed. In 
addition, paste() is vectorised. Specifically:

> years <- sample(1900:2000, 100000, replace=TRUE)
> months <- sample(1:12, 100000, replace=TRUE)
> days <- sample(1:28, 100000, replace=TRUE)
> system.time(x <- as.Date(as.POSIXlt(paste(years, months, days, sep="-"))))
[1] 5.91 0.02 5.98 0.00 0.00
> f <- function(y, m, d) { # taken from Peter Dalgaard's reply
+ x <- as.POSIXlt(structure(rep(0,length(y)),class="POSIXct"))
+ x$year <- y-1900
+ x$mon <- m-1
+ x$mday <- d
+ as.Date(as.POSIXct(x))
+ }
> system.time(y <- f(years, months, days))
[1] 2.76 0.01 2.78 0.00 0.00
> identical(x, y)
[1] TRUE

So for 100000 dates, direct insertion of the integers is only about twice as
fast (2.76 s against 5.91 s of user time), and it took me much longer than
the roughly 3 seconds saved to find that out.
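
Prof. Ripley's earlier point, that a long vector of dates is hardly going to
be unique, can be sketched as follows. This is only a minimal illustration:
the helper name ymd_to_date() is hypothetical and not part of base R, and the
explicit format is my own choice so that impossible dates come back as NA
rather than stopping the conversion.

```r
# Hypothetical helper (not in base R): convert year/month/day vectors to
# Date while parsing each distinct date string only once.
ymd_to_date <- function(y, m, d) {
  key <- paste(y, m, d, sep = "-")   # paste() is vectorised
  u <- unique(key)                   # distinct date strings only
  # With an explicit format, impossible dates such as 31 February give NA
  parsed <- as.Date(u, format = "%Y-%m-%d")
  parsed[match(key, u)]              # expand back to the full length
}

years  <- c(1991, 1992, 1991, 2001)
months <- c(1, 10, 1, 2)
days   <- c(1, 2, 1, 31)
dates  <- ymd_to_date(years, months, days)
```

Whether the unique() step pays off depends on how many repeated dates the
data contain, so it is worth timing on real data with system.time().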

> 
> Please consider positively this kind of simple interface for as.Date.
> I am quite sure this operation will be used widely once implemented.
> 
> as.Date(c(years, months, days))

I have a feeling that won't do what you want, really. Looking at
as.POSIXlt and as.Date, you could create a variant that recognises an integer
matrix with your special class as having years in column 1, etc., but I
think that your implementation and support costs will outweigh the
saving you think you are making.
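
One reason the c() form is awkward, shown as a small sketch with the example
values from earlier in the thread: c() concatenates its arguments into a
single flat vector, so nothing marks which elements are years, months, or
days. A matrix built with cbind() keeps that structure. The names flat and
ymd are only illustrative.

```r
years  <- c(1991, 1992)
months <- c(1, 10)
days   <- c(1, 2)

# c() flattens everything into one numeric vector of length 6
flat <- c(years, months, days)
length(flat)
is.null(dim(flat))   # no dimensions, so the column structure is gone

# cbind() keeps one named column per component
ymd <- cbind(year = years, month = months, day = days)
dim(ymd)
```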

The most enjoyable fortunes package has a fitting opinion:

> fortune("Fox")

I think that it's generally a good idea not to resist the most natural way of
programming in R.
   -- John Fox
      R-help (March 2004)
> 
> Is there no one supporting my idea?

Well, you are, so that's a start - contribute an "as.Date.MatrixOfDates" 
method to dispatch on a "MatrixOfDates" class, and you may find others?
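
A minimal sketch of what such a contribution might look like, under stated
assumptions: the constructor MatrixOfDates() and the column order (year,
month, day) are hypothetical, nothing like this exists in base R, and the
method simply reuses the paste() route rather than avoiding the conversion
to character.

```r
# Hypothetical constructor: a three-column integer matrix
# (year, month, day) tagged with the S3 class "MatrixOfDates".
MatrixOfDates <- function(years, months, days) {
  m <- cbind(year  = as.integer(years),
             month = as.integer(months),
             day   = as.integer(days))
  class(m) <- "MatrixOfDates"
  m
}

# S3 method that the as.Date() generic dispatches to for this class.
as.Date.MatrixOfDates <- function(x, ...) {
  m <- unclass(x)   # back to a plain matrix for column indexing
  # An explicit format makes impossible dates come back as NA
  as.Date(paste(m[, 1], m[, 2], m[, 3], sep = "-"), format = "%Y-%m-%d")
}

md  <- MatrixOfDates(c(1991, 1992), c(1, 10), c(1, 2))
res <- as.Date(md)
```

In a package the method would also need to be registered in the NAMESPACE
file; defined at top level in a session, dispatch finds it directly.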

> 
> Daehyok Shin (Peter)
> Terrestrial Hydrological Ecosystem Modellers
> Geography Department
> University of North Carolina-Chapel Hill
> sdhyok at email.unc.edu
> 
> "We can do no great things,
> only small things with great love."
>                          - Mother Teresa
> 
> > -----Original Message-----
> > From: Roger Bivand [mailto:Roger.Bivand at nhh.no]
> > Sent: Monday, June 07, 2004 PM 1:02
> > To: Shin, Daehyok
> > Cc: R, Help
> > Subject: RE: [R] Vectors of years, months, and days to dates?
> >
> >
> > On Mon, 7 Jun 2004, Shin, Daehyok wrote:
> >
> > > > > res <- as.POSIXlt(paste(years, months, days, sep="-"))
> > >
> > > This command still converts numbers to a character vector, right?
> >
> > Yes, as Prof. Ripley said, the overhead of conversion to character and
> > using carefully crafted R functions is much less, for reasonable numbers
> > of dates, than inserting all the appropriate values into the POSIXlt class
> > object, since you need, in addition to years (since 1900), checked
> > month-days (31 February?), and months, week-days, year days, daylight
> > savings time flag, and seconds, minutes and hours.
> >
> > I would also be interested to know whether you can demonstrate
> > (system.time()) that anything reliable is faster than as.POSIXlt() for
> > fewer than millions of dates (and would you trust it on 29 February even
> > if it was faster?). Using base classes and functions is in general much
> > more robust than rolling your own, simply because many more people use
> > them - they have much more data run through them, and the time you might
> > save trying to avoid integer to character conversion will/should be eaten
> > up by debugging.
> >
> > >
> > > > -----Original Message-----
> > > > From: Roger Bivand [mailto:Roger.Bivand at nhh.no]
> > > > Sent: Monday, June 07, 2004 PM 12:20
> > > > To: Daehyok Shin
> > > > Cc: R, Help
> > > > Subject: Re: [R] Vectors of years, months, and days to dates?
> > > >
> > > >
> > > > On Mon, 7 Jun 2004, Daehyok Shin wrote:
> > > >
> > > > > How can I create a POSIXlt object directly from the numbers?
> > > > > I failed to find the solution in the help documents.
> > > >
> > > > ?POSIXlt
> > > >
> > > > ?as.POSIXlt:
> > > >
> > > > > res <- as.POSIXlt(paste(years, months, days, sep="-"))
> > > > > str(res)
> > > > `POSIXlt', format: chr [1:2] "1991-01-01" "1992-10-02"
> > > > > res$year
> > > > [1] 91 92
> > > >
> > > > >
> > > > > Daehyok
> > > > >
> > > > > --On Monday, June 07, 2004 4:44 PM +0100 Prof Brian Ripley
> > > > > <ripley at stats.ox.ac.uk> wrote:
> > > > >
> > > > > > On Mon, 7 Jun 2004, Shin, Daehyok wrote:
> > > > > >
> > > > > >> The interface for dates in R is a little confusing to me.
> > > > > > >> I want to create a vector of Date objects from vectors of years,
> > > > > > >> months, and days.
> > > > > >> One solution I found is:
> > > > > >>
> > > > > >> years <- c(1991, 1992)
> > > > > >> months <- c(1, 10)
> > > > > >> days <- c(1, 2)
> > > > > >>
> > > > > >> dates <- as.Date(ISOdate(years, months, days))
> > > > > >>
> > > > > > >> But, in this solution the ISOdate function converts the vectors
> > > > > > >> into characters, which can cause serious performance and memory
> > > > > > >> loss when the vectors of years, months, and days are huge.
> > > > > >
> > > > > > > Really?  You have measured the loss?  A million causes no problem
> > > > > > > for example, and what are you going to do with a million dates
> > > > > > > that is instantaneous and worthwhile?  And a million dates are
> > > > > > > hardly going to be unique so you only need to convert the unique
> > > > > > > values.
> > > > > >
> > > > > > >> I am quite sure there is a much better solution for it. What is it?
> > > > > >
> > > > > > > Write your own C code, or make a POSIXlt object directly from the
> > > > > > > numbers and convert that.
> > > > > >
> > > > > > --
> > > > > > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > > > > > > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > > > > > University of Oxford,             Tel:  +44 1865 272861 (self)
> > > > > > 1 South Parks Road,                     +44 1865 272866 (PA)
> > > > > > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> > > > > >
> > > > >
> > > > > ______________________________________________
> > > > > R-help at stat.math.ethz.ch mailing list
> > > > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > > > > > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> > > > >
> > > >
> > > > --
> > > > Roger Bivand
> > > > Economic Geography Section, Department of Economics, Norwegian School of
> > > > Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> > > > Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> > > > e-mail: Roger.Bivand at nhh.no
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
