[R] Problem with very old dates

Marc Schwartz MSchwartz at mn.rr.com
Sun Oct 29 17:58:45 CET 2006


On Sun, 2006-10-29 at 10:31 -0600, tom soyer wrote:
> Hi,
> 
> I noticed that as.Date() could not convert date string to date type if the
> dates are very old. For example, if the date string is "1-Mar-50", then
> as.Date() would convert this to "2050-03-01", NOT "1950-03-01". This seems
> to be the behavior of as.Date() for dates older than 1969-1-1, and it is not
> documented in the R as.Date() documentation. It seems very strange that R
> would fail to convert old dates correctly. Does anyone know if this is the
> correct behavior? If so, then which method should one use to convert old
> dates?
> 
> Thanks,
> 
> Tom
> 
> P.S., I am using R 2.4.0 for Windows.

This is covered in ?strftime, which is also noted in the "See Also"
for ?as.Date, where it says:

"Your system's help pages on strftime and strptime to see how to specify
their formats."

In this case, the ?strftime help page in R indicates:

%y
        Year without century (00–99). If you use this on input, which
        century you get is system-specific. So don't! Often values up to
        69 (or 68) are prefixed by 20 and 70 (or 69) to 99 by 19.
        
        
Thus on FC5 Linux, I get:

> as.Date("1-Mar-50", format = "%d-%b-%y")
[1] "2050-03-01"
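
If the two-digit years really cannot be avoided, one possible workaround is to parse with %y anyway and then move any date that lands in the future back by a century. This is only a sketch: fix_century() and its cutoff argument are hypothetical names, not part of R, and it assumes that no date in the data is genuinely meant to fall after the cutoff.

```r
# Hypothetical helper (not part of base R): shift dates that fall after
# 'cutoff' back by 100 years. Assumes no date should be later than 'cutoff'.
fix_century <- function(x, cutoff = Sys.Date()) {
  lt <- as.POSIXlt(x)                  # broken-down time; lt$year is years since 1900
  too_late <- !is.na(x) & x > cutoff   # dates that were given the wrong century
  lt$year[too_late] <- lt$year[too_late] - 100
  as.Date(lt)
}

# Which century %y picks is system-specific, so the input to
# fix_century() may differ from one platform to another.
d <- as.Date("1-Mar-50", format = "%d-%b-%y")
fix_century(d)
```

Dates that already fall on or before the cutoff are returned unchanged, so the helper does no harm on a system that happens to pick the 1900s.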


Ideally, you should change the representation of the year component of
the dates you are working with to a full four-digit year and then use
%Y (capital 'Y') instead of %y:

> as.Date("1-Mar-1950", format = "%d-%b-%Y")
[1] "1950-03-01"

If this data was exported from another data source (e.g., Excel), change
the format in that program prior to exporting.

Otherwise, you could do something like this in R using sub():

> sub("-([0-9]+)$", "-19\\1", "1-Mar-50")
[1] "1-Mar-1950"

This changes the two-digit year ('50') to a four-digit year ('1950'),
on the assumption that every year in the data falls in the 1900s.
See ?sub and ?regexp for more information.
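
Applied to a whole vector, the two steps might be combined roughly as follows. The vector names x and x4 and the sample values are made up for illustration, every year is assumed to belong to the 1900s, and %b is locale-dependent, so the month abbreviations are assumed to match the current locale.

```r
# Illustrative input; every two-digit year is assumed to be 19xx
x <- c("1-Mar-50", "15-Jun-62", "30-Nov-49")

# Prefix the trailing two-digit year with "19"
x4 <- sub("-([0-9]{2})$", "-19\\1", x)

# Parse with %Y (four-digit year); %b assumes the month
# abbreviations match the current locale
as.Date(x4, format = "%d-%b-%Y")
```

Restricting the pattern to exactly two digits keeps strings that already carry a four-digit year from being altered.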

HTH,

Marc Schwartz


