[R] dates in French format

Denis Chabot chabotd at globetrotter.net
Thu Jan 31 22:25:57 CET 2008


Hi all,

The crashes I reported earlier were caused by R 2.6.1 for Mac not
liking the OS date setting "french canada", an issue that has since been
solved (by Simon Urbanek). The crashes did not occur when the OS was
set to use the standard French date formats. With that setting, the
suggestions by Prof Ripley and Gabor all worked nicely.
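
For reference, the lookup-table idea from the quoted thread below can be sketched so that it does not depend on the OS locale at all. This is only a minimal sketch under assumptions: the helper name parse_french_date is hypothetical, the abbreviation vector is the one typed by hand later in this thread, and the month is converted to a number so that "%b" (which is locale-dependent) is never needed.

```r
# Abbreviations as they appear in the data file (an assumption: adjust to
# match whatever the exporting program actually writes).
french_months <- c("janv", "fév", "mars", "avr", "mai", "juin",
                   "juil", "août", "sept", "oct", "nov", "déc")

# Hypothetical helper: turn "9-janv-08" into a Date via a numeric month.
parse_french_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  numeric_form <- vapply(parts, function(p) {
    m <- match(p[2], french_months)          # NA if the abbreviation is unknown
    sprintf("%s-%02d-%s", p[1], m, p[3])     # e.g. "9-01-08"
  }, character(1))
  # Unknown abbreviations yield a string that fails to parse, hence NA.
  as.Date(numeric_form, format = "%d-%m-%y")
}

parse_french_date(c("9-janv-08", "7-déc-07"))
```

The result is a Date vector, which can then be passed to chron if a chron object is needed.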

Now that my dates are a chron object, I have a new problem. The
formatting of the dates on the x axis leaves something to be desired.
Instead of day, month and year, or at the very least day and month, I only
get month and year, so many tick labels are identical. I also get
a warning that puzzles me.

For instance:
 > start <- chron("12/01/2007")
 > other.dates <- seq(1,60,2)
 > Date <- start + other.dates
 > plot(1:length(Date)~Date)

Six ticks appear on the x axis. The first three are labeled "12/07" and
the other three are labeled "01/08". I also get this:

Warning messages:
1: In v[[perm[1]]] : correspondance partielle de 'm' en 'month'
2: In v[[perm[2]]] : correspondance partielle de 'y' en 'year'

so R reports only a partial match of "m" to "month" and
of "y" to "year". Yet "Date" here is a proper chron object, so I
fail to see why the match is only partial.
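
The wording of the warning suggests it comes from `[[` indexing of a list by an abbreviated name, rather than from the dates themselves. The following is a minimal sketch of that mechanism only; the list v and its contents are made up for illustration and are not taken from chron's code.

```r
# Illustrative list with full element names (hypothetical values).
v <- list(month = 12, day = 2, year = 2007)

# exact = TRUE (the default in current R): no partial matching, returns NULL.
exact_lookup <- v[["m"]]

# exact = FALSE: partial matching is allowed silently.
partial_lookup <- v[["m", exact = FALSE]]

# exact = NA: partial matching is allowed, but a warning is issued.
warned <- tryCatch({
  v[["m", exact = NA]]
  FALSE
}, warning = function(w) TRUE)
```

If this is indeed what happens, the warning is about how the list elements are looked up by name and says nothing about whether the dates are valid.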

If I do Date2 <- as.Date(Date) and use this as my x axis, the six  
labels are more usable (déc 03, déc 13, déc 23, jan 02, jan 12, jan 22).

I suppose I can plot without x labels and draw my own, but I had not  
expected it would be necessary.
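
A minimal sketch of that workaround, assuming the dates are first converted to base R Date values (for example with as.Date(Date)); the 10-day tick spacing and the "%d/%m/%y" label format are arbitrary choices for illustration.

```r
# Same dates as in the example above, but built directly as Date values.
dates <- as.Date("2007-12-01") + seq(1, 60, 2)
y <- seq_along(dates)

# Suppress the default x axis, then draw ticks with explicit labels.
plot(dates, y, xaxt = "n", xlab = "Date")
ticks <- seq(min(dates), max(dates), by = "10 days")
tick_labels <- format(ticks, "%d/%m/%y")   # numeric month, so locale-independent
axis(1, at = as.numeric(ticks), labels = tick_labels)
```

Using a numeric month in the labels sidesteps the question of which month abbreviations the locale provides.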

 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] zoo_1.4-1    chron_2.3-16

loaded via a namespace (and not attached):
[1] grid_2.6.1     lattice_0.17-2 tools_2.6.1


Denis

On 31 Jan 2008, at 09:46, Denis Chabot wrote:

> (I've put the R Mac list in cc because of the crashes I have  
> experienced trying some of the suggestions below)
>
> Hi Gabor and Prof Ripley,
>
> On 31 Jan 2008, at 02:11, Prof Brian Ripley wrote:
>
>> The output from sessionInfo() the posting guide asked for would  
>> have been very helpful here.
>
> You are right, sorry about that:
>
>
> > library(chron)
> > sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-apple-darwin8.10.1
>
> locale:
> fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] chron_2.3-16
>
>
>>
>>
>> I think the problem is likely to be that these are not standard French
>> abbreviations according to my systems.
>
> I was ready to blame Excel for the use of non-standard
> abbreviations, but I would have been wrong: "janv" seems to be
> a Mac OS X decision, from what I can see in my system settings. I am
> not sure what would be a bullet-proof authority on French
> abbreviations. My dictionary was of no help, but Wikipedia seems to
> endorse the Mac OS X and Windows use of "janv":
>
> <http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>
>
>> On Linux I get
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-jan-08"
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc-07"
>>
>> and on Windows
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-janv.-08"
>>
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc.-07"
>
> I tried this too:
> > format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
> > format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> I am lost here: since the OS uses "janv", why did the above give
> "jan"?
>
>>
>>
>> And yes, chron is US-centric and so only allows English names.
>>
>> Assuming you know exactly what is meant by 'French short format', I  
>> think the simplest thing to do is to set up a table by
>>
>> tr <- month.abb
>> names(tr)[1] <- c("janv")  # complete it
>>
>> x <- "9-janv-08"
>> x2 <- strsplit(x, "-")
>> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x,  
>> collapse="-")})
>> as.Date(x3, format = "%d-%b-%y")
>
> Thank you Prof Ripley, although I'll have to do my homework to fully  
> understand what is happening with the function you wrote.
>
> But I wonder why I cannot make this a Date object:
>
> > x <- "9-janv-08"
> > x2 <- strsplit(x, "-")
> > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x,  
> collapse="-")})
> > as.Date(x3, format = "%d-%b-%y")
> [1] "2008-01-09"
> > class(x3)
> [1] "character"
> > x4 <- as.Date(x3, format = "%d-%b-%y")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(x3, format = "%d-%b-%y")
> 3: as.Date(x3, format = "%d-%b-%y")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> The problem may be my system, as I get the same error when trying Gabor's
> suggestions (below).
>
> On 31 Jan 2008, at 00:21, Gabor Grothendieck wrote:
>> Suppose we have:
>>
>> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>> "16-janv-08", "18-janv-08")
>>
>> Try this (where we are assuming the just released chron 2.3-17):
>>
>> library(chron)
>> Sys.setlocale("LC_ALL", "French")
>> as.chron(as.Date(dd, "%d-%b-%y"))
>>
>> # or with chron 2.3-16 last line is replaced with:
>> chron(unclass(as.Date(dd, "%d-%b-%y")))
>>
>
> > library(chron)
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > Sys.setlocale("LC_ALL", "French")
> [1] ""
> Warning message:
> In Sys.setlocale("LC_ALL", "French") :
>  la requête OS pour spécifier la localisation à "French" n'a pas pu  
> être honorée
> > chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(dd, "%d-%b-%y")
> 3: as.Date(dd, "%d-%b-%y")
> 4: inherits(dates., "dates")
> 5: chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
>> If those don't work (the above didn't work on my Vista system, but this
>> is system dependent and might work on yours), then try this alternative:
>>
>>> library(chron)
>>> library(gsubfn)
>>> Sys.setlocale('LC_ALL','French')
>> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by  
>>> = "month"), "%b")
>>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d,  
>>> y, sep = "/"))
>>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07  
>> 12/28/07
>> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08  
>> 01/16/08
>> [17] 01/18/08
>
> Again, this Sys.setlocale call does not work for me and the use of  
> as.Date crashes my copy of R:
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by  
> = "month"), "%b")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, f)
> 2: fromchar(x)
> 3: as.Date.character("2000-01-01")
> 4: as.Date("2000-01-01")
> 5: seq(as.Date("2000-01-01"), length = 12, by = "month")
> 6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"),      
> "%b")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> However, if I replace that call by this, the rest of Gabor's  
> solution works.
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin",  
> "juil", "août", "sept", "oct", "nov", "déc")
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d,  
> y, sep = "/"))
> > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07  
> 12/28/07
> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08  
> 01/16/08
> [17] 01/18/08
>
> So thanks again. I will try to reinstall R on my computer and see if  
> I still get these errors.
>
>
> Denis
>
>>
>>
>>
>> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net>  
>> wrote:
>>> Hello R users,
>>>
>>> I have to import a file with one column containing dates written in
>>> French short format, such as:
>>>
>>>  7-déc-07
>>> 11-déc-07
>>> 14-déc-07
>>> 18-déc-07
>>> 21-déc-07
>>> 24-déc-07
>>> 26-déc-07
>>> 28-déc-07
>>> 31-déc-07
>>> 2-janv-08
>>> 4-janv-08
>>> 7-janv-08
>>> 9-janv-08
>>> 11-janv-08
>>> 14-janv-08
>>> 16-janv-08
>>> 18-janv-08
>>>
>>> There are other columns for other (numeric) variables in the data
>>> file. In my read.csv2 statement, I indicate that the date column must
>>> be imported "as.is" to keep it as character.
>>>
>>> I would like to transform this into a date object in R. So far I've
>>> used chron for my date and time needs, but I am willing to change if
>>> another object/package will ease the task of importing these dates.
>>>
>>> My reading of the chron help led me to believe that the only month
>>> names it understands are English ones.
>>>
>>> Are there other "formats" I can use with chron, or must I somehow edit
>>> this character variable to replace French month names with English ones
>>> (or numbers from 1 to 12)?
>>>
>>> Thanks in advance,
>>>
>>> Denis
>>> p.s. I read this in digest mode, so I'll get your replies faster if
>>> you cc to my email
>