[R] trouble with character \u00e2

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Oct 8 23:02:25 CEST 2008


On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:

> Thank you Professor:
>
> Here is an example using R2.8.0 beta.  It shows the encoding to be "latin1"

But you did not use file.choose or basename here.
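
The distinction matters because the same visible label can be held as two
different byte sequences.  A minimal sketch of the two forms compared in this
thread, assuming a current R build; the names label_utf8 and label_latin1 are
illustrative and do not appear in the original code:

```r
# Write the character as an escape so the example does not depend on the
# encoding of the script file itself.
label_utf8 <- "EXAMPLE 1 \u00e2 vs a.xls"
Encoding(label_utf8)        # declared encoding of the string
charToRaw(label_utf8)       # bytes 11-12 are c3 a2 (two-byte UTF-8 form)

# Re-encode to latin1: the character collapses to the single byte e2.
label_latin1 <- iconv(label_utf8, from = "UTF-8", to = "latin1")
Encoding(label_latin1)
charToRaw(label_latin1)     # byte 11 is e2

# Both print the same on screen, which is why charToRaw() is needed to
# tell them apart.
```

Running the same two checks on the value returned by basename(file.choose())
would show which of the two forms the file dialog hands back.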

> I installed my package, which requires rcom, RODBC, RColorBrewer, and
> survival.  I was unable to find rcom in the packages drop-down menu.  I
> tried mirrors USA(PA) and USA(PA2).  rcom does appear in the menu when run
> under R2.7.2, however.
>
> __________________________________________________
> R version 2.8.0 beta (2008-10-07 r46631)
> Copyright (C) 2008 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>  Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
>> ls()
> character(0)
>> file.label <- "EXAMPLE 1 â vs a.xls"
>> charToRaw(file.label)
> [1] 45 58 41 4d 50 4c 45 20 31 20 e2 20 76 73 20 61 2e 78 6c 73
>> Encoding(file.label)
> [1] "latin1"
>>
>
> Charles Annis, P.E.
>
> Charles.Annis at StatisticalEngineering.com
> phone: 561-352-9699
> eFax:  614-455-3265
> http://www.StatisticalEngineering.com
>
>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Wednesday, October 08, 2008 2:20 PM
> To: Charles Annis, P.E.
> Subject: RE: [R] trouble with character \u00e2
>
> On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:
>
>> Professor Ripley:
>>
>> Can I get the Windows binaries for R2.8.0 beta?  I looked earlier today
>> and found the tar files but not any binaries.
>> http://cran.r-project.org/src/base-prerelease/
>
> http://cran.r-project.org/bin/windows/base/rtest.html
>
> or look via Windows.
>
>
>>
>> Thank you.
>>
>> -----Original Message-----
>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>> Sent: Wednesday, October 08, 2008 1:10 PM
>> To: Charles Annis, P.E.
>> Cc: r-help at r-project.org
>> Subject: RE: [R] trouble with character \u00e2
>>
>> Can you please try a 2.8.0 beta build?  I have a suspicion as to what
>> might be going on, and it cannot happen there.
>>
>> If my guess is correct,
>>
>> nfile <- paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
>> savePlot(path.expand(nfile), type="jpg")
>>
>> may work for you in 2.7.2 (but as I said, I wasn't able to reproduce this
>> there).  The crucial bit is to use path.expand() on the final file name:
>> it will do nothing except ensure that the encoding is correct.
>>
>> On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:
>>
>>> Thank you Professor:
>>>
>>> After reading in the file this is what I see:
>>>> file.label
>>> [1] "EXAMPLE 1 â vs a.xls"
>>>
>>> charToRaw(file.label)
>>> [1] 45 58 41 4d 50 4c 45 20 31 20 c3 a2 20 76 73 20 61 2e 78 6c 73
>>>
>>>> Encoding(file.label)
>>> [1] "UTF-8"
>>>
>>>> Encoding(paste("diagnostic â vs a ", file.label, ".jpg", sep = ""))
>>> [1] "UTF-8"
>>>
>>> But look what happens after I run your example:
>>>> charToRaw(file.label)
>>> [1] 45 58 41 4d 50 4c 45 20 31 20 e2 20 76 73 20 61 2e 78 6c 73
>> (after)
>>> [1] 45 58 41 4d 50 4c 45 20 31 20 c3 a2 20 76 73 20 61 2e 78 6c 73
>> (before)
>>>
>>> The file label appears on the screen as it does above both times, but
>>> clearly charToRaw() shows that the coding for â has changed from the
>>> unexpected c3 a2, to the desired e2.
>>>
>>> After running your example I now observe
>>>> Encoding(file.label)
>>> [1] "latin1"
>>>
>>> Again, thank you for your help.
>>>
>>> -----Original Message-----
>>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>>> Sent: Wednesday, October 08, 2008 10:32 AM
>>> To: Charles Annis, P.E.
>>> Cc: r-help at r-project.org
>>> Subject: RE: [R] trouble with character \u00e2
>>>
>>> That also works without a hitch on my box, even in vanilla 2.7.2.  What
>>> exactly is in file.label as given by
>>>
>>> charToRaw(file.label)
>>> Encoding(file.label)
>>>
>>> ?  It should be in UTF-8, and so should
>>>
>>> paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
>>>
>>> It looks like the latter is not being treated as UTF-8 on your system
>>> (see what Encoding() says on its value).
>>>
>>> On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:
>>>
>>>> Thank you, Professor Ripley:
>>>>
>>>> Your example works for me too.
>>>>
>>>> plot(1:10, xlab = "a", ylab = "â")
>>>> file.label <- "EXAMPLE 1 â vs a.xls"
>>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>>          sep = ""), type = "jpg")
>>>>
>>>>
>>>> But, if I read in the file name using file.choose() I get the same
>>>> corrupted output filename
>>>> ( "diagnostic â vs a EXAMPLE 1 â vs a.xls.jpg" ) from my R routines.
>>>> However, if I paste that same file.label as it is printed to the screen
>>>> with my input routine, replacing your "foo" (as above), things work as
>>>> they should ( "diagnostic â vs a EXAMPLE 1 â vs a.xls.jpg" ).
>>>> Furthermore, if I again run my plotting routines after your example
>>>> (like that here, above), my routines no longer produce corrupted
>>>> filenames for the saved plots.
>>>>
>>>> The trouble seems to be caused by how I read in the file name.  Here is
>>>> a simple example that produces a corrupted file name for the saved plot:
>>>>
>>>> plot(1:10, xlab = "a", ylab = "â")
>>>> file.name <<- file.choose()
>>>>    print(file.name)
>>>>    file.label <<- basename(file.name)
>>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>>          sep = ""), type = "jpg")
>>>>
>>>>
>>>> The name of my input Excel file is "EXAMPLE 1 â vs a.xls"
>>>> The problem does not occur on R < R2.7.0
>>>>
>>>> I am running R2.7.2 on a 5 year old DELL box (2 Gig RAM, 3GHz Pentium 4)
>>>> with Windows XP, and have also experienced the problem on my Thinkpad
>>> laptop
>>>> (2 Gig, Intel Core2 Duo, 1.6GHz) running Vista.
>>>>
>>>> Thank you for your counsel.
>>>>
>>>> -----Original Message-----
>>>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>>>> Sent: Wednesday, October 08, 2008 4:39 AM
>>>> To: Charles Annis, P.E.
>>>> Cc: r-help at r-project.org
>>>> Subject: Re: [R] trouble with character \u00e2
>>>>
>>>> You haven't given any of the information asked for in the posting guide.
>>>> But, assuming this is Windows in CP1252 (as I believe that has been your
>>>> locale before), it works for me in current R.
>>>>
>>>> plot(1:10)
>>>> file.label <- "foo"
>>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>>          sep = ""), type = "jpg")
>>>>
>>>> If you are not using 2.8.0 beta or 2.7.2 patched, please check those.
>>>> This might be related to
>>>>
>>>>     o  file.path() did not work correctly in 2.7.0 if the components
>>>>        had different encodings.
>>>>
>>>> (NEWS for 2.7.1).
>>>>
>>>> On Sun, 5 Oct 2008, Charles Annis, P.E. wrote:
>>>>
>>>>> Greetings R-wizards:
>>>>>
>>>>> For historical reasons I have filenames with the character "â" and have
>>>>> successfully used "\u00e2" in its place, with the hoped-for result on
>>>>> all my on-screen plots.
>>>>>
>>>>> However since R2.7.0 I have trouble with savePlot() when the file name
>>>>> includes that character as it does in this example:
>>>>>
>>>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>>>        sep = ""), type = "jpg")
>>>>>
>>>>> In R2.6.0 and earlier, R would ignore a dot ('.') in the file name and
>>>>> supply the extension.  Since R2.7.0 if filename does include a dot,
>>>>> savePlot() will not add the file type as an extension.  Thus my
>>>>> apparent redundancy in the file name.
>>>>>
>>>>> The problem I have is that the example command will substitute an
>>>>> unwanted character for â, yet if I use "File, save as, jpg ... " and
>>>>> type in a name containing the troublesome character, R saves the
>>>>> on-screen plot with that character in the name with no complaints.
>>>>>
>>>>> I have tried using iconv() with no success, as can be seen with the
>>>>> following code:
>>>>>
>>>>> file.name <- paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
>>>>>
>>>>> iconv.List <- iconvlist()
>>>>>
>>>>> for(encoding in iconv.List) {
>>>>>   print(iconv(file.name, "", encoding, ""))
>>>>> }
>>>>>
>>>>> So, here's the question:  How can I save, with a non-interactive R
>>>>> command, an existing plot with the troublesome character in the file
>>>>> name?
>>>>>
>>>>> Thanks.
>
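
For the non-interactive save asked about at the start of the thread, one
option is to bring every component of the file name to a single encoding
before pasting the pieces together.  The following is only a sketch under
that assumption: to_utf8() is a hypothetical helper, not part of R or of the
original code, and the savePlot() call is left as a comment because it needs
an open plot on the Windows graphics device.

```r
# Hypothetical helper (an assumption for illustration): re-encode a single
# string to UTF-8, using its declared encoding as the source.
to_utf8 <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  # "" tells iconv() to assume the current locale for unmarked strings.
  from <- switch(Encoding(x), "UTF-8" = "UTF-8", "latin1" = "latin1", "")
  iconv(x, from = from, to = "UTF-8")
}

# Stand-in for a label that arrived marked as latin1.
file.label <- iconv("EXAMPLE 1 \u00e2 vs a.xls", from = "UTF-8", to = "latin1")

# Every non-ASCII component is now UTF-8 before paste() combines them.
nfile <- paste("diagnostic \u00e2 vs a ", to_utf8(file.label), ".jpg", sep = "")
Encoding(nfile)
charToRaw(nfile)

# With a plot on screen, the save suggested earlier in the thread would be:
# savePlot(path.expand(nfile), type = "jpg")
```

Checking Encoding(nfile) and charToRaw(nfile) just before the save shows
whether the name reaching savePlot() is in the form expected.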

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

