[R] unicode&pdf font problem RESOLVED

Ben Madin lists at remoteinformation.com.au
Tue Mar 1 14:50:29 CET 2011


Just to add to this thread (I've been looking through the archive) on the problem of displaying Unicode characters in PDF documents from R:

If you can use the Cairo package to create the pdf on a Mac, it seems quite happy to push Unicode characters through (whether a given character actually displays probably still depends on the font family).

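	library(Cairo)  # provides the Cairo() device
	# a mix of non-ASCII characters: U+2264, U+2268 and U+00FC
	probstring <- c(' \u2264 0.2', ' \u2268 0.4', ' \u00FC 0.6', ' \u2264 0.8', ' \u2264 1.0')
	# the 'outputs' directory must already exist
	Cairo(type='pdf', file='outputs/demo.pdf', width=9, height=12, units='in', bg='transparent')
	plot(1:5, 1:5, type='n')
	text(1:5, 1:5, probstring)
	dev.off()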
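
For anyone without the add-on package, the same idea can probably be tried with base R's cairo_pdf() device. This is only a sketch and I have not compared it against the device used above: it assumes your R build was compiled with cairo support, and the temporary file name, the three test strings and the made_pdf flag are just illustrative choices of mine. Whether each glyph actually shows up still depends on the fonts cairo can find.

```r
# Sketch only: base R's cairo_pdf() device instead of the add-on package.
# Assumes this R build has cairo support; names below are illustrative.
probstring <- c(" \u2264 0.2", " \u00FC 0.6", " \u0171 1.0")
outfile <- tempfile(fileext = ".pdf")
made_pdf <- FALSE

if (capabilities("cairo")) {
  cairo_pdf(outfile, width = 9, height = 12)   # size in inches
  plot(1:3, 1:3, type = "n")
  text(1:3, 1:3, probstring)
  dev.off()
  made_pdf <- file.exists(outfile)
} else {
  message("cairo is not available in this build of R")
}
```

If made_pdf is TRUE, open the file named in outfile and check by eye that the characters rendered.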

?Cairo suggests encoding is ignored if you do try to set it.
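
On the encoding question for the ordinary pdf() device discussed further down: as far as I understand it, pdf() converts the text to the single-byte encoding you name, so a character that has no slot in that encoding cannot come through. A rough way to check a character before plotting is to ask iconv() whether the conversion succeeds. This is a sketch under assumptions: the encoding names are the ones mentioned in this thread plus ISO-8859-1, the names iconv accepts are platform-dependent (see iconvlist()), and the variable names are mine.

```r
# Sketch: test whether U+0171 can be represented in a few single-byte encodings.
# iconv() returns NA when a character cannot be converted (default sub = NA).
# Encoding names are platform-dependent; check iconvlist() on your system.
ch <- "\u0171"
encodings <- c("ISO-8859-1", "CP1250", "CP1257")

representable <- vapply(
  encodings,
  function(enc) !is.na(iconv(ch, from = "UTF-8", to = enc)),
  logical(1)
)
print(representable)
```

A TRUE here only says the encoding has a code point for the character; the font chosen for the device still has to contain the glyph.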

cheers

Ben



On 14/01/2011, at 7:00 PM, r-help-request at r-project.org wrote:

> Date: Thu, 13 Jan 2011 10:47:09 -0500
> From: David Winsemius <dwinsemius at comcast.net>
> To: Sascha Vieweg <saschaview at gmail.com>
> Cc: r-help at r-project.org
> Subject: Re: [R] unicode&pdf font problem RESOLVED
> Message-ID: <74FA099F-4CE5-45C7-A05A-4A1DE6C87EC8 at comcast.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
> 
> 
> On Jan 13, 2011, at 10:41 AM, Sascha Vieweg wrote:
> 
>> I have many German umlauts in my data sets and encode them as UTF-8. When
>> it comes to plotting to pdf, I figured out that "CP1257" is a good
>> choice to output umlauts. I have no experience with "CP1250", but
>> maybe this small hint helps:
>> 
>> pdf(file=paste(sharepath, "/filename.pdf", sep=""), 9, 6, pointsize  
>> = 11, family = "Helvetica", encoding = "CP1257")
> 
> Just an FYI for the archives, that encoding fails with  
> pdf(encoding="CP1257") on a Mac when printing that target umlaut.
> 
> David.
>> 
>> *S*
>> 
>> On 11-01-13 16:17, tdenes at cogpsyphy.hu wrote:
>> 
>>> Date: Thu, 13 Jan 2011 16:17:04 +0100 (CET)
>>> From: tdenes at cogpsyphy.hu
>>> To: David Winsemius <dwinsemius at comcast.net>
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] unicode&pdf font problem RESOLVED
>>> 
>>> Dear David,
>>> 
>>> Thank you for your efforts. Inspired by your remarks, I started a new
>>> google-search and found this:
>>> http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters
>>> 
>>> SO HERE COMES THE SOLUTION (it works on both OSs):
>>> 
>>> pdf.options(encoding = "CP1250")
>>> pdf()
>>> plot(1,type="n")
>>> text(1,1,"\U0171")
>>> dev.off()
>>> 
>>> CP1250 should work for all Central-European languages:
>>> http://en.wikipedia.org/wiki/Windows-1250
>>> 
>>> 
>>> Thank you again,
>>> Denes
>>> 
>>> 
>>> 
>>>> 
>>>> On Jan 13, 2011, at 7:01 AM, tdenes at cogpsyphy.hu wrote:
>>>> 
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> Sorry for the missing specs, here they are:
>>>>>> version
>>>>>             _
>>>>> platform       i386-pc-mingw32
>>>>> arch           i386
>>>>> os             mingw32
>>>>> system         i386, mingw32
>>>>> status
>>>>> major          2
>>>>> minor          12.1
>>>>> year           2010
>>>>> month          12
>>>>> day            16
>>>>> svn rev        53855
>>>>> language       R
>>>>> version.string R version 2.12.1 (2010-12-16)
>>>>> 
>>>>> OS: Windows 7 (English version, 32 bit)
>>>>> 
>>>>> 
>>>> 
>>>> You are after what Adobe calls: udblacute; 0171.  It is recognized in
>>>> the list of Adobe glyphs:
>>>>> str(tools::Adobe_glyphs[371, ])
>>>> 'data.frame':	1 obs. of  2 variables:
>>>> $ adobe  : chr "udblacute"
>>>> $ unicode: chr "0171"
>>>> 
>>>> Consulted the help pages
>>>> points {graphics}
>>>> postscript {grDevices}
>>>> pdf {grDevices}
>>>> charsets {tools}
>>>> postscriptFonts {grDevices}
>>>> 
>>>> I have tried a variety of the pdfFonts installed on my Mac without
>>>> success. You can perhaps make a list of fonts on your machines with
>>>> names(pdfFonts()). Perhaps the range of fonts and the glyphs they
>>>> contain is different on your machines. I consistently get warning
>>>> messages saying there is a conversion failure:
>>>> 
>>>>> pdf("trial.pdf", family="Helvetica")
>>>> # also tried with font="Helvetica" but I think that is erroneous
>>>>> plot(1,type="n")
>>>>> text(1,1,"print \U0170\U0171")
>>>> Warning messages:
>>>> 1: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 2: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 3: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 4: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>
>>>> 5: In text.default(1, 1, "print Űű") :
>>>>  font metrics unknown for Unicode character U+0170
>>>> 6: In text.default(1, 1, "print Űű") :
>>>>  font metrics unknown for Unicode character U+0171
>>>> 7: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 8: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 9: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 10: In text.default(1, 1, "print Űű") :
>>>>  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>
>>>> 
>>>> And this is despite my system saying that \U0170 and \U0171 are present
>>>> in the Helvetica font. I also tried family=URWHelvetica, family=NimbusSan
>>>> and a bunch of others without success, but my last best hope after reading
>>>> the material in the "Families" section of help(postscript) had been
>>>> NimbusSan.  There is also information on that page regarding encodings
>>>> that appears to be very machine-specific.
>>>> 
>>>>> 
>>>>> Note that \U0171 != ??. See
>>>>> http://www.fileformat.info/info/unicode/char/171/index.htm
>>>>> Anyway, I have no problem with ű (~u") and other special Hungarian
>>>>> characters in my R-Gui. It is correctly displayed in the console, in
>>>>> plots, etc. The problem is with the pdf conversion.
>>>>> 
>>>>> The same holds for my Ubuntu Hardy Heron system*, with exactly the same
>>>>> error messages as reported in an earlier thread:
>>>>> http://www.mail-archive.com/r-help@r-project.org/msg89792.html
>>>>> As far as I know, Hershey fonts do not contain \U0171.
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Denes
>>>>> 
>>>>> * The specs of Ubuntu:
>>>>>> version
>>>>>             _
>>>>> platform       x86_64-pc-linux-gnu
>>>>> arch           x86_64
>>>>> os             linux-gnu
>>>>> system         x86_64, linux-gnu
>>>>> status
>>>>> major          2
>>>>> minor          12.0
>>>>> year           2010
>>>>> month          10
>>>>> day            15
>>>>> svn rev        53317
>>>>> language       R
>>>>> version.string R version 2.12.0 (2010-10-15)
>>>>> 
>>>>> 
>>>>>> 
>>>>>> On Jan 12, 2011, at 11:11 PM, tdenes at cogpsyphy.hu wrote:
>>>>>> 
>>>>>>> 
>>>>>>> Dear List,
>>>>>>> 
>>>>>>> I would like to print a plot into pdf. The problem is that the character
>>>>>>> \U0171 is replaced by a simple 'u' (i.e. without accents) in the pdf file.
>>>>>>> 
>>>>>>> Example:
>>>>>>> # this works fine
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>> 
>>>>>>> # this fails
>>>>>>> pdf("trial.pdf")
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>> dev.off()
>>>>>> 
>>>>>> Have you tried:
>>>>>> 
>>>>>> pdf("trial.pdf")
>>>>>> plot(1,type="n")
>>>>>> text(1,1,"print ??")
>>>>>> dev.off()
>>>>>> 
>>>>>> Your default screen fonts may not be the same as your default pdf
>>>>>> fonts. A lot depends on system specifics, none of which have you
>>>>>> provided.
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> I found an earlier post at
>>>>>>> http://www.mail-archive.com/r-help@r-project.org/msg65541.html, but it is
>>>>>>> too hard to understand at my R-level. Any help is appreciated.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> David Winsemius, MD
>>>>>> West Hartford, CT
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>> 
>>>> 
>>> 
>> 
>> -- 
>> Sascha Vieweg, saschaview at gmail.com
> 
> David Winsemius, MD
> West Hartford, CT


