[R] Russian language in R

Duncan Murdoch murdoch.duncan at gmail.com
Sat May 14 02:08:55 CEST 2011


On 13/05/2011 4:57 PM, lyolya wrote:
> Hello,
>
> I am experiencing a problem in reading a database in Russian. The problem
> appears when it comes to char variables. I have already tried changing the
> encoding, i.e.
>
> options(encoding="UTF-8")
>
> and
>
> options(encoding="KOI8-R")
>
> but every time there appear to be something unreadable in the data frame,
> like \x82\xa2\xae\xef etc.
>
> Could you please answer whether it is possible to operate with Russian
> strings in R, and, if yes, how to get to do that. Thank you, in advance.

Yes, it is possible.  You can test it using a text editor that supports 
Russian.  Just put

x <- " some Russian text "

into a file, then use source() to read that file.  Two outcomes are 
likely:

x will be defined to be a string holding Russian text, and it will 
display properly.

OR

it will be defined to be a string with lots of escapes or mis-displayed 
characters in it.  In the latter case, the problem is that R is assuming 
a different encoding than the one your text editor used to save the file.  
Calling l10n_info() will display information about what R is expecting.
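
As a rough sketch of that test (not the only way to do it): the code below 
writes a one-line script as UTF-8 bytes, checks what R expects, and then 
tells source() which encoding the file uses.  The temporary file and the 
sample word are made up for illustration, and the sketch assumes the 
session's locale can represent Cyrillic (for example a UTF-8 or Russian 
locale).

```r
# What does this R session expect?
info <- l10n_info()
print(info[["UTF-8"]])            # TRUE when running in a UTF-8 locale
print(Sys.getlocale("LC_CTYPE"))

# Hypothetical test script: one assignment containing Russian text,
# written as raw UTF-8 bytes so the current locale does not alter it.
script <- tempfile(fileext = ".R")
writeLines(enc2utf8('x <- "\u043f\u0440\u0438\u0432\u0435\u0442"'),
           script, useBytes = TRUE)

# Tell source() how the file is encoded instead of letting R assume
# the native encoding.
source(script, encoding = "UTF-8")
print(x)
print(Encoding(x))
```

If x displays properly only when the encoding argument is given, the file 
and R's default encoding disagree, which is the second outcome described 
above.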

If none of the above helps you to get your code working, then you'll 
have to give details on exactly what you're doing to read the file, and 
exactly what is in the file.
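
One possible way to collect those details is to look at the raw bytes of 
the file before R decodes anything, and to report the locale alongside 
them.  In this sketch the temporary file is only a stand-in for the real 
data file; it holds the four bytes quoted in the question.

```r
# Hypothetical stand-in for the real data file: the four bytes quoted
# in the question, written to a temporary file.
path <- tempfile()
writeBin(as.raw(c(0x82, 0xa2, 0xae, 0xef)), path)

# Read the first bytes of the file without decoding them.
bytes <- readBin(path, what = "raw", n = 64L)
print(bytes)

# Details worth including in a follow-up message.
print(Sys.getlocale("LC_CTYPE"))
print(R.version.string)
```

With a real file, replace path with the file's location; the raw bytes 
show which encoding the file was saved in, independent of what R assumes.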

Duncan Murdoch


