[R] How to replace German umlauts in strings?

Hofert Marius m_hofert at web.de
Thu Apr 10 18:03:53 CEST 2008


Dear R-users,

I have a file containing names of German students. These names
contain the characters "ä", "ö" or "ü" (German umlauts). I use
read.table() to read the file; let's assume the resulting table is
stored in a variable called "data". The names are in the first
column, i.e. data[,1]. If I simply display "data", I see that "ä"
is shown as \x8a, "ö" as \x9a, and so forth. I would like to replace
these characters with their LaTeX (or TeX) equivalents: \x8a should
become \"a, \x9a should become \"o, and so forth. I have tried a lot,
especially with gsub(), but the backslashes cause problems and I
cannot get it to work. The data.frame should then be written to a
file without destroying the replaced substrings (so that \"a really
appears in the file in place of \x8a). Is this possible?

Here is a minimal example:
data=data.frame(names=c("Bj\x9arn","S\x9aren"),points=c(10,20),stringsAsFactors=F)
data[1,1]=gsub('\\x9a','\\"o',data[1,1]) #does not work! (neither do similar calls)
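
For reference, one way the replacement might be made to work is to match the raw bytes literally instead of as a regular expression. The following is only a sketch under assumptions: the names "students", "umlaut_bytes", "tex_forms" and "out_file" are hypothetical, only the two bytes mentioned above (\x8a and \x9a) are handled, and the output goes to a temporary file. It relies on fixed = TRUE so that neither the pattern nor the replacement is interpreted, and on useBytes = TRUE so that the comparison is done byte by byte.

```r
# Hypothetical stand-in for the table read with read.table()
students <- data.frame(names = c("Bj\x9arn", "S\x9aren"),
                       points = c(10, 20),
                       stringsAsFactors = FALSE)

# Raw bytes as displayed by R, and the TeX forms that should replace them.
# Only the two bytes named in the question are covered here.
umlaut_bytes <- c("\x8a", "\x9a")
tex_forms    <- c('\\"a', '\\"o')   # each is the three characters \ " a / \ " o

for (i in seq_along(umlaut_bytes)) {
  # fixed = TRUE: pattern and replacement are taken literally (no regex,
  # no special meaning for the backslash in the replacement)
  # useBytes = TRUE: match byte-wise, whatever the current locale is
  students$names <- gsub(umlaut_bytes[i], tex_forms[i], students$names,
                         fixed = TRUE, useBytes = TRUE)
}

# quote = FALSE keeps write.table() from quoting the strings and from
# escaping the embedded double quote
out_file <- tempfile(fileext = ".txt")
write.table(students, file = out_file, quote = FALSE,
            row.names = FALSE, sep = "\t")
```

Printing students$names at the console shows the backslash and quote in escaped form; writeLines(students$names) or the written file shows the actual characters.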

Thanks in advance

Marius

