[R] factor levels with umlauts

Christian Bieli christian.bieli at unibas.ch
Fri Oct 6 10:22:22 CEST 2006


Hi all

I have to generate some test data for import into an SQL database. The 
database is meant for web-based data entry in a study taking place in a 
German-speaking region, so the factor levels of the variables include umlauts.
The variables in the data frame t.muster are generated like this, for example:

t.muster$screening <- rep("ausgefüllt",50)

and exported to a .csv file by:

write.table(t.muster,"MakeMuster041006/MusterDaten.csv",
    col.names=FALSE,row.names=FALSE,na="",sep=";")

After export, the factor level of t.muster$screening that contains an 
umlaut looks like this in the SQL database as well as in an Excel spreadsheet:

ausgefüllt

This looks like a conflict between encodings, but as far as I can tell my 
locale settings are correct. I also tried something like:
t.muster <- lapply(t.muster, iconv, "ISO8859-1", "ISO8859-15")

but it did not work.
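
For illustration, here is a minimal sketch of how the garbling could be reproduced and how text columns could be re-encoded. It assumes the strings reach write.table as UTF-8 (for example because the script file was saved as UTF-8), which the output above suggests but which is not confirmed. The variable names x, garbled, latin1, latin9 and is_chr are made up for this sketch, and "CP1252" is assumed to be a name the local iconv accepts.

```r
# "ausgefüllt" written with a \u escape and forced to UTF-8, so the
# example does not depend on how this file itself is saved.
x <- enc2utf8("ausgef\u00fcllt")

# Reading the UTF-8 bytes as if they were Windows-1252 reproduces the
# two-character garbling seen in the database and in Excel.
garbled <- iconv(x, from = "CP1252", to = "UTF-8")

# ISO8859-1 and ISO8859-15 use the same byte (0xFC) for u-umlaut, so
# converting between the two leaves the data unchanged.
latin1 <- iconv(x, from = "UTF-8", to = "ISO8859-1")
latin9 <- iconv(latin1, from = "ISO8859-1", to = "ISO8859-15")
unchanged <- identical(charToRaw(latin1), charToRaw(latin9))

# lapply() on a data frame returns a plain list. Assigning back into the
# selected columns keeps the data frame and only touches character columns.
t.muster <- data.frame(screening = rep(x, 3), stringsAsFactors = FALSE)
is_chr <- sapply(t.muster, is.character)
t.muster[is_chr] <- lapply(t.muster[is_chr], iconv,
                           from = "UTF-8", to = "CP1252")
```

If the assumption holds, the last step would give single-byte text that a Windows-1252 reader such as Excel should display correctly. If the strings are already in the native encoding, a conversion like this would not help and the problem would lie on the reading side.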

My locale settings are:
 > Sys.getlocale()
[1] "LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252"

and I am running R on:

 > R.version
               _                        
platform       i386-pc-mingw32          
arch           i386                     
os             mingw32                  
system         i386, mingw32            
status                                  
major          2                        
minor          3.1                      
year           2006                     
month          06                       
day            01                       
svn rev        38247                    
language       R                        
version.string Version 2.3.1 (2006-06-01)

I'd be glad if someone could help me out. Thanks in advance.
Christian


