[R] Reading Chinese Language (GB2312) Input

jgreenb1 greenberg.jon at gmail.com
Fri Oct 26 19:25:06 CEST 2012


I am trying to read a CSV file that contains Chinese text (GB2312-encoded).
The file should look like this:

userid,jobid,Title,companyid,industryids1
82497,1160,互联网产品经理,12
96429,658,企划经理(商业公司),24
14471,95,产品运营经理,25,6
14471,1708,产品营销高级经理,727,2
14471,1558,产品总监,611,4
14471,1777,产品总监,743,1
14471,1697,产品经理,725,234
14471,1716,度假产品总监 ,730,234
14471,1717,产品经理,730,5

But when I read the data in using read.csv(), it looks like this in the R
console:

  userid jobid                Title companyid industryids1
1  82497  1160       »¥ÁªÍø²úÆ·¾­Àí        12           NA
2  96429   658 Æó»®¾­Àí£¨ÉÌÒµ¹«Ë¾£©        24           NA
3  14471    95         ²úÆ·ÔËÓª¾­Àí        25            6
4  14471  1708     ²úÆ·ÓªÏú¸ß¼¶¾­Àí       727            2
5  14471  1558             ²úÆ·×Ü¼à       611            4
6  14471  1777             ²úÆ·×Ü¼à       743            1
7  14471  1697             ²úÆ·¾­Àí       725          234
8  14471  1716        ¶È¼Ù²úÆ·×Ü¼à        730          234
9  14471  1717             ²úÆ·¾­Àí       730            5

How can I read this in properly?

Session info:

R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
loaded via a namespace (and not attached):
[1] tools_2.14.1
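
For reference, here is a minimal sketch of one way this kind of file might be read, assuming the bytes on disk really are GB2312. It builds a one-row sample file so that it runs on its own; `path` and `jobs` are illustrative names, and the temporary file is a hypothetical stand-in for the real CSV. It reads the raw bytes, converts them to UTF-8 with iconv(), and parses the converted text. It also assumes a reasonably recent R (paste0() and the `text` argument may not exist in older versions) and an iconv build that knows the name "GB2312".

```r
# Build a small GB2312-encoded CSV so the sketch is self-contained.
# The \u escapes spell the job title from the last row of the sample data.
title_utf8 <- "\u4ea7\u54c1\u7ecf\u7406"
csv_utf8 <- paste0("userid,jobid,Title,companyid,industryids1\n",
                   "14471,1717,", title_utf8, ",730,5\n")
csv_gb <- iconv(csv_utf8, from = "UTF-8", to = "GB2312", toRaw = TRUE)[[1]]
path <- tempfile(fileext = ".csv")  # hypothetical stand-in for the real file
writeBin(csv_gb, path)

# Read the raw bytes, convert them from GB2312 to UTF-8, then parse the text.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
txt <- iconv(list(bytes), from = "GB2312", to = "UTF-8")
jobs <- read.csv(text = txt, stringsAsFactors = FALSE)

str(jobs)
```

A shorter route is read.csv(path, fileEncoding = "GB2312"), but that re-encodes into the session's native encoding, and a Windows-1252 locale like the one above cannot represent Chinese characters, so it may not be enough on its own here. Even with the text stored correctly, the console may still fall back to <U+xxxx> escapes when printing in a non-Chinese locale.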





