[R] iconv question: SQL Server 2005 to R

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Oct 9 11:16:08 CEST 2013


'Unicode' is not an encoding.  As the help for read.table says

fileEncoding: character string: if non-empty declares the encoding used
           on a file (not a connection) so the character data can be
           re-encoded.  See the ‘Encoding’ section of the help for
           ‘file’, the ‘R Data Import/Export Manual’ and ‘Note’.

The first of the cross references explains this.
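
As a minimal sketch of what that means in practice: Windows tools that 
report a file as 'Unicode' usually mean little-endian UTF-16, so the 
encoding can be declared when the file is read rather than converting 
with iconv() afterwards.  The encoding name "UTF-16LE" is an assumption 
about your file (I have not seen it), and the small file written here 
is only a stand-in with made-up rows so that the example is 
self-contained.

```r
# Stand-in for the real file: write a few rows as UTF-16LE.
# (Assumption: the SQL Server export is little-endian UTF-16.)
path <- tempfile(fileext = ".csv")
rows <- c("Date,StockID,Price",
          "2006-01-03,Stock1,2.53",
          "2006-01-03,Stock2,1.3275")
con <- file(path, open = "w", encoding = "UTF-16LE")
writeLines(rows, con)
close(con)

# Declare the encoding of the file when reading, so the character
# data are re-encoded to the native encoding on the way in.
testDF <- read.table(path, header = TRUE, sep = ",",
                     fileEncoding = "UTF-16LE")
str(testDF)
```

If "UTF-16LE" is not the right name for your file, "UCS-2LE" is the 
other value to try.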

On 09/10/2013 00:02, Ira Sharenow wrote:
> A colleague is sending me quite a few files that have been saved with MS
> SQL Server 2005. I am using R 2.15.1 on Windows 7.

See the posting guide: your R update is overdue as there have been 5 
releases since then.

> I am trying to read in the files using standard techniques. Although the
> file has a csv extension when I go to Excel or WordPad and do SAVE AS I
> see that it is Unicode Text. Notepad indicates that the encoding is
> Unicode. Right now I have to do a few things from within Excel (such as
> Text to Columns) and eventually save as a true csv file before I can
> read it into R and then use it.
>
> Is there an easy way to solve this from within R? I am also open to easy
> SQL Server 2005 solutions.
>
> I tried the following from within R.
>
> testDF = read.table("Info06.csv", header = TRUE, sep = ",")
>
>> testDF2 =  iconv(x = testDF, from = "Unicode", to = "")
>
> Error in iconv(x = testDF, from = "Unicode", to = "") :
>
> unsupported conversion from 'Unicode' to '' in codepage 1252
>
> # The next line did not produce an error message
>
>> testDF3 =  iconv(x = testDF, from = "UTF-8" , to = "")
>
>> testDF3[1:6,  1:3]
>
> Error in testDF3[1:6, 1:3] : incorrect number of dimensions
>
> # The next line did not produce an error message
>
>> testDF4 =  iconv(x = testDF, from = "macroman" , to = "")
>
>> testDF4[1:6,  1:3]
>
> Error in testDF4[1:6, 1:3] : incorrect number of dimensions
>
>>   Encoding(testDF3)
>
> [1] "unknown"
>
>>   Encoding(testDF4)
>
> [1] "unknown"
>
> This is the first few lines from WordPad
>
> Date,StockID,Price,MktCap,ADV,SectorID,Days,A1,std1,std2
>
> 2006-01-03 00:00:00.000, at Stock1,2.53,467108197.38,567381.144444444,4,133.14486997089,-0.0162107939626307,0.0346283580367959,0.0126471695454834
>
> 2006-01-03 00:00:00.000, at Stock2,1.3275,829803070.531114,6134778.93292,5,124.632223896458,0.071513138376339,0.0410694546850102,0.0172091268025929
>


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


