[Rd] iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)

peter dalgaard pdalgd at gmail.com
Thu Feb 25 10:49:35 CET 2016


Aim for 3.3.1 then? It's not like we have hordes of people demanding to have this fixed right here and now, or do we? 

(A practical problem is that the version control dynamics dictate that at this stage, commits to r-devel _will_ end up in 3.3.0 on April 14, unless backed out and then inserted in the new r-devel branch to be created on March 17.) 

- Peter


On 24 Feb 2016, at 21:49 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

> On 24/02/2016 11:16 AM, Duncan Murdoch wrote:
>> On 24/02/2016 9:55 AM, Mikko Korpela wrote:
>>>> 
[...]
>>> 
>>> That's unfortunate. I tested my tiny patch on Linux. I don't know what
>>> kind of additional changes would be needed to make this work on Windows.
>>> 
>> 
>> It looks like a big change is needed for a perfect solution:
>> 
>>   - Windows does the translation of \n to \r\n.  In the R code, Windows
>> is never told that the output is UTF-16LE, so it does an 8-bit translation.
>> 
>>   - Telling Windows that output is UTF-16LE looks hard:  we'd need to
>> convert the string to wide chars in R, then write it in wide chars.
>> This seems like a lot of work for a rare case.
>> 
>>   - It might be easier to do a hack:  if the user asks for "UTF-16LE",
>> then treat it internally as a text file but tell Windows it's a binary
>> file.  This means no \n to \r\n translation will be done by Windows.  If
>> the desired output file needs Windows line endings, the user would have
>> to specify sep="\r\n" in writeLines.
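
To make the first point concrete, here is a byte-level sketch. It is written in Python rather than R, and it only simulates the 8-bit \n to \r\n translation with a byte replace; it does not call the Windows C runtime, so treat it as an illustration of the effect described above, not as what R does internally.

```python
# Illustration only: simulate an 8-bit text-mode newline translation
# applied to bytes that are already UTF-16LE encoded.
text = "a\nb"
encoded = text.encode("utf-16-le")            # b'a\x00\n\x00b\x00'

# An 8-bit translation replaces each 0x0A byte with 0x0D 0x0A,
# unaware that every character occupies two bytes.
translated = encoded.replace(b"\n", b"\r\n")  # b'a\x00\r\n\x00b\x00'

# What a correct UTF-16LE file with Windows line endings would contain.
expected = "a\r\nb".encode("utf-16-le")       # b'a\x00\r\x00\n\x00b\x00'

# The translated stream has an odd number of bytes, so the 2-byte
# code units after the newline are misaligned.
misaligned = len(translated) % 2 == 1
```

The single inserted 0x0D byte shifts everything after it by one byte, which is why the result is not valid UTF-16LE rather than merely having the wrong line ending.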
> 
> A third possibility is to handle the insertion of the \r completely within R.  This will have the advantage of making it optional, so it would be a lot easier to write a Unix-style file on Windows.
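
A minimal sketch of that idea, again in Python and only as an analogy: the line terminator is added to the text before encoding, and the bytes are then written in binary mode so the platform does no translation of its own. The helper names encode_lines and write_lines and the eol parameter are hypothetical, invented for this example; they are not part of R or of any proposed patch.

```python
def encode_lines(lines, eol="\n", encoding="utf-16-le"):
    """Append an explicit terminator to each line, then encode the result."""
    return "".join(line + eol for line in lines).encode(encoding)


def write_lines(path, lines, eol="\n", encoding="utf-16-le"):
    """Write already-terminated, already-encoded lines without translation."""
    # Binary mode: no newline translation is performed by the platform.
    with open(path, "wb") as fh:
        fh.write(encode_lines(lines, eol, encoding))


# The caller chooses the line ending, independent of the operating system.
windows_style = encode_lines(["a", "b"], eol="\r\n")
unix_style = encode_lines(["a", "b"], eol="\n")
```

Because the terminator is chosen before encoding, each \r and \n becomes a full two-byte code unit, and writing a Unix-style file on Windows is just a matter of passing a different eol.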
> 
> I think either the first or the third possibility would take too much time for me to attempt before 3.3.0.  I'm not sure about the second one yet.
> 
> Duncan Murdoch

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
