[R] retaining characters in a csv file

Rolf Turner r.turner at auckland.ac.nz
Wed Sep 23 03:26:58 CEST 2015


On 23/09/15 11:19, peter dalgaard wrote:
>
>> On 23 Sep 2015, at 00:33 , Rolf Turner <r.turner at auckland.ac.nz> wrote:
>>
>
> [read.csv() doesn't distinguish "123.4" from 123.4]
>
>> IMHO this is a bug in read.csv().
>>
>
> Dunno about that:
>
> pd$ cat ~/tmp/junk.csv
> "1";1
> 2;"2"
> pd$ open !$
> open ~/tmp/junk.csv
>
> And lo and behold, Excel opens with
>
> 1 1
> 2 2
>
> and all cells numeric.

I would say that this phenomenon ("Excel does it") is *overwhelming* 
evidence that it is bad practice!!! :-)

> I don't think the CSV standard (if there is one...) specifies that
> quoted strings are necessarily text.

Duncan Murdoch has pointed out that this is definitely *not* the case.

> I think we have been here before, and found that even if we decide
> that it is a bug (or misfeature), it would be hard to change, because
> the modus operandi of read.* is to first read everything as character
> and _then_ see (in type.convert()) which entries can be converted to
> numeric, logical, etc.

As Arunkumar Srinivasan has pointed out, fread() from the data.table
package can handle this, so distinguishing quoted from unquoted fields
is *not impossible*.
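
For anyone following along, here is a minimal sketch of the two-step
behaviour Peter describes (read as character, then convert), using only
base R. The inline data and the column names x and y are invented for
illustration; the colClasses workaround at the end is one possible way
to keep the characters and is not something proposed earlier in the thread.

```r
# Toy input: column x holds quoted values, column y holds bare ones.
# (Data and column names are made up for this illustration.)
csv_text <- '"x","y"\n"123.4",123.4\n"5",5\n'

# Default behaviour: the quotes are stripped while reading, so by the
# time type.convert() runs both columns look the same and both end up
# numeric.
d1 <- read.csv(text = csv_text)
str(d1)

# The conversion step on its own: a character vector of digits becomes
# numeric.
converted <- type.convert(c("123.4", "5"), as.is = TRUE)
str(converted)

# One possible workaround: ask for character columns up front, which
# skips the conversion for every column (quoted or not).
d2 <- read.csv(text = csv_text, colClasses = "character")
str(d2)
```

Note that colClasses = "character" keeps *all* fields as character; it
does not recover the information about which fields were quoted in the
file.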

cheers,

Rolf

-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


