[Rd] On read.csv and write.csv

Fri Jul 2 00:47:09 CEST 2021

Dear Gabriel,

On 2021-07-01 6:29 p.m., Gabriel Becker wrote:
> On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison <S.Ellison using lgcgroup.com>
> wrote:
> 
>>
>> Please run the reproducible example provided.
>> When you do, you will see that write.csv writes an unnecessary empty
>> header field ("") over the row names column. This makes the number of
>> header fields equal to the number of columns _including_ row names. That
>> causes the original row names to be read as data by read.csv, following the
>> rule that the number of header fields determines whether row names are
>> present. read.csv  accordingly assumes that the former row names are
>> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
>> exists) and then adds new, default, row names _instead of the original row
>> names written by write.csv_.
>> That's not helpful.
>>
> 
> This depends on if you are reading the csv via R or something else, I would
> imagine. It not being "valid" CSV at all would likely cause some programs
> to choke entirely, I expect. I admit that's conjecture though, I don't have
> data on that one way or another.

In Excel, for example, opening a .csv file that lacks the empty initial
field in the first line misaligns the column names: each name shifts one
cell to the left and sits over the wrong column.
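
For concreteness, here is a minimal R sketch of the header behaviour under discussion. The data frame and the object names (df, f_csv, f_tab and so on) are made up for illustration and are not taken from Stephen's reproducible example. It compares the first line written by write.csv() with the one written by write.table() under its defaults, and shows how read.csv() treats each file.

```r
# Illustrative data frame with meaningful row names (made up for this sketch).
df <- data.frame(a = 1:3, b = c("x", "y", "z"),
                 row.names = c("r1", "r2", "r3"))

# write.csv() writes an empty header field over the row-names column.
f_csv <- tempfile(fileext = ".csv")
write.csv(df, f_csv)
header_csv <- readLines(f_csv, n = 1)    # the line: "","a","b"

# Default read.csv(): the header has as many fields as the data rows,
# so the old row names come back as a data column named "X".
as_data <- read.csv(f_csv)

# Pointing read.csv() at the row-names column restores them.
restored <- read.csv(f_csv, row.names = 1)

# write.table() with default col.names omits the empty field, so the
# header is one field short and read.csv() infers row names itself.
f_tab <- tempfile(fileext = ".csv")
write.table(df, f_tab, sep = ",")
header_tab <- readLines(f_tab, n = 1)    # the line: "a","b"
short_header <- read.csv(f_tab)

unlink(c(f_csv, f_tab))
```

In this sketch names(as_data) is "X", "a", "b" with default row names 1 to 3, while both restored and short_header carry the original row names r1 to r3. The second file is the layout that, as noted above, a spreadsheet would display with shifted column names.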

As others have pointed out, .csv files are meant as a sort of
least common denominator for data exchange, so following the standard
is probably a good idea.

Best,
  John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

> 
> ~G