[Rd] read.table produces extra rows when file contains extra columns on (PR#9128)

smyth at wehi.EDU.AU smyth at wehi.EDU.AU
Sun Aug 6 04:10:28 CEST 2006


Reading the following delimited file with read.csv() or read.table()

file1:
X,Y
1,2
2,4
3,6
4,8
5,10,,
6,12

produces a data.frame with 7 rows instead of 6, because the two extra (empty) fields on line 6 of
the file are pushed into a new row of the data.frame.  In other words, the extra columns on line 6
are interpreted as a second case on the same line.  This contradicts the help page ?read.table,
which states that cases correspond to lines.
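
As a rough way to spot such lines before reading the file, something like the following might help. It is only a sketch: the temporary file stands in for file1, the variable names are illustrative, and it assumes a plain comma-delimited file with no blank lines or quoted line breaks.

```r
# Recreate file1 in a temporary location (the path is illustrative).
f1 <- tempfile(fileext = ".csv")
writeLines(c("X,Y", "1,2", "2,4", "3,6", "4,8", "5,10,,", "6,12"), f1)

# Number of comma-separated fields found on each line of the file.
n_fields <- count.fields(f1, sep = ",")

# Lines of the file that have more fields than the header line.
wide_lines <- which(n_fields > n_fields[1])
wide_lines
```

For file1 this should single out line 6, the only line with more than two fields.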

A desirable behaviour might be to ignore the extra columns with a warning.  It would be nice,
though, to be consistent with the behaviour when reading the shorter file

file2:
X,Y
1,2
2,4,,
3,6

which currently produces an error.
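
To make the suggested behaviour concrete, here is a minimal user-level sketch of "ignore the extra columns with a warning". The helper read_csv_trim() is hypothetical and not part of R; it is not a proposal for how read.table should be changed internally. It assumes a header line, no blank lines, and no quoted line breaks.

```r
# Hypothetical helper, for illustration only (not part of R).
read_csv_trim <- function(path, sep = ",") {
  n <- count.fields(path, sep = sep)               # fields on each line
  header <- scan(path, what = "", sep = sep, nlines = 1, quiet = TRUE)
  extra <- which(n > length(header))               # lines wider than the header
  if (length(extra) > 0)
    warning("ignoring extra fields on line(s) ",
            paste(extra, collapse = ", "))
  # Read with enough columns for the widest line, one row per line ...
  wide <- read.table(path, sep = sep, skip = 1, header = FALSE,
                     fill = TRUE, blank.lines.skip = TRUE,
                     col.names = paste("V", seq_len(max(n)), sep = ""))
  # ... then keep only the columns named in the header.
  out <- wide[, seq_along(header), drop = FALSE]
  names(out) <- header
  out
}

# Usage on a temporary copy of file1 (the path is illustrative).
f1 <- tempfile(fileext = ".csv")
writeLines(c("X,Y", "1,2", "2,4", "3,6", "4,8", "5,10,,", "6,12"), f1)
res <- suppressWarnings(read_csv_trim(f1))
```

Because the column count is taken from the widest line rather than from the first few lines, the same call treats file1 and file2 alike.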

Gordon


> read.csv("file1.csv")
   X  Y
1  1  2
2  2  4
3  3  6
4  4  8
5  5 10
6 NA NA
7  6 12
> read.csv("file2.csv")
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
        more columns than column names
> sessionInfo()
Version 2.3.1 (2006-06-01)
i386-pc-mingw32

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"  "base"


