[R] Not all rows are being read-in

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Wed Mar 30 00:58:59 CEST 2011


Hello!

I have a tab-delimited .txt file (size 800MB) with about 3.4 million
rows and 41 columns. About 15 columns contain strings.
I tried to read it into R 2.12.2 on a laptop running Windows XP:
mydata<-read.delim(file="FileName.TXT",sep="\t")
R did not complain (!), and dim(mydata) returned 1692063 41.
I looked at the same file in two other programs, one of which was SPSS.
Both show that the file has 3,374,050 rows and 41 columns, and
rows 1692063 and 1692064 are in no way different from each other.
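
One way to narrow this down is to compare the number of physical lines
in the file with the number of rows read.delim returns, with and without
quote handling. The sketch below assumes (this is only a guess, not
something confirmed for this file) that stray double quotes in the
string columns are swallowing rows. The toy file and its contents are
made up for illustration.

```r
# Toy stand-in for FileName.TXT (made-up data): two fields contain a
# bare double quote, as in an inch mark.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tvalue",
             "1\talpha\t10",
             "2\t5\" pipe\t20",
             "3\tgamma\t30",
             "4\t6\" pipe\t40",
             "5\tepsilon\t50"), tmp)

# Fields per physical line when quote characters are treated as plain text.
fields <- count.fields(tmp, sep = "\t", quote = "")
n_lines <- length(fields)        # physical lines, header included
field_counts <- table(fields)    # a clean file shows a single field count

# Default call: read.delim uses quote = "\"", so a quoted field can
# run across line breaks and merge several lines into one row.
default_read <- read.delim(tmp, sep = "\t")

# Same call with quote handling switched off.
noquote_read <- read.delim(tmp, sep = "\t", quote = "")

cat("lines in file (incl. header):", n_lines, "\n")
cat("rows with default quoting:   ", nrow(default_read), "\n")
cat("rows with quote = \"\":        ", nrow(noquote_read), "\n")

unlink(tmp)
```

If the two row counts differ on the real file in the same way, adding
quote="" to the original read.delim call may be worth trying.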

Then I moved to a large desktop with much more memory, running 64-bit
Windows 7, and tried the same thing with 64-bit R 2.12.2. Again, I got
no complaints from R and got the same number of rows (1692063)!
Then I tried to read in more rows (into a second data frame), with
the same code but with skip=1692064. Repeating this keeps reading in
fewer and fewer rows (maybe because memory is full?).
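
As an alternative to repeated calls with skip, the file can be read in
pieces through one open connection, so each piece starts where the
previous one stopped. This is a minimal sketch on a made-up toy file,
not a tested recipe for the 800MB file. The chunk size and all object
names are hypothetical, and quote="" reflects the unconfirmed guess that
quote characters are involved.

```r
# Toy tab-delimited file standing in for FileName.TXT (made-up data).
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(id = 1:10,
                  label = paste("item", 1:10, sep = ""),
                  stringsAsFactors = FALSE)
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

con <- file(tmp, open = "r")

# The first line holds the column names.
header <- strsplit(readLines(con, n = 1), "\t", fixed = TRUE)[[1]]

chunk_rows <- 4          # hypothetical chunk size; far larger in practice
chunks <- list()
repeat {
  lines <- readLines(con, n = chunk_rows)   # next block of raw lines
  if (length(lines) == 0) break             # end of file reached
  tc <- textConnection(lines)
  chunk <- read.delim(tc, header = FALSE, sep = "\t", quote = "",
                      col.names = header, stringsAsFactors = FALSE)
  close(tc)
  chunks[[length(chunks) + 1]] <- chunk
}
close(con)

# Stack the pieces into one data frame.
whole <- do.call(rbind, chunks)
cat("chunks read:", length(chunks), " total rows:", nrow(whole), "\n")

unlink(tmp)
```

Because readLines splits on physical line breaks only, the total row
count can be checked against the line count reported by other programs.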

Any advice - any chance for me to read in the whole file?
Thank you very much!

-- 
Dimitri Liakhovitski
Ninah Consulting


