[R] Failing on reading a "slightly big" dataset

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jul 5 12:25:00 CEST 2004


By default, read.table interprets both quote characters (" and ') and the
comment character (#) in your file.  You do seem to have quotes (the first
data row starts with 'R'): are they always matched?

Please read through the Data Import/Export manual and check out all the 
options.
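
As a minimal sketch of what that might look like (not a tested fix for the
actual file), the call below turns off quote and comment interpretation via
the quote and comment.char arguments of read.table.  The first four lines of
the sample are copied from the post; the last two rows, the column names and
the temporary file are made up for illustration.

```r
# Sketch only. Rows 5 and 6 are hypothetical: one has an unmatched
# apostrophe, the other a '#', to mimic what may be in the real file.
lines <- c(
  "All figures are for the year 20030331|||",
  "Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)",
  "'R' INVEST PVT. LTD.|-510.45|0.18|-510.27",
  "20 MICRONS LTD.|60700|41200|101900",
  "EXAMPLE'S TRADING CO.|1|2|3",
  "EXAMPLE NO. #1 LTD.|4|5|6"
)
tmp <- tempfile(fileext = ".text")
writeLines(lines, tmp)

# Skip the title line, take the next line as the header, and treat
# quotes and '#' as ordinary characters.
firms <- read.table(tmp, sep = "|", skip = 1, header = TRUE,
                    quote = "", comment.char = "",
                    col.names = c("company", "gva_less_interest",
                                  "interest", "gva"),
                    stringsAsFactors = FALSE)

# count.fields() applies the same quote/comment rules as read.table(),
# so comparing the two results can help locate the offending lines.
n_default <- count.fields(tmp, sep = "|")
n_literal <- count.fields(tmp, sep = "|", quote = "", comment.char = "")
```

Lines where n_default is NA or differs from n_literal are the ones where a
quote or comment character is being interpreted.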

On Mon, 5 Jul 2004, Ajay Shah wrote:

> I have a file with 4 columns per line, all pipe delimited.
> 
> $ wc -l cmie_firm_data.text 
> 89325 cmie_firm_data.text
> $ ls -al cmie_firm_data.text 
> -rw-r--r--    1 ajayshah ajayshah  4415637 Jul  5 15:25 cmie_firm_data.text
> $ awk -F\| '(NF != 4)' cmie_firm_data.text 
> $ head cmie_firm_data.text 
> All figures are for the year 20030331|||
> Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)
> 'R' INVEST PVT. LTD.|-510.45|0.18|-510.27
> 20 MICRONS LTD.|60700|41200|101900
> 20TH CENTURY FOX CORPN. (INDIA) PVT. LTD.|50|0.33|50.33
> 21ST CENTURY AUTOMOTIVE INDIA LTD.|201.14|0.19|201.33
> 21ST CENTURY ENTERTAINMENT PVT. LTD.|-6.10|0|-6.10
> 21ST CENTURY EQUIPMENTS PVT. LTD.|-1599.53|1262.76|-336.77
> 21ST CENTURY INFRASTRUCTURE (INDIA) PVT. LTD.|140.48|1.74|142.22
> 21ST CENTURY PEST CONTROL SERVICES LTD.|50.21|7.13|57.34
> 
> When I try to read this into R, I get a mysterious error, and then it
> reads only 38,244 observations. Any idea what might be going wrong?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



