[R] Reading fixed column format

Gabor Grothendieck ggrothendieck at gmail.com
Wed Sep 13 08:01:43 CEST 2006


I know you would prefer a 100% R solution, but the Unix cut command
makes this really easy (a Windows version is available in tools.zip at
http://www.murdoch-sutherland.com/Rtools/). If you preprocessed the
file with cut, you could then read the reduced file with read.fwf.

For example, look how easy it was to cut this file down to about half
its width by extracting only columns 2-3 and 6-8:

C:\bin>type a.dat
123456789
123456789
123456789

C:\bin>cut -c2-3,6-8 a.dat
23678
23678
23678
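
The same idea applied to the three fields listed in the quoted message
below (STATE in columns 1-2, INTVID in 24-26, PSU in 30-39) might look
like the sketch that follows. The file names sample.dat and
sample_cut.dat and the record contents are made up for illustration;
the real records are 1283 characters wide. It is written for a Unix
shell, so the printf line would need adapting under the Windows prompt.

```shell
# Hypothetical sample: two 39-character records standing in for the
# real 1283-character ones. %21s and %3s pad with spaces so that
# STATE sits in columns 1-2, INTVID in 24-26 and PSU in 30-39.
printf 'AL%21s007%3s0000000042\n' '' '' >  sample.dat
printf 'TX%21s123%3s0000000777\n' '' '' >> sample.dat

# Keep only the three fields of interest. Each output line is then
# 15 characters wide, with fixed field widths 2, 3 and 10.
cut -c1-2,24-26,30-39 sample.dat > sample_cut.dat

cat sample_cut.dat
```

The reduced file should then be readable in R with read.fwf using
widths = c(2, 3, 10). read.fwf also accepts negative widths to skip
columns, so cut is a convenience for shrinking a large file first
rather than a requirement.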


On 9/13/06, Anupam Tyagi <AnupTyagi at yahoo.com> wrote:
> Barry Rowlingson <B.Rowlingson <at> lancaster.ac.uk> writes:
>
>
> > > None of these seem to read non-contiguous variables from columns; or
> > > maybe I am missing something. "read.fwf" is not meant for large
> > > files according to a post in the archives. Thanks for the pointers. I
> > > have read the R data input and output. Anupam.
> >
> >   First up, how 'large' is your 'large ASCII file'? How many rows and
> > columns?
>
> There are 356,112 records, 326 variables, fixed record length of 1283 positions.
> Zipped file is 42MB. There are no field (variable) separators (delimiters).
>
> >   Secondly, what are 'non-contiguous' variables?
>
> Variables that are not in adjoining positions in the file: reading them from the
> file would require skipping columns while reading. For example, below are the
> start positions of the first three variables I would like to read.
>
> StartingColumn  VariableName  FieldLength
> 1               STATE         2
> 24              INTVID        3
> 30              PSU           10
>
>
> >   Perhaps if you posted the first few lines and columns of the file then
> > we might get an idea of how to read it in.
>
> Because a record (row) of the file is 1283 columns, I would not like to post it
> here.
>
> Thank you for your response.
>
> Anupam.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


