[R] Using sqldf() to read in .fwf files

Doran, Harold HDoran at air.org
Mon Sep 15 21:23:40 CEST 2014


Thank you, Gabor. This has seemingly resolved the issue. Perhaps a quick follow up. Suppose I know that the 1st variable I am reading in is to be numeric and the second is character. Can that be specified in the substr() argument?

sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from fixed")

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com] 
Sent: Monday, September 15, 2014 12:42 PM
To: Doran, Harold
Cc: r-help at r-project.org
Subject: Re: [R] Using sqldf() to read in .fwf files

On Mon, Sep 15, 2014 at 12:09 PM, Doran, Harold <HDoran at air.org> wrote:
> I am learning to use sqldf() to read in very large fixed width files 
> that otherwise do not work efficiently with read.fwf. I found the 
> following example online and have worked with this in various ways to 
> read in the data
>
> cat("1 8.3
> 210.3
> 319.0
> 416.0
> 515.6
> 719.8
> ", file = "fixed")
>
> fixed <- file("fixed")
> sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from fixed")
>
> I then applied this to my real world data problem though it yields the following error message and I am not sure how to interpret this.
>
> dor <- file("dor")
>> sqldf("select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from dor")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>   line 1 did not have 6 elements
>
> Looking at my .fwf. data in a text editor shows the data are structured as I would expect. In fact, I can read in the first few lines of the file using read.fwf and the data are as I would expect after being read into R.
>

We want it to regard the entire line as one field so specify sep= as some character not in the file.

    attr(fixed, "file.format") <- list(sep = ";")


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


More information about the R-help mailing list