[R] sqldf file specification, non-ASCII

Gabor Grothendieck ggrothendieck at gmail.com
Thu Apr 3 19:57:40 CEST 2008


The Windows version of ffe is on SourceForge.

On Thu, Apr 3, 2008 at 1:29 PM, Peter Jepsen <PJ at dce.au.dk> wrote:
> Thank you for your help, Duncan and Gabor. Yes, I found a premature line
> feed on line 1562740, and I have corrected that error. The thing is, it
> takes me many, many hours to save the file, so I would like to confirm
> that there are no more errors further down the file. The ffe tool sounds
> perfect for this job, but it doesn't seem to be available for Windows.
> Is anybody out there aware of a similar Windows tool?
>
> Thank you again for your help.
> Peter.
>
>
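A minimal base-R sketch of that check, runnable on Windows: count.fields()
scans the file and reports the number of fields on each line without
loading the data into a data frame, so it can flag any remaining broken
rows. It assumes the comma-delimited hugedata.csv file discussed below and
may still take a while on a 3GB file.

## Count comma-separated fields on every line and flag lines that
## disagree with the header line (line 1).
n.fields  <- count.fields("hugedata.csv", sep = ",")
bad.lines <- which(n.fields != n.fields[1])
head(bad.lines)   # line numbers of any remaining malformed rows
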
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> Sent: 3. april 2008 17:08
> To: Peter Jepsen
> Subject: Re: [R] sqldf file specification, non-ASCII
>
> One other thing you could try would be to run it through
> ffe (fast file extractor), a free utility that you can
> find via Google. Use ffe's loose argument. It can find
> bad lines, and since it's not dependent on R it would give
> you an independent check.  Regards.
>
> On Thu, Apr 3, 2008 at 10:36 AM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
> > Hi. Can you try it with the first 100 lines, say, of the data, and
> > also try reading it with read.csv to double-check your arguments
> > (note that the sqldf arguments are similar to, but not entirely
> > identical to, read.csv's)? If it still gives this error, send me
> > that 100-line file and I will look at it tonight or tomorrow.
> > Regards.
> >
> >
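A rough sketch of that 100-line test, using the file name and query from
the message below; the name hugedata100.csv is just an illustrative
choice, and the sqldf call simply repeats the one reported in the thread.

library(sqldf)
## Copy the first 100 lines into a small test file.
writeLines(readLines("hugedata.csv", n = 100), "hugedata100.csv")
## First double-check the arguments with read.csv ...
chk <- read.csv("hugedata100.csv")
str(chk)
## ... then repeat the sqldf call from the thread on the small file.
f  <- file("hugedata100.csv")
DF <- sqldf("select * from f where C_OPR like 'KKA2%'",
            file.format = list(header = TRUE, row.names = FALSE))
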
> > On Thu, Apr 3, 2008 at 10:22 AM, Peter Jepsen <PJ at dce.au.dk> wrote:
> > > Dear R-Listers,
> > >
> > > I am a Windows user (R 2.6.2) using the development version of
> > > sqldf to try to read a 3GB file originally stored in .sas7bdat
> > > format. I convert it to comma-delimited ASCII format with
> > > StatTransfer before trying to import just the rows I need into R.
> > > The problem is that I get this error:
> > >
> > > > f <- file("hugedata.csv")
> > > > DF <- sqldf("select * from f where C_OPR like 'KKA2%'",
> > > file.format=list(header=T, row.names=F))
> > > Error in try({ :
> > >  RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740
> > >  expected 52 columns of data but found 19)
> > > Error in sqliteExecStatement(con, statement, bind.data) :
> > >  RS-DBI driver: (error in statement: no such table: f)
> > >
> > > Now, I know that my SAS-using colleagues are able to use this file
> > > with SAS, so I was wondering whether StatTransfer'ing it to the SAS
> > > XPORT format, which can be read with the 'read.xport' function in
> > > the 'foreign' package, would be a better approach. The problem is,
> > > I don't know how/whether I can do that at all with sqldf. I tried
> > > various ways like
> > > f <- file(read.xport("hugedata.xport"))
> > > but I consistently got an error message from the sqldf command. I
> > > don't recall the exact error message, unfortunately, but can anybody
> > > tell me whether it is at all possible to read in files in non-ASCII
> > > format without having to put them in R memory?
> > >
> > > Thank you for your assistance.
> > > Peter.
> > >
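One hedged note on the XPORT idea above, assuming the 'foreign' and
'sqldf' packages: read.xport() returns an ordinary data frame (or a list
of data frames), so sqldf can query it directly without any file()
wrapper, but at that point the whole file is already in R's memory, which
is exactly what the file-based approach was meant to avoid.

library(foreign)
library(sqldf)
## read.xport() loads the entire XPORT file into memory as a data frame.
xp <- read.xport("hugedata.xport")
## sqldf can then query the data frame directly, but the full data set
## is already held in R at this point.
DF <- sqldf("select * from xp where C_OPR like 'KKA2%'")
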
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>


