[R] [R-pkgs] New R package sqldf

Gabor Grothendieck ggrothendieck at gmail.com
Wed Aug 1 17:27:26 CEST 2007


On 8/1/07, davidr at rhotrading.com <davidr at rhotrading.com> wrote:
> Extremely cool and useful!!!!
>
> I immediately saw uses for it in some of the work I do.
> So went to look at it. I especially liked the examples with joins!
>
> I observed a few small bugs in the documentation:
>
> - The help references
> The sqldf home page <URL: http://code.google.com/p/batchfiles/>
> but it should be of course .../p/sqldf

Thanks.  Its fixed in the svn repository now.  Will be part of next
release.

>
> - Example 5 there didn't work for me. Maybe I don't have something
> loaded?
>
> > minSL <- 7
> > limit <- 3
> > fn$sqldf("select * from iris where Sepal_Length > $minSL limit
> $limit")
> Error: attempt to apply non-function

Works for me.  What versions are you using?

> library(sqldf)
> minSL <- 7
> limit <- 3
> fn$sqldf("select * from iris where Sepal_Length > $minSL limit $limit")
  Sepal_Length Sepal_Width Petal_Length Petal_Width   Species
1          7.1         3.0          5.9         2.1 virginica
2          7.6         3.0          6.6         2.1 virginica
3          7.3         2.9          6.3         1.8 virginica
> packageDescription("gsubfn")$Version
[1] "0.3-3"
> packageDescription("sqldf")$Version
[1] "0-1.1"
> packageDescription("DBI")$Version
[1] "0.2-3"
> packageDescription("RSQLite")$Version
[1] "0.5-5"
> R.version.string # Windows XP
[1] "R version 2.5.1 (2007-06-27)"
>

Also, not shown, but example(fn) works for me.

Other things to try are try it on a new R session and if that still
does not work try sourcing it from the repository first:

library(gsubfn)
source("http://gsubfn.googlecode.com/svn/trunk/R/fn.R")
fn$sqldf("select * from iris where Sepal_Length > $minSL limit $limit")

Let me know if any of these things work or not.

>
> None of the examples in ?fn work for me either, giving the same error
> message. The examples in gsubfn work as do the rest of the sqldf
> examples.

As mentioned, they all work for me.  Let me know what the result
is of trying the previous suggestions.

>
> I am on Windows XP with R version 2.5.1
>
> > version
>               _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          5.1
> year           2007
> month          06
> day            27
> svn rev        42083
> language       R
> version.string R version 2.5.1 (2007-06-27)
>
> David L. Reiner
> Rho Trading Securities, LLC
>
>
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor
> Grothendieck
> Sent: Tuesday, July 31, 2007 7:43 PM
> To: r-packages at stat.math.ethz.ch
> Subject: [R] [R-pkgs] New R package sqldf
>
> sqldf is an R package for running SQL select
> statements on one or more R data frames. It is
> optimized for convenience making it useful
> for ad hoc queries against R data frames.
>
> Given an SQL select statement whose tables
> are the names of R data frames it:
>
> - sets up the database (by default it transparently
>  sets up an in memory SQLite database using RSQLite;
>  however, MySQL via RMySQL, can be specified as an
>  alternative.  MySQL has not been tested.)
> - imports the data frames found in SQL select
>  statement into the database
> - runs the SQL select statement
> - outputs the result back to a data frame
> - uses a heuristic to assign the appropriate column
>  classes to the result
> - removes the database
>
> so that all the user has to do is issue a one line
> function call with one argument, the select
> statement.
>
> Here is an example which processes an SQL select
> statement whose functionality is similar to the R
> aggregate function.  Note that although the iris
> dataset (which is built into R) uses the name
> Sepal.Length the R database interface, DBI, converts
> that to Sepal_Length.  Just install the sqldf package
> from CRAN and type these two lines into R without
> the > prompts:
>
> > library(sqldf)
> > sqldf("select Species, avg(Sepal_Length) from iris group by Species")
>
>     Species avg(Sepal_Length)
> 1     setosa             5.006
> 2 versicolor             5.936
> 3  virginica             6.588
>
> As can be seen from the example, there is:
> - no database setup
> - no importing and exporting into the database
> - no coercing of the returned columns to the
>  appropriate class (in most cases)
>
> It can be used:
> - as an alternate syntax for data frame manipulation
> - learning SQL if you know R
> - learning R if you know SQL
>
> The sqldf package has a single function, sqldf.  More
> information is available by issuing the command ?sqldf
> from within R.  More examples and useful links are
> available at the sqldf home page:
>
> http://code.google.com/p/sqldf/
>
> _______________________________________________
> R-packages mailing list
> R-packages at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-packages
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list