[R] Passing date as parameter while retrieving data from database using dbGetQuery

Gabor Grothendieck ggrothendieck at gmail.com
Wed Feb 15 15:26:19 CET 2012


On Wed, Feb 15, 2012 at 7:24 AM, aajit75 <aajit75 at yahoo.co.in> wrote:
>
> Hi All,
> This might be simple question, I need to retrive data for modelling from the
> databases. Eveytime date values changes so I countnot fix date value in the
> code, it is required to pass as parameter.
> When I pass the date as parameter, it throws error.
> (ERROR: column "start_dt" does not exist  Position: 285)
> My script is as below, please guide me where am I going wrong?
> All parameters are passed correctly, when start_dt and end_dt are replaced
> by  '2010-11-01' and '2011-01-31' respectively in the query code works fine
> without any errors.
> #####################################################################
> db_driver <- mydir$db_driver
> db_jar_file <- mydir$db_jar_file
> db_server <- mydir$db_server
> db_server_lgn <- mydir$db_server_lgn
> db_server_pwd <- mydir$db_server_pwd
>
> library(RJDBC)
> .jinit(classpath="myClasses.jar", parameters="-Xmx4096m")
>
> drv <- JDBC(paste(db_driver,  sep = ""),
>           paste(db_jar_file,  sep = ""),
>           identifier.quote="`")
>
> conn <- dbConnect(drv, paste(db_server,  sep = ""),
>                      paste(db_server_lgn,  sep = ""),
>                      paste(db_server_pwd,  sep = ""))
>
> start_dt <- as.Date('2010-11-01',format="%Y-%m-%d")
> end_dt <- as.Date('2011-01-31',format="%Y-%m-%d")
>
> library(sqldf)
> target_population <- dbGetQuery(conn,
> "select distinct
> a.primary_customer_code as cust_id,
> a.primary_product_code,
> a.account_opening_date,
> b.l4_product_hierarchy_code,
> b.l5_product_hierarchy_code
> from account_dim a,
> product_dim b
> where a.primary_product_code=b.l5_product_hierarchy_code
> and a.account_opening_date between start_dt and end_dt")
> ####################################################################
>
> As it is not possible to reproduce error with the above code, I am providing
> sample example as below with sqldf function using dataframe.
>
> date_tm <- as.Date(c('2010-11-01', '2011-11-01','2010-12-01', '2011-01-01',
> '2011-02-01'))
> x1 <- c(1,2,3,4,5)
> x2 <- c(100,200,300,400,500)
>
> test_data <- data.frame(x1,x2,date_tm)
>
> test_data
>
> start_dt <- as.Date('2011-01-01',format="%Y-%m-%d") #Passing as parameter
> end_dt <- as.Date('2011-02-31',format="%Y-%m-%d") #Passing as parameter
>
> library(sqldf)
> new_data  <-
> sqldf("select *
> from test_data
> where date_tm  = start_dt")
> It shows similar error, when date is passed by parameter start_dt
> (error in statement: no such column: start_dt)

As the error message says you are attempting to reference column
start_dt but there is no such column in  your data frame.

Perhaps you want this if you are using sqlite:

library(sqldf)

s <- paste("select * from test_data where date_tm =", as.numeric(start_dt))
sqldf(s, verbose = TRUE)

or this with RPostgreSQL:

library(sqldf)
library(RPostgreSQL)

s <- sprintf(" select * from test_data where date_tm = '%s' ", start_dt)
sqldf(s, verbose = TRUE)

verbose = TRUE reports which database its using to enable one to
double check this. Which database sqldf uses depends on which R
database drivers are loaded (if RPostgreSQL is loaded as shown above
then it will use that or if not then it will use sqlite) or you can
force it with the sqldf drv= argument or via the "sqldf.driver"
option.  For more info on drv= and the options be sure to read ?sqldf
and for info on using PostgreSQL with sqldf be sure to read:
http://code.google.com/p/sqldf/#12._How_does_one_use_sqldf_with_PostgreSQL?

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



More information about the R-help mailing list