[R] RJDBC vs RMySQL vs ???
James W. MacDonald
jmacdon at med.umich.edu
Wed Jun 23 22:36:29 CEST 2010
Ralf B wrote:
> I am running a simple SQL SELECT statement that involves 50k+ data
> points using R and the RJDBC interface. I am facing very slow response
> times in both the RGUI and the R console. When running this SQL
> statement directly in a SQL client I have processing times that are a
> lot lot faster (which means that the SQL statement itself is not the
> problem).
> Did any of you compare RJDBC vs RMySQL or is there a better, more
> efficient way to extract large data from databases using R? Would you
> recommend dumping data out completely into flat files and working with
> flat files instead? I expected that this would not be such a problem
> given that businesses maintain their data in DBs and R is supposed to
> be good at shifting around data. Am I doing something wrong?
Well, if you don't show people what you have done, how can anybody tell
if you are doing something wrong or not?
I have no experience with RJDBC, so cannot say anything about that.
However, I have always found RMySQL to be speedy enough. As an example:
> library(RMySQL)
Loading required package: DBI
> con <- dbConnect("MySQL", host="genome-mysql.cse.ucsc.edu", user =
"genome", dbname = "hg18")
> system.time(a <- dbGetQuery(con, "select name, chromEnd from snp129
where chrom='chr1' and chromStart between 1 and 1e8;"))
   user  system elapsed
   7.95    0.06   38.59
> dim(a)
[1] 508676      2
So 40 seconds to get half a million records. Since this is via the
internet, I have to imagine things would be much faster querying a local DB.
But then you never say what constitutes 'slow' for you, so maybe this is
slow as well?
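
If you want to put a number on 'slow', one option is to time the same query through each interface and compare rows per second. What follows is only a minimal sketch of that idea, not something from my own setup: time_query is a hypothetical helper name, and the in-memory SQLite table (which assumes the RSQLite package is installed) is just a stand-in so the code runs without a database server. The helper uses only DBI generics, so in principle the same call should work on a connection opened through RMySQL or RJDBC.

```r
library(DBI)

# Hypothetical helper: run a query, time it, and report throughput.
# It only uses DBI generics, so any DBI-compliant connection should do.
time_query <- function(con, sql) {
  timing <- system.time(result <- dbGetQuery(con, sql))
  elapsed <- timing[["elapsed"]]
  list(
    rows = nrow(result),
    elapsed = elapsed,
    # Guard against a zero elapsed time on very fast queries.
    rows_per_sec = if (elapsed > 0) nrow(result) / elapsed else NA_real_
  )
}

# Stand-in database (assumption: RSQLite is installed); swap in your own
# connection to compare interfaces against the same server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "snp",
             data.frame(name = paste0("rs", 1:50000), chromEnd = 1:50000))

stats <- time_query(con,
  "select name, chromEnd from snp where chromEnd between 1 and 40000")
print(stats)

dbDisconnect(con)
```

Running the identical SQL through both connections gives two figures that can be compared directly, which is more useful to people on the list than 'slow'.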
James W. MacDonald, M.S.
University of Michigan
Department of Human Genetics
1241 E. Catherine St.
Ann Arbor MI 48109-5618