[R] R & MySQL (Databases)

Santosh Srinivas santosh.srinivas at gmail.com
Sat Oct 16 06:21:50 CEST 2010


Dear R-helpers,

Considering that a substantial part of analysis is related to data
manipulation, I'm wondering whether I should do the basic data handling in
a database server (currently I have the data in a .txt file).
For this purpose, I am planning to use MySQL. Is MySQL a good way to go
about it? Are there any anticipated problems that I need to be aware of?

Considering that many users here work with large datasets, do you typically
store the data in a database and query the relevant portions for your
analysis? Does it speed up the entire process? Is it neater to do things in
a database? (For example, errors could be caught at the data import stage
by conditions built into the data definition in the database, as opposed
to discovering that something is wrong only when the output of the
analysis in R looks off.)
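
To make the import-stage idea concrete, here is a minimal sketch of the kind
of check I have in mind. It is only an illustration under assumptions: it
uses Python's standard sqlite3 module with an in-memory database rather than
MySQL or R, and the table name `prices`, its columns, and the sample rows are
all made up.

```python
import sqlite3

# In-memory database, so nothing is written to disk (illustration only).
conn = sqlite3.connect(":memory:")

# Hypothetical table: the NOT NULL and CHECK rules describe a valid row.
conn.execute(
    """
    CREATE TABLE prices (
        symbol     TEXT    NOT NULL,
        trade_date TEXT    NOT NULL,
        close      REAL    NOT NULL CHECK (close > 0),
        volume     INTEGER NOT NULL CHECK (volume >= 0)
    )
    """
)

# Made-up rows standing in for lines parsed from the .txt file.
rows = [
    ("AAA", "2010-10-14", 101.5, 12000),
    ("AAA", "2010-10-15", -1.0, 9000),   # negative price: violates CHECK
    ("BBB", "2010-10-15", 55.25, 3000),
    ("BBB", "2010-10-16", 56.0, None),   # missing volume: violates NOT NULL
]

# Insert row by row so that bad rows are collected at import time
# instead of surfacing later in the analysis.
rejected = []
for row in rows:
    try:
        conn.execute("INSERT INTO prices VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as err:
        rejected.append((row, str(err)))
conn.commit()

# Pull only the portion needed for one analysis.
subset = conn.execute(
    "SELECT trade_date, close FROM prices WHERE symbol = ? ORDER BY trade_date",
    ("AAA",),
).fetchall()
```

After this runs, `rejected` holds the two bad rows together with the
database's error message, and `subset` holds only the valid rows for the one
symbol requested.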

How does this compare with using the SQLite, indexing, etc. capabilities
available from within R itself? Is performance better with a database
backend (especially for simple but large datasets)?
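
As a rough sketch of what I mean by indexing, the snippet below asks the
database how it would run the same query before and after an index is
created. Again this is an assumption-laden illustration: it uses Python's
standard sqlite3 module (not MySQL or R), the table `prices`, the index name
`idx_prices_symbol`, and the synthetic data are hypothetical, and it shows
the query plan only, not any timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (symbol TEXT, trade_date TEXT, close REAL)")

# Synthetic data: 200 made-up symbols with 50 rows each.
rows = [
    (f"SYM{s:03d}", f"day{d:02d}", 100.0 + s + d * 0.1)
    for s in range(200)
    for d in range(50)
]
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)

query = "SELECT trade_date, close FROM prices WHERE symbol = ?"


def plan(sql, params):
    """Return SQLite's description of how it would execute the query."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in cursor)


plan_before = plan(query, ("SYM042",))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_prices_symbol ON prices (symbol)")
plan_after = plan(query, ("SYM042",))    # planner can now use the index

result = conn.execute(query, ("SYM042",)).fetchall()
```

Comparing `plan_before` with `plan_after` shows whether the lookup by symbol
goes through the index instead of scanning every row; the rows returned are
the same either way.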

The financial applications that I am thinking of are not exactly real-time,
but quick response and fast performance would definitely help.

As an aside, I want to move things to a cloud environment at some point,
simply because it will be easier and cheaper to deliver.

This is kind of an open question, but any input will help.


