Re: [R] How to sample database

Daniel Hoppe daniel.hoppe at em.uni-karlsruhe.de
Wed Nov 13 19:45:27 CET 2002


Hi Rio,

where exactly do you need help: accessing the database or drawing the
random sample? For picking a random row, something like
"ceiling(runif(1, 0, rows))" might do to get the row-id.

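As a minimal sketch of the row-id idea, assuming the ids simply run from 1
to 2400 (the figures come from the original question; the variable names are
made up for illustration): repeated runif() draws are independent and can
return the same id twice, whereas sample() draws without replacement by
default.

```r
rows <- 2400   # number of patients, taken from the original question
n    <- 90     # desired sample size

set.seed(1)    # only to make this sketch reproducible

# ceiling(runif(...)) draws each id independently, so ids may repeat
ids.runif <- ceiling(runif(n, 0, rows))

# sample() without replacement returns n distinct ids between 1 and rows
ids <- sample(rows, n)
```

If the 90 subjects have to be distinct, the sample() variant is probably the
safer choice.
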
This snippet loads some rows from a MySQL database (if some R-guru
feels that this snippet is poor/nonsense/... please let me know, I am eager
to learn more ;-)):

library(DBI)
library(RMySQL)  # provides the "MySQL" driver

# Loads the order counts of one cohort into a periods x customers matrix.
# pnbd.cohort.size() is a helper defined elsewhere (not shown here).
pnbd.cohort.loadFromDatabase <- function(periods, description)
{
  mgr <- dbDriver("MySQL")
  con <- dbConnect(mgr, dbname="clv")
  on.exit(dbDisconnect(con))  # close the connection even if stop() is hit
  customerCount <- pnbd.cohort.size(description)
  if (customerCount == 0)
  {
    stop("cohort not found!")
  }
  aggregation <- matrix(0, nrow=periods, ncol=customerCount)
  for (i in seq_len(periods))
  {
    # one query per period; it must return customerCount rows
    rs <- dbSendQuery(con, paste(sep="",
      "select orderCount from pnbd_aggregation where period = ", i,
      " and description=\"", description, "\""))
    purchasesPerCustomerFrame <- fetch(rs, n=-1)
    aggregation[i, 1:customerCount] <- purchasesPerCustomerFrame[[1]]
    dbClearResult(rs)
  }

  cohort <- list(aggregation = aggregation, description = description)
  class(cohort) <- "pnbd.cohort"
  cohort
}

The relevant packages in this case are DBI plus the package supporting your
database, in my case RMySQL, but there are others as well. If memory is not
an issue, you could load all your 2400 records and then pick a sample in R.
If that's not possible and your records are numbered, or your database
supports some kind of row-id concept, you could generate the row-ids in R
and then hit the database once for each row-id (although that might well be
slower than loading all 2400 records with one statement).
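
Both options could look roughly like this. It is only a sketch: the data
frame stands in for the result of a single "select" over all records, and
the table name "patients" and the columns "id" and "age" are assumptions,
not something from the original question. The second option collects the
generated ids into one "in (...)" clause, so the database is hit once
rather than once per row-id.

```r
# Stand-in for all 2400 records loaded with one statement;
# table and column names are hypothetical.
patients <- data.frame(id = 1:2400,
                       age = rep(c(40, 55, 70), length.out = 2400))

# Option 1: everything is in memory, so sample 90 row indices directly.
picked <- patients[sample(nrow(patients), 90), ]

# Option 2: generate the row-ids in R and build a single statement
# that fetches exactly those rows.
ids <- sort(sample(2400, 90))
sql <- paste0("select * from patients where id in (",
              paste(ids, collapse = ", "), ")")
# With an open DBI connection "con": picked <- dbGetQuery(con, sql)
```

With only 2400 records the first option is most likely the simpler one.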

Best Regards,

Daniel


-----Original Message-----
From: owner-r-help at stat.math.ethz.ch
[mailto:owner-r-help at stat.math.ethz.ch] On Behalf Of Bernardo Rangel Tura
Sent: Wednesday, 13 November 2002 18:04
To: r-help at stat.math.ethz.ch
Subject: [R] How to sample database




Dear R-masters!

I have a database with 2400 patients and I need to draw a sample of 90
subjects.
How do I make a random sample from this database with R?

Thanks in advance

Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro Brazil

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
