[R] Suggestion for big files [was: Re: A comment about R:]

hadley wickham h.wickham at gmail.com
Fri Jan 6 08:28:27 CET 2006


> Selecting a sample is easy.  Yet, I'm not aware of any SQL device for
> easily selecting a _random_ sample of the records of a given table.  On
> the other hand, I'm no SQL specialist, others might know better.

There are a number of such devices, though they tend to be specific to
each SQL variant.  Try googling for "select random rows mysql", "select
random rows pgsql", etc.
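
As a rough illustration of one such device, here is a minimal sketch
using SQLite through Python's standard sqlite3 module.  The table name
"records", its columns, and the helper random_sample() are made up for
the example.  SQLite and PostgreSQL spell the function RANDOM(), while
MySQL uses RAND().  Note that ORDER BY RANDOM() has to sort the whole
table, so it may be slow on really big files.

```python
import sqlite3


def random_sample(conn, table, n):
    """Return n rows drawn at random (without replacement) from `table`."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    query = f"SELECT * FROM {quoted} ORDER BY RANDOM() LIMIT ?"
    return conn.execute(query, (n,)).fetchall()


# Hypothetical demo data: 1000 records in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO records (id, value) VALUES (?, ?)",
    [(i, i * 0.5) for i in range(1, 1001)],
)

sample = random_sample(conn, "records", 10)
print(len(sample))
```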

Another possibility is to generate a large table of randomly
distributed ids and then use that (with randomly generated limits) to
select the appropriate number of records.
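
One possible reading of this idea, sketched with SQLite and Python's
sqlite3 module: store the record ids in shuffled order next to a
sequential position, then pick a random starting position and take a
contiguous window of the required size.  The names "records",
"shuffled_ids", build_shuffled_ids() and sample_via_ids() are all
assumptions made for the example, not an established recipe.

```python
import random
import sqlite3


def build_shuffled_ids(conn, table, id_col="id", seed=None):
    """Create a lookup table mapping positions 0..N-1 to shuffled record ids."""
    rng = random.Random(seed)
    ids = [row[0] for row in conn.execute(f'SELECT "{id_col}" FROM "{table}"')]
    rng.shuffle(ids)
    conn.execute("DROP TABLE IF EXISTS shuffled_ids")
    conn.execute(
        "CREATE TABLE shuffled_ids (pos INTEGER PRIMARY KEY, record_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO shuffled_ids (pos, record_id) VALUES (?, ?)",
        enumerate(ids),
    )
    return len(ids)


def sample_via_ids(conn, table, total, n, id_col="id", seed=None):
    """Select n records through a randomly placed window on shuffled_ids."""
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of records")
    rng = random.Random(seed)
    start = rng.randint(0, total - n)  # randomly generated lower limit
    query = (
        f'SELECT t.* FROM "{table}" AS t '
        f'JOIN shuffled_ids AS s ON t."{id_col}" = s.record_id '
        "WHERE s.pos >= ? AND s.pos < ?"
    )
    return conn.execute(query, (start, start + n)).fetchall()


# Hypothetical demo data: 1000 records in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO records (id, value) VALUES (?, ?)",
    [(i, i * 0.5) for i in range(1, 1001)],
)

total = build_shuffled_ids(conn, "records")
rows = sample_via_ids(conn, "records", total, 25)
print(len(rows))
```

The shuffle is paid for once, and each later sample is a cheap range
lookup.  Samples drawn from the same shuffled table can overlap, so
they are not independent of one another; rebuilding the id table gives
a fresh ordering.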

Hadley



