[R] Reading data into R
    Gabor Grothendieck 
    ggrothendieck at gmail.com
       
    Thu Jan  3 15:10:03 CET 2008
    
    
  
On Jan 3, 2008 9:00 AM, BEP <perronbe at gmail.com> wrote:
> Hello all,
>
> I am trying to read a very large data set into R, and I have no interest in
> reviving my SAS skills.  To do this, I will need to drop unwanted variables
> given the size of the data file.  The most common strategy seems to be
> subsetting the data after it is read into R.  Unfortunately, given the size
> of the data set, I can't get the file read and then subsequently do the
> subset procedure.  I would be appreciative of help on the following:
>
> 1.  What are the possibilities of reading in just a small set of variables
> during the <read.table> statement (or another 'read' statement)?  That is,
> is it possible to specify just the variables that I want to keep?
read.table can skip columns.  Set the relevant components of colClasses
to "NULL" (the character string, not the NULL object); those columns are
then dropped as the file is read.
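
A minimal sketch of the colClasses approach.  The file, its column
names and its contents are made up for illustration; only read.table
and its colClasses argument come from the advice above.

```r
# Write a small demo file (hypothetical data standing in for the large file).
demo_file <- tempfile(fileext = ".csv")
demo <- data.frame(id    = 1:5,
                   age   = c(34, 51, 27, 45, 62),
                   notes = letters[1:5],
                   score = c(1.5, 2.5, 3.5, 4.5, 5.5))
write.csv(demo, demo_file, row.names = FALSE)

# One colClasses entry per column in the file, in file order.
# The string "NULL" skips a column; a class name (or NA) keeps it.
keep <- read.table(demo_file, header = TRUE, sep = ",",
                   colClasses = c("integer", "NULL", "NULL", "numeric"))

str(keep)   # only id and score are read
```

For a file with many columns it is usually easier to start from
rep("NULL", n) and then overwrite the few positions you want to keep
with a class name or NA.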
>
> 2.  Can I randomly select a set of observations during the 'read' statement?
>
>
> I have searched various R resources for this information, so if I am simply
> overlooking a key resource on this issue, pointing that out to me would be
> greatly appreciated.
>
The development version of sqldf can do all of the above (i.e. read in a
subset of columns, a subset of rows or a random subset of rows) subject to
certain limitations on the input format.  See Example 6 on the home page:
   http://sqldf.googlecode.com
readTable in the R.utils package can also read in a subset of rows and columns.
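
A rough sketch of the sqldf route, assuming a released version of
sqldf that provides read.csv.sql (this may differ from the development
version referred to above, so check the package documentation).  The
demo file and its column names are hypothetical.

```r
library(sqldf)

# Write a small demo file (hypothetical data standing in for the large file).
# quote = FALSE keeps the header and fields free of quote characters,
# which read.csv.sql does not strip by default.
demo_file <- tempfile(fileext = ".csv")
demo <- data.frame(id = 1:100, score = (1:100) / 10, notes = "x")
write.csv(demo, demo_file, row.names = FALSE, quote = FALSE)

# Inside the SQL statement the input file is referred to as the table "file".
# The select list picks the columns; "order by random() limit 10" asks
# SQLite for 10 rows in random order, i.e. a random subset of rows.
sampled <- read.csv.sql(demo_file,
                        sql = "select id, score from file order by random() limit 10")

str(sampled)
```

The file is loaded into a temporary SQLite database and only the result
of the query is returned to R, so the full data set does not have to
fit in R's memory.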
    
    