[R] reading in a subset of a large data set

jim holtman jholtman at gmail.com
Fri Jul 11 18:58:06 CEST 2008


If the data you want is contiguous, then just use the 'skip' argument
of read.table to skip the records before it and 'nrows' to read the
number of records you want.
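
As a rough sketch of the contiguous case (the file and its contents
below are made up purely for illustration), 'skip' and 'nrows' can be
combined like this. Since the header line is skipped as well, the column
names are read separately first:

```r
# Hypothetical example file: a header line followed by 10 records.
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:10, chromosome = rep(c("1", "2"), each = 5)),
            path, row.names = FALSE, quote = FALSE)

# Read only the header line plus one record to recover the column names.
col_names <- names(read.table(path, header = TRUE, nrows = 1))

# Skip the header line and the first 3 records, then read the next 4.
block <- read.table(path, skip = 1 + 3, nrows = 4, col.names = col_names)
print(block)
```

Supplying 'nrows' also keeps read.table from allocating memory for the
whole file.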

If you want to select a random sample, then check out
http://article.gmane.org/gmane.comp.lang.r.general/78318/match=random+read

In your case where you want to conditionally read based on values,
then you may have to read in a subset, select the records you want and
then continue reading the file.  At the end, you can reconstruct the
data into a single data frame.
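
One possible way to do that is sketched below. It is only an
illustration, not a built-in feature: 'read_subset', 'keep' and
'chunk_size' are names invented here, and the sketch assumes a
whitespace-delimited file with a header line. It reads from an open
connection so that each call to read.table continues where the previous
one stopped.

```r
# Hypothetical helper: read 'path' in chunks of 'chunk_size' records and
# keep only the records for which keep(chunk) is TRUE.
read_subset <- function(path, keep, chunk_size = 10000, ...) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first read consumes the header line and the first chunk.
  chunk <- read.table(con, header = TRUE, nrows = chunk_size, ...)
  col_names <- names(chunk)
  pieces <- list(chunk[keep(chunk), , drop = FALSE])

  repeat {
    # An exhausted connection gives either an error or a zero-row data
    # frame; both are treated as end of input. Note that this also hides
    # genuine parse errors, so a real script may want to be stricter.
    chunk <- tryCatch(
      read.table(con, header = FALSE, nrows = chunk_size,
                 col.names = col_names, ...),
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    pieces[[length(pieces) + 1]] <- chunk[keep(chunk), , drop = FALSE]
  }

  # Reconstruct the selected records into a single data frame.
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}
```

For the file in the question, a call might look like
read_subset('~/data.txt', function(d) d$chromosome == '1').  If a column
could be guessed as a different type in different chunks, passing
'colClasses' through to read.table keeps the pieces consistent.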

On Fri, Jul 11, 2008 at 12:25 PM, Stacey Burrows
<stacey.burrows at yahoo.ca> wrote:
> I have a huge dataset for which I only want to read in a subset of it. Is it possible to use read.table to read in only a subset of the data? For example, something like read.table('~/data.txt', subset = chromosome=='1' )
>
> If not, then why not? This seems to be a feature available in all other statistical software.
>
> Thanks,
> Stacey
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


