[R] How to randomly extract a number of rows in a data frame

William Dunlap wdunlap at tibco.com
Fri Aug 1 21:12:51 CEST 2014


Do you know how to extract some rows of a data.frame?  A short answer
is with subscripts, either integer,
   first10 <- 1:10
   dFirst10 <- d[first10, ] # I assume your data.frame is called 'd'
or logical
   plus4 <- d[, "Col_4"] == "+"
   dPlus4 <- d[ plus4, ]
If you are not familiar with that sort of thing, read "An Introduction
to R", the manual that comes with R.

So you can solve your problem if you can generate a vector containing
1 million integers in the range 1:nrow(d).  Use the sample function for
that.  You must decide if you want to allow duplicate rows or not
(i.e., sampling with or without replacement). Type
  ?sample
to see the details.
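
Putting the two pieces together, here is a minimal sketch of what that
could look like.  The small data.frame is made up so the code runs on
its own; the names 'n', 'rowsNoDup', 'rowsDup', 'dNoDup' and 'dDup' are
only illustrative, and with the real data 'n' would be 1e6.

```r
# Small stand-in for the real data; 'd' and the column names follow the thread.
set.seed(1)  # only so the sketch is repeatable
d <- data.frame(
  Col_1 = "chr1",
  Col_2 = 3000000L + seq_len(100L),
  Col_3 = 3000035L + seq_len(100L),
  Col_4 = sample(c("+", "-"), 100, replace = TRUE),
  stringsAsFactors = FALSE
)

n <- 10  # number of rows to draw; 1e6 for the real data

# Without replacement: each row can be picked at most once.
rowsNoDup <- sample(nrow(d), n)
dNoDup <- d[rowsNoDup, ]

# With replacement: the same row may be picked more than once.
rowsDup <- sample(nrow(d), n, replace = TRUE)
dDup <- d[rowsDup, ]
```

The sampled rows come back in random order.  If the original row order
matters, sorting the indices first, as in d[sort(rowsNoDup), ], should
preserve it.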


Bill Dunlap
TIBCO Software
wdunlap at tibco.com


On Fri, Aug 1, 2014 at 11:58 AM, Stephen HK Wong <honkit at stanford.edu> wrote:
> Dear ALL,
>
> I have a data frame that contains 4 columns and tens of millions of rows, like the one below. I want to extract, say, 1 million rows at random. Can you tell me how to do that in R using base packages? Many thanks!
>
> Col_1   Col_2   Col_3   Col_4
> chr1    3000215 3000250 -
> chr1    3000909 3000944 +
> chr1    3001025 3001060 +
> chr1    3001547 3001582 +
> chr1    3002254 3002289 +
> chr1    3002324 3002359 -
> chr1    3002833 3002868 -
> chr1    3004565 3004600 -
> chr1    3004945 3004980 +
> chr1    3004974 3005009 -
> chr1    3005115 3005150 +
> chr1    3005124 3005159 +
> chr1    3005240 3005275 -
> chr1    3005558 3005593 -
> chr1    3005890 3005925 +
> chr1    3005929 3005964 +
> chr1    3005913 3005948 -
> chr1    3005913 3005948 -
>
> Stephen HK Wong
>


