[R] Select values at random by id value

James Martin just.struttin at gmail.com
Thu Jul 2 15:15:27 CEST 2009


Hadley, Sunil, and list,

This is not quite doing what I wanted (as far as I can tell); perhaps I did
not explain it thoroughly. It seems to be sampling one value for each day,
leaving ~200 observations. I need it to randomly choose one hab value for
each bird whenever there is more than one value for a given day. I will try
to give an example below.

id,date,location2,hab

1,05/23/06,0,1
1,05/23/06,0,2
1,05/23/06,0,1

So in this case the animal was located 3 times on May 23rd, but I only want
one of the locations, and instead of arbitrarily choosing one I want to
randomly sample one.
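
A minimal sketch of that sampling in base R, assuming the columns are named
as in the header above (id, date, location2, hab). The second bird (id 2) is
hypothetical and is included only to show that the grouping is per bird and
per day, not per day alone.

```r
# Example rows from the message, plus a hypothetical second bird (id 2)
# added purely for illustration.
df <- data.frame(
  id        = c(1, 1, 1, 2, 2),
  date      = c("05/23/06", "05/23/06", "05/23/06", "05/23/06", "05/24/06"),
  location2 = c(0, 0, 0, 0, 0),
  hab       = c(1, 2, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Split on the id/date combination rather than on date alone, so that each
# bird-day forms its own group; drop = TRUE discards empty combinations.
groups <- split(df, list(df$id, df$date), drop = TRUE)

# Draw one row at random from each group. sample(nrow(x), 1) samples from
# 1:nrow(x), so a group with a single row is returned unchanged.
pick_one <- function(x) x[sample(nrow(x), 1), , drop = FALSE]
sampled_df <- do.call(rbind, lapply(groups, pick_one))
rownames(sampled_df) <- NULL
```

Here sampled_df should hold one row per bird per day. With plyr, the
analogous change would presumably be to pass c("id", "date") instead of
"date" as the grouping variables to ddply.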

I hope I did a better job explaining my issue. Thanks in advance.

jm

On Wed, Jul 1, 2009 at 3:38 PM, hadley wickham <h.wickham at gmail.com> wrote:

> On Wed, Jul 1, 2009 at 2:10 PM, Sunil
> Suchindran<sunilsuchindran at gmail.com> wrote:
> > #Highlight the text below (without the header)
> > # read the data in from clipboard
> >
> > df <- do.call(data.frame, scan("clipboard", what=list(id=0,
> > date="",loctype=0 ,haptype=0)))
> >
> > # split the data by date, sample 1 observation from each split, and rbind
> >
> > sampled_df <- do.call(rbind, lapply(split(df,
> > df$date),function(x)x[sample(1:nrow(x), 1),]))
>
> ddply from the plyr package (http://had.co.nz/plyr), makes this sort
> of operation a little simpler:
>
> ddply(df, "date", function(df) df[sample(nrow(df), 1), ])
>
> Hadley
>
>
> --
> http://had.co.nz/
>



-- 
James A. Martin
850-445-9773

