[R] Select

Jeff Newmiller jdnewmil using dcn.davis.ca.us
Tue Feb 12 01:26:17 CET 2019


N <- 8 # however many times you want to do this
ans <- lapply( seq.int( N )
              , function( n ) {
                  # shuffle the rows, then keep whole groups until the
                  # cumulative count first reaches at least 40
                  idx <- sample( nrow( mydat ) )
                  mydat[ idx[ seq.int( which( 40 <= cumsum( mydat[ idx, "count" ] ) )[ 1 ] ) ], ]
                }
              )
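
For anyone who wants to paste and run this, here is a self-contained sketch of the same idea. It assumes the sample data from the original post; the helper name select_groups, its target argument, and the seed are hypothetical choices of mine for illustration, not anything established in the thread. It treats a running total of exactly 40 as sufficient, on the reading that "at least 40" is the requirement.

```r
# Sample data from the original post
mydat <- read.table( text = 'group  count
G1 25
G2 15
G3 12
G4 31
G5 10', header = TRUE, as.is = TRUE )

# Hypothetical helper (not from the thread): shuffle the rows, then keep
# whole groups until the running total of count first reaches `target`
select_groups <- function( dta, target = 40 ) {
  stopifnot( target <= sum( dta$count ) )  # otherwise no selection can succeed
  idx <- sample( nrow( dta ) )
  n_keep <- which( target <= cumsum( dta$count[ idx ] ) )[ 1 ]
  dta[ idx[ seq_len( n_keep ) ], ]
}

set.seed( 42 )  # arbitrary seed, only so the draws can be repeated
ans <- lapply( seq_len( 8 ), function( n ) select_groups( mydat ) )
totals <- sapply( ans, function( d ) sum( d$count ) )
```

Each element of ans is a data frame of whole groups, and totals holds the summed count for each draw, so all( totals >= 40 ) is a quick sanity check.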


On Mon, 11 Feb 2019, Val wrote:

> Sorry, Jeff and David, for not being clear!
>
> The total sample size should be at least 40, but the selection should
> be based on group ID. Different combinations of group IDs could give
> at least 40.
> If I select group G1 with a count of 25 and G2 with a count of 15,
> then I get the minimum of 40 counts. So G1 and G2 are
> selected.
> G1  25
> G2  15
>
> In another scenario, if G2, G3 and G4 are selected then the total
> count will be 58, which is greater than 40. So G2, G3 and G4 could
> be selected.
> G2 15
> G3 12
> G4 31
>
> So the restriction is to find group IDs that give a minimum of 40.
> Once I have reached the minimum of 40, I stop selecting groups and output
> the data.
>
> I hope this helps.
>
>
>
>
> On Mon, Feb 11, 2019 at 5:09 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
>>
>> This constraint was not clear in your original sample data set. Can you expand the data set to clarify how this requirement REALLY works?
>>
>> On February 11, 2019 3:00:15 PM PST, Val <valkremk using gmail.com> wrote:
>>> Thank you David.
>>>
>>> However, this will not work for me. If a group ID is selected then all
>>> of its observations should be included.
>>>
>>> On Mon, Feb 11, 2019 at 4:51 PM David L Carlson <dcarlson using tamu.edu>
>>> wrote:
>>>>
>>>> First expand your data frame into a vector where G1 is repeated 25
>>>> times, G2 is repeated 15 times, etc. Then draw random samples of 40
>>>> from that vector:
>>>>
>>>>> grp <- rep(mydat$group, mydat$count)
>>>>> grp.sam <- sample(grp, 40)
>>>>> table(grp.sam)
>>>> grp.sam
>>>> G1 G2 G3 G4 G5
>>>> 10  9  5 13  3
>>>>
>>>> ----------------------------------------
>>>> David L Carlson
>>>> Department of Anthropology
>>>> Texas A&M University
>>>> College Station, TX 77843-4352
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: R-help <r-help-bounces using r-project.org> On Behalf Of Val
>>>> Sent: Monday, February 11, 2019 4:36 PM
>>>> To: r-help using R-project.org (r-help using r-project.org) <r-help using r-project.org>
>>>> Subject: [R] Select
>>>>
>>>> Hi all,
>>>>
>>>> I have a data frame with two variables, group and its size.
>>>> mydat<- read.table( text='group  count
>>>> G1 25
>>>> G2 15
>>>> G3 12
>>>> G4 31
>>>> G5 10' , header = TRUE, as.is = TRUE )
>>>>
>>>> I want to select group IDs randomly (without replacement) until the
>>>> sum of count reaches 40.
>>>> So, in the first case, the data frame could be
>>>>    G4 31
>>>>    G5 10
>>>>
>>>> In other case, it could be
>>>>   G5 10
>>>>   G2 15
>>>>   G3 12
>>>>
>>>> How do I impose the restriction that the sum of the count variable is at least 40?
>>>>
>>>> Thank you in advance
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I want to select group  ids randomly until I reach the
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Sent from my phone. Please excuse my brevity.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil using dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


