[R] Sampling of non-overlapping intervals of variable length

Hadassa Brunschwig hadassa.brunschwig at mail.huji.ac.il
Mon Jul 20 07:08:12 CEST 2009


Thanks Chuck.

Oops, I did not think of the problem in that way.
That did exactly what I needed.  I have another complication to this problem:
I do not have just one vector 1:1e6 but several vectors of
different lengths, say 5 of them.
Initially, my intervals are distributed over those 5 vectors, and over the
ranges of those
5 vectors, in a specific way (and you might have guessed by now that I would like
to do something like a permutation test). Because I have this
additional level, I guess
I could do something like:

1) Sample the 5 vectors with probabilities proportional to the
frequencies of the initial
intervals on these vectors.
2) For each sampled vector: apply Chuck's solution.

?
Thanks a lot.
Hadassa
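
One possible reading of the two steps above, as a rough sketch only. The
inputs vec.len and orig.vec and the helpers place() and permute_once() are
made-up names for illustration and are not from the thread. The sketch
assumes each vector is long enough to hold the intervals assigned to it,
and it places the intervals with the slot construction from Chuck's reply,
shifted by (length - 1) per preceding interval so that every interval stays
inside its vector.

```r
# Hypothetical inputs, invented for illustration.
vec.len  <- c(1e6, 8e5, 5e5, 3e5, 1e5)           # lengths of the 5 vectors
lens     <- c(4, 6, 11, 2, 8, 14, 7, 2, 18, 32)  # interval lengths
orig.vec <- c(1, 1, 1, 2, 2, 3, 3, 4, 5, 5)      # vector each interval was on initially

# Start positions for non-overlapping intervals of the given lengths on 1:N,
# placed left to right in the order in which the lengths are supplied.
place <- function(N, lens) {
  k <- length(lens)
  if (k == 0) return(numeric(0))
  # k distinct slots among the free positions plus one slot per interval
  slots <- sort(sample.int(N - sum(lens) + k, k))
  # each earlier interval occupies (length - 1) positions beyond its slot
  slots + cumsum(c(0, head(lens, -1) - 1))
}

permute_once <- function(vec.len, lens, orig.vec) {
  freq <- tabulate(orig.vec, nbins = length(vec.len))
  # step 1: draw a vector for every interval, weighted by the initial frequencies
  new.vec <- sample.int(length(vec.len), length(lens), replace = TRUE, prob = freq)
  # shuffle the lengths so the left-to-right order within a vector is random
  out <- data.frame(vec = new.vec, len = lens[sample.int(length(lens))],
                    start = NA_real_)
  # step 2: within each sampled vector, place the intervals assigned to it
  for (v in unique(out$vec)) {
    i <- which(out$vec == v)
    out$start[i] <- place(vec.len[v], out$len[i])
  }
  out
}

perm <- permute_once(vec.len, lens, orig.vec)
```

Calling replicate(1000, permute_once(vec.len, lens, orig.vec), simplify = FALSE)
would give a list of 1000 such permutations. Whether weighting by the initial
frequencies (rather than, say, by vector length) is the right null model
depends on the test being built.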

On Sun, Jul 19, 2009 at 11:13 PM, Charles C. Berry<cberry at tajo.ucsd.edu> wrote:
> On Sun, 19 Jul 2009, Hadassa Brunschwig wrote:
>
>> Hi
>>
>> I am not sure what you mean by sampling an index of a group of
>> intervals. I will try to give an example:
>> Let's assume I have a vector 1:1000000. Let's say I have 10 intervals
>> of different but known length, say,
>> c(4,6,11,2,8,14,7,2,18,32). For simulation purposes I have to sample
>> those 10 intervals 1000 times.
>> The requirement is, however, that they should be of those lengths and
>> should not be overlapping.
>> In short, I would like to obtain a 10x1000 matrix with sampled intervals.
>
> Something like this:
>
>
> > lens <- c(4,6,11,2,8,14,7,2,18,32)
> > perm.lens <- sample(lens)
> >
> > sort(sample(1e06-sum(lens)+length(lens),length(lens)))+cumsum(c(0,head(perm.lens,-1)))
>
>  [1]  15424 261927 430276 445976 451069 546578 656123 890494 939714 969643
>
>
> The vector above gives the starting points for the intervals whose lengths
> are perm.lens.
>
> I'd bet every introductory combinatorics book has a problem or example in
> which the expression for the number of ways in which K ordered objects can
> be assigned to I groups consisting of n_i adjacent objects each is
> constructed. The construction is along the lines of the calculation above.
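
The construction described above could be wrapped in a small function, as a
sketch. The name sample_intervals() is made up for illustration. One
difference from the one-liner in the thread is deliberate and should be read
as an assumption on my part: the offsets here use (length - 1) for each
preceding interval. With the full lengths as offsets, the last interval can
end up to length(lens) - 1 positions past 1e6, and neighbouring intervals
always have a gap between them.

```r
# Sketch: sample non-overlapping intervals with the given lengths on 1:N.
# sample_intervals is a hypothetical helper name, not from the thread.
sample_intervals <- function(N, lens) {
  k <- length(lens)
  # random left-to-right order of the lengths (index-based, so k = 1 is safe)
  perm.lens <- lens[sample.int(k)]
  # treat each interval as one token: choose k slots among
  # (N - sum(lens)) free positions plus k tokens
  slots <- sort(sample.int(N - sum(lens) + k, k))
  # expand tokens back to full length: each earlier interval
  # occupies (length - 1) positions beyond its slot
  starts <- slots + cumsum(c(0, head(perm.lens, -1) - 1))
  cbind(start = starts, end = starts + perm.lens - 1)
}

lens <- c(4, 6, 11, 2, 8, 14, 7, 2, 18, 32)
one.draw <- sample_intervals(1e6, lens)

# 10 x 1000 matrix of start positions, one column per simulated draw
starts.mat <- replicate(1000, sample_intervals(1e6, lens)[, "start"])
```

Each column of starts.mat holds the sorted start positions for one draw. The
matching lengths are permuted separately in every draw, so keep the end
column as well if the length of each sampled interval is needed later.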
>
> HTH,
>
> Chuck
>
>
>>
>> Thanks
>> Hadassa
>>
>> On Sun, Jul 19, 2009 at 9:48 PM, David Winsemius<dwinsemius at comcast.net>
>> wrote:
>>>
>>> On Jul 19, 2009, at 1:05 PM, Hadassa Brunschwig wrote:
>>>
>>>> Hi,
>>>>
>>>> I hope I am not repeating a question which has been posed already.
>>>> I am trying to do the following in the most efficient way:
>>>> I would like to sample from a finite (large) set of integers n
>>>> non-overlapping
>>>> intervals, where each interval i has a different, set length L_i
>>>> (which is the number
>>>> of integers in the interval).
>>>> I had the idea to sample recursively on a vector with the already
>>>> chosen intervals
>>>> discarded but that seems to be too complicated.
>>>
>>> It might be ridiculously easy if you sampled on an index of a group of
>>> intervals.
>>> Why not pose the question in the form of example data.frames or other
>>> classes of R objects? Specification of the desired output would be
>>> essential. I think further specification of the sampling strategy would
>>> also
>>> help because I am unable to understand what sort of probability model you
>>> are hoping to apply.
>>>
>>>> Any suggestions on that?
>>>>
>>>> Thanks a lot.
>>>>
>>>> Hadassa
>>>>
>>>
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>>
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu               UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>
>



-- 
Hadassa Brunschwig
PhD Student
Department of Statistics
The Hebrew University of Jerusalem
http://www.stat.huji.ac.il



