[R] question R regarding consecutive numbers

David Winsemius dwinsemius at comcast.net
Fri Oct 28 17:52:03 CEST 2011


On Oct 28, 2011, at 11:26 AM, William Dunlap wrote:

> Are you looking for something like the following?  ifactor()
> is like factor but assumes that x is integral and that levels
> should be {1, 2, ..., max(x)} with no gaps.
>
>> x <- c(1,3,4,9,1,9,1,5,4,5,2,1,1,1,6)
>> ifactor <- function(x, levels=seq_len(max(0, x, na.rm=TRUE)))
>>     factor(x, levels=levels)
>> with(rle(x), table(ifactor(values), ifactor(lengths)))
>
>    1 2 3
>  1 3 0 1
>  2 1 0 0
>  3 1 0 0
>  4 2 0 0
>  5 2 0 0
>  6 1 0 0
>  7 0 0 0
>  8 0 0 0
>  9 2 0 0
>
> Also note that tbl["2","3"] does not mean the same as tbl[2,3],
> although if you use the ifactor function as above they will refer
> to the same element.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
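
The name-versus-position point can be checked directly. Below is a minimal sketch using the second example vector from this thread; the names tbl, itbl, by_position and by_name are chosen here purely for illustration, and ifactor() is copied from the message above.

```r
# Second example vector from the thread: runs of length 1 and 3 only.
x <- c(1,3,4,9,1,9,1,5,4,5,2,1,1,1,6)
runs <- rle(x)

# Plain table: only the run lengths that actually occur get a column.
tbl <- with(runs, table(values, lengths))
colnames(tbl)                   # "1" "3" (there is no "2" column)
by_position <- tbl["1", 2]      # second column, which is length 3 here
by_name     <- tbl["1", "3"]    # column labelled "3", the same element
# tbl["1", "2"] would fail with "subscript out of bounds"

# ifactor() as posted above: levels 1..max(x) with no gaps.
ifactor <- function(x, levels = seq_len(max(0, x, na.rm = TRUE)))
    factor(x, levels = levels)
itbl <- with(runs, table(ifactor(values), ifactor(lengths)))
itbl["1", 2]                    # 0: no runs of exactly two 1s
itbl["1", "2"]                  # same element, now addressed by name
sum(itbl["1", 2:ncol(itbl)])    # 1: runs of 1s of length two or more
```

With the gap-free levels, column k always corresponds to run length k, so positional and name indexing agree.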

I answered earlier in a posting that did not get threaded properly in
my mailer; this is a précis:

 > xtbl <- with(rle(x), table(values, lengths))
 > xtbl["1", which( as.numeric( attr(xtbl, "dimnames")$lengths) >=2 )]
3 5
1 1

Character-numeric comparisons do get coerced, so the as.numeric() call
can be dropped here. Note that the coercion is to character, so this
shortcut is only safe while all run lengths are single digits:

 > xtbl["1", which( attr(xtbl, "dimnames")$lengths >=2 )]
3 5
1 1
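
Whether as.numeric() matters can be tested on a vector with a long run. The sketch below uses a made-up vector y (not from the thread) containing a run of ten 1s. R compares a character value with a number by converting the number to character, so the comparison is on strings rather than magnitudes.

```r
# Mixed character/numeric comparisons are done as strings.
"3" >= 2     # TRUE
"10" >= 2    # FALSE: "10" sorts before "2"

# Hypothetical vector: one run of ten 1s and one run of three 1s.
y <- c(rep(1, 10), 2, rep(1, 3), 2)
ytbl <- with(rle(y), table(values, lengths))
len <- as.numeric(colnames(ytbl))    # 1, 3, 10 as numbers

n_numeric <- sum(ytbl["1", len >= 2])             # 2 runs of length >= 2
n_string  <- sum(ytbl["1", colnames(ytbl) >= 2])  # 1: the "10" column is missed

# Alternative that skips the table and works on the rle() output directly.
n_direct <- with(rle(y), sum(values == 1 & lengths >= 2))  # 2
```

Converting the column labels with as.numeric(), or counting straight from the rle() result, avoids depending on how the labels happen to sort as strings.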

-- 
David.
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Samir Benzerfa
>> Sent: Friday, October 28, 2011 12:35 AM
>> To: 'David Winsemius'; 'Duncan Murdoch'
>> Cc: r-help at r-project.org
>> Subject: Re: [R] question R regarding consecutive numbers
>>
>> In the general case, there is still a gap in your solution
>> sum( tbl["1", 2:ncol(tbl)] ). This solution refers to a specific column
>> number (here: column number 2) and not to the actual length of the run,
>> doesn't it? That is, in this simple example column number 2 actually
>> corresponds to the length "2", but this need not be the case in general.
>> For instance, if there is no run of length "2" but only runs of length
>> "1" and "3", column number 2 will refer to length "3" (try it with the
>> new vector below). I noticed this problem when applying your solution to
>> a much longer vector. So the problem is that I would have to check
>> manually whether the column number really corresponds to the run length.
>> A possible solution would be to force R to show all the lengths from
>> 1:ncol, even if there are no runs of some lengths in between, and just
>> fill those columns with zeros.
>>
>>> x=c(1,3,4,9,1,9,1,5,4,5,2,1,1,1,6)
>>
>> Any ideas how to solve this problem?
>>
>> Cheers, S.B.
>>
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: Thursday, October 27, 2011 16:44
>> To: Duncan Murdoch
>> Cc: Samir Benzerfa; r-help at r-project.org
>> Subject: Re: [R] question R regarding consecutive numbers
>>
>>
>> On Oct 27, 2011, at 9:21 AM, Duncan Murdoch wrote:
>>
>>> On 27/10/2011 8:43 AM, Samir Benzerfa wrote:
>>>> Hi everyone
>>>>
>>>> Do you know of any way in R to check for consecutive numbers in
>>>> vectors? That is, I do not only want to count the total number of
>>>> observations (which can be done with table(x)), but I also want to
>>>> count, for instance, how many times the vector contains a certain
>>>> number consecutively.
>>>>
>>>> For example, in the following vector x the number "1" appears 7 times.
>>>> However, I want to check, for instance, how many times two consecutive
>>>> 1's appear in the vector, which would be twice in the vector below.
>>>>
>>>>> x=c(1,1,3,4,9,1,9,1,5,4,5,2,1,1,1,6)
>>>>
>>>> Any ideas for this issue?
>>>
>>> How about this?
>>>
>>>> runs <- rle(x)
>>>> with(runs, table(values, lengths))
>>
>> And to go even a bit further, the table function returns a matrix
>> which can be indexed to yield the specific answer requested:
>>
>> > tbl <- with(runs, table(values, lengths))
>> > tbl["1", 2]
>> [1] 1  # number of runs of exactly length 2
>> > sum( tbl["1", 2:ncol(tbl)] )
>> [1] 2  # number of runs of length two or more
>>
>>
>> --
>> David
>>
>>>
>>
>>>       lengths
>>> values 1 2 3
>>>      1 2 1 1
>>>      2 1 0 0
>>>      3 1 0 0
>>>      4 2 0 0
>>>      5 2 0 0
>>>      6 1 0 0
>>>      9 2 0 0
>>>
>>> Duncan
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> West Hartford, CT

David Winsemius, MD
West Hartford, CT


