[R] Testing if all elements are equal in a vector/matrix

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Jun 16 16:59:40 CEST 2009


On Tue, 16 Jun 2009, jim holtman wrote:

> I think the only way that you are going to get it to stop on the first
> mismatch is to write your own function in C if you are concerned about the
> time.  Matching on character vectors will be even more costly since it is
> having to loop to check the equality of each character in each element.
> This is one of the places it might pay to convert to factors and then the
> comparison only uses the integer values assigned to the factors.

Not so in a recent R: character vectors are now compared by pointer in
the first instance, so (at least on a 32-bit platform) the comparison is
as fast as comparing integers.  And on x86_64 Linux:

> x <- as.character(c(1,2,rep(1,10000000)))
> system.time(print(all(x[1] == x)))
[1] FALSE
    user  system elapsed
   0.123   0.019   0.142

> system.time(xx <- as.factor(x))
    user  system elapsed
   9.874   0.284  10.159
> system.time(print(all(xx[1] == xx)))
[1] FALSE
    user  system elapsed
   0.511   0.145   0.656
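
For anyone who wants an early exit without dropping to C, one possible
middle ground is to compare in blocks from R and stop at the first block
that contains a mismatch.  The following is only a sketch: all_same and
its chunk argument are hypothetical names, not base R functions, and
whether it beats a single all(x[1] == x) will depend on where the first
mismatch occurs.

```r
# Hypothetical helper (not part of base R): compare x against its first
# element one block at a time and return as soon as a block mismatches.
all_same <- function(x, chunk = 100000L) {
  n <- length(x)
  if (n < 2L) return(TRUE)          # empty or single-element input
  first <- x[1L]
  start <- 2L
  while (start <= n) {
    end <- min(start + chunk - 1L, n)
    # isTRUE() makes a block containing NA count as "not all equal"
    if (!isTRUE(all(x[start:end] == first))) return(FALSE)
    start <- end + 1L
  }
  TRUE
}

x <- as.character(c(1, 2, rep(1, 1e6)))
all_same(x)                # FALSE, decided within the first block
all_same(rep("a", 10))     # TRUE
```

The block size is an arbitrary choice: smaller blocks exit sooner on an
early mismatch but add more R-level loop overhead when the vector really
is constant.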

Recent pre-release versions of R (e.g. 2.9.1 beta) allow

> system.time(anyDuplicated(x))
    user  system elapsed
   0.034   0.078   0.113
> system.time(anyDuplicated(xx))
    user  system elapsed
   0.037   0.076   0.113

which is probably what the original poster was looking for.
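
As a small illustration of what anyDuplicated() reports (useful when
deciding whether it fits a particular test): it returns the index of the
first element that repeats an earlier one, or 0 if no element is
repeated.  The vectors below are made-up toy data.

```r
# Toy data, chosen only to show the return value of anyDuplicated()
x <- c("1", "2", rep("1", 5))
anyDuplicated(x)                  # 3: x[3] repeats x[1]
anyDuplicated(c("a", "b", "c"))   # 0: no element is repeated

# The same call works on a factor
anyDuplicated(as.factor(x))       # 3
```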

>
> On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal <
> utkarsh.singhal at global-analytics.com> wrote:
>
>> Hi Jim,
>>
>> What you are saying is correct, although my computer may not be as fast:
>> I am getting the following for 10M entries:
>>
>>    user  system elapsed
>>   0.559   0.038   0.607
>>
>> Moreover, for character vectors the time more than doubles.
>>
>> In my modeling, which is already highly time-consuming, I need to do this
>> check for a few thousand vectors, and each vector can easily have 10M
>> entries. So I am just looking for any possible time saving.  I am
>> pretty sure that whenever the elements are not all equal, this can be
>> concluded from just a few entries (most of the time). It would be
>> worthwhile to find a way that stops checking the moment it finds two
>> distinct elements.
>>
>> Regards
>> Utkarsh
>>
>>
>>
>> jim holtman wrote:
>>
>> Just check that the first (or any other element) is equal to all the rest:
>>
>>> x = c(1,2,rep(1,10000000)) # 10,000,000
>>> system.time(print(all(x[1] == x)))
>> [1] FALSE
>>    user  system elapsed
>>    0.18    0.00    0.19
>>
>>>
>> This was for 10M entries.
>>
>> On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal <
>> utkarsh.singhal at global-analytics.com> wrote:
>>
>>>
>>> Hi All,
>>>
>>> There are several replies to the question below, but I think there must
>>> be a better way of doing this.
>>> I just want to check whether all the elements of a vector are the same. My
>>> vector has one million elements and it is highly likely that there are
>>> distinct elements among the first few. For example:
>>>
>>> > x = c(1,2,rep(1,100000))
>>>
>>> I want the answer to be FALSE, which is clear from the first two
>>> observations alone, so there is no need to check the rest.
>>>
>>> Does anybody know the most efficient way of doing this?
>>>
>>> Regards
>>> Utkarsh
>>>
>>>
>>>
>>> From: Francisco J. Zagmutt <gerifalte28_at_hotmail.com>
>>>
>>> Date: Tue 30 Aug 2005 - 06:05:20 EST
>>>
>>>
>>> Hi Doran
>>>
>>> The documentation for isTRUE reads 'isTRUE(x)' is an abbreviation of
>>> 'identical(TRUE,x)' so actually Vincent's solution is "cleaner" than
>>> using identical :)
>>>
>>> Cheers
>>>
>>> Francisco
>>>
>>> > From: "Doran, Harold" <HDoran at air.org>
>>> > To: <vincent.goulet at act.ulaval.ca>, <r-help at stat.math.ethz.ch>
>>> > Subject: Re: [R] Testing if all elements are equal in a vector/matrix
>>> > Date: Mon, 29 Aug 2005 15:49:20 -0400
>>> >
>>> > See ?identical
>>> >
>>> > -----Original Message-----
>>> > From: r-help-bounces at stat.math.ethz.ch
>>> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Vincent Goulet
>>> > Sent: Monday, August 29, 2005 3:35 PM
>>> > To: r-help at stat.math.ethz.ch
>>> > Subject: [R] Testing if all elements are equal in a vector/matrix
>>> >
>>> >
>>> > Is there a canonical way to check if all elements of a vector or
>>> > matrix are the same? Solutions below work, but look hackish to me.
>>> >
>>> >  > x <- rep(1, 10)
>>> >  > all(x == x[1]) # == operator does not provide for small differences
>>> > [1] TRUE
>>> >  > isTRUE(all.equal(x, rep(x[1], length(x)))) # ugly
>>> > [1] TRUE
>>> >
>>> > Best,
>>> >
>>> > Vincent
>>> > --
>>> >  Vincent Goulet, Associate Professor
>>> >  École d'actuariat
>>> >  Université Laval, Québec
>>> >  Vincent.Goulet_at_act.ulaval.ca   http://vgoulet.act.ulaval.ca
>>> >
>>> > ______________________________________________
>>> > R-help at stat.math.ethz.ch mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide!
>>> > http://www.R-project.org/posting-guide.html
>>> >
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>>
>>
>
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



