[R] unique with tolerance

Bert Gunter gunter.berton at gene.com
Thu Sep 6 19:24:12 CEST 2012


... and if Duncan's suggestion won't do, maybe approaching it via
clustering might be useful.
But do note that, as stated, the problem is not well defined, because
transitivity fails: consider

v <- c(1,2,3,4,5,10)

with a tolerance of <=2. Then 1 is the same as 2 and 3, 2 and 3 are
the same as 4, but 1 is not the same as 4, etc. Exactly what would you
choose as the "unique" values with this tolerance in this situation?
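
To make both points concrete, here is a minimal sketch in base R. The
helper name unique_tol and its pick argument are made up for illustration
(nothing by that name exists in base R, as far as I know), and
complete-linkage clustering is just one assumed way to force an answer.
Cutting the tree at h = tol guarantees that no two values in a group
differ by more than tol, but where the group boundaries fall in a chain
like v depends on how ties are broken.

```r
# Illustrative sketch only: `unique_tol` is a hypothetical helper.
v   <- c(1, 2, 3, 4, 5, 10)
tol <- 2

# Pairwise "same within tolerance" relation; note it is not transitive.
same <- abs(outer(v, v, "-")) <= tol
same[1, 3]  # TRUE:  |1 - 3| <= 2
same[3, 4]  # TRUE:  |3 - 4| <= 2
same[1, 4]  # FALSE: |1 - 4| >  2

unique_tol <- function(x, tol, pick = max) {
  if (length(x) < 2L) return(x)  # hclust() needs at least two values
  # Complete linkage merges at the largest pairwise distance, so cutting
  # at h = tol keeps every within-group difference <= tol.
  grp <- cutree(hclust(dist(x), method = "complete"), h = tol)
  # One representative per group; pick could also be min or mean.
  as.vector(tapply(x, grp, pick))
}

unique_tol(v, tol)  # grouping of 1..5 depends on tie-breaking
unique_tol(c(1.02, 2.03, 1.00, 3.04, 3.06), tol = 0.1)
```

For Michael's v2 the groups are well separated, so the choice of linkage
should not matter there; for something like v it matters a great deal,
which is exactly the problem.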

-- Bert

On Thu, Sep 6, 2012 at 9:47 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> On 06/09/2012 6:48 AM, Michael Bach wrote:
>>
>> Dear R Users and Developers,
>>
>> I am trying to do the equivalent of
>>
>> v <- c(1,2,3,3,2,1)
>> vu <- unique(v)
>>
>> for a vector such as
>>
>> v2 <- c(1.02, 2.03, 1.00, 3.04, 3.06)
>> vut <- ...
>>
>> As indicated in the subject, we need approximately unique values with a
>> defined tolerance, i.e. for the v2 vector the resulting vut vector using
>> a tolerance of .1 should return e.g.
>>
>> [1] 1.02 2.03 3.06
>>
>> Also, mean/min values instead of max could be returned.
>>
>> My actual question: Is there a convenience function or other mechanism
>> already implemented that could do something similar?
>
>
> It might be enough to round your values before checking.  For the example,
>
> dups <- duplicated( round(v2) )
> v2[!dups]
>
> (This gives 3.04 rather than 3.06; I don't know if you care.)
>
> Duncan Murdoch



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



