[Rd] Ordering of values returned by unique

Witold Eryk Wolski wolski at molgen.mpg.de
Wed Sep 29 18:24:28 CEST 2004


Hi!

Thanks for the explanation and for pointing me to the Value section of the 
documentation and to the function duplicated. Indeed, the Value section 
of the documentation for unique states:
"An object of the same type of 'x'. but if an element is equal to one 
with a smaller index, it is removed." I could have found that myself.
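
For the record, here is a small sketch of how I now read that sentence, 
using the vector from my first mail. The helper name first_seen is only 
made up for illustration and is not part of base R; the sketch assumes 
the default vector method of unique.

```r
# Example vector from the original question.
x <- c(2, 2, 1, 2)

# duplicated() flags every element that equals one with a smaller index.
dup <- duplicated(x)

# Hypothetical helper: keep only the first occurrence of each value.
first_seen <- function(v) v[!duplicated(v)]

# Both approaches keep the values in order of first appearance.
same <- identical(unique(x), first_seen(x))
print(same)
```

If the sorted result is wanted instead, as in the Matlab output, 
sort(unique(x)) should give it.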

/E


Tony Plate wrote:

> AFAIK, it has always worked that way in S-plus and R.  Furthermore, 
> the documentation in R for 'unique' says that it removes duplicated 
> elements.  This does seem to leave the possibility that an element other 
> than the first of a set of duplicates is retained, which could mess up 
> the order.  However, the documentation for 'duplicated' is clearer: it 
> says that 'duplicated' identifies duplicates of earlier elements.  
> Also in the examples for 'duplicated', it says that x[!duplicated(x)] 
> == unique(x) (paraphrased).
>
> I depend on this all the time, so I also checked some references.  In 
> the Blue book the documentation for the functions unique and 
> duplicated is combined and implies the above.  In MASS 4th Ed, the 
> page referred to by the index entry for 'unique' (p48, #9 in my copy) 
> states that 'unique' removes duplicates as identified by 'duplicated', 
> which implies that the order of retained elements is not changed.  The 
> Green book has no index entry for 'unique'.  In S-plus the 
> implementation of unique.default(x) uses x[!duplicated(x)].
>
> So, I think the evidence is pretty strong that unique(x) will always 
> return elements in the same order as they first appear in x.  But it 
> would be nice if the documentation for 'unique' explicitly stated that 
> this is the behavior for all methods.  (It does state this for the 
> array method for 'unique').
>
> -- Tony Plate
>
> At Wednesday 09:17 AM 9/29/2004, Witold Eryk Wolski wrote:
>
>> Hi,
>>
>> Is the ordering of the returned values something I can rely on, a 
>> kind of standard, i.e. will a function called unique in R (in 
>> future versions) return the unique elements in order of their first 
>> occurrence?
>>
>> > x<-c(2,2,1,2)
>> > unique(x)
>> [1] 2 1
>>
>> It seems not to be the standard. E.g. Matlab:
>> >> x=[2,2,1,2]
>> x =
>>     2     2     1     2
>> >> unique(x)
>> ans =
>>     1     2
>>
>> I only noticed it because the way it works now is extremely 
>> useful for some applications (e.g. tree traversal), so I use it in a 
>> script. But I am a little worried about whether I can rely on this 
>> behaviour in future versions. Furthermore, can I assume that someone 
>> reading the code will expect it to work that way?
>> Or is it better to define an additional function?
>> keeporderunique<-function(x)
>> {
>>    # preallocate one slot per distinct value
>>    res<-rep(NA,length(unique(x)))
>>    count<-0
>>    for(i in x)
>>    {
>>        # append i only if it has not been seen yet
>>        if(!i%in%res)
>>            {
>>                    count<-count+1
>>                    res[count]<-i
>>            }
>>    }
>>    res
>> }
>>
>> /E
>>
>>
>>
>> -- 
>> Dipl. bio-chem. Witold Eryk Wolski
>> MPI-Moleculare Genetic
>> Ihnestrasse 63-73 14195 Berlin           _
>> tel: 0049-30-83875219                   'v'
>> http://www.molgen.mpg.de/~wolski       /   \
>> mail: witek96 at users.sourceforge.net  ---W-W----
>>      wolski at molgen.mpg.de
>
>


-- 
Dipl. bio-chem. Witold Eryk Wolski         
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin           _
tel: 0049-30-83875219                   'v'
http://www.molgen.mpg.de/~wolski       /   \
mail: witek96 at users.sourceforge.net  ---W-W----
      wolski at molgen.mpg.de


