[R] How to union the elements in a list?

Martin Morgan mtmorgan at fhcrc.org
Wed Oct 28 21:01:56 CET 2009


Bert Gunter <gunter.berton at gene.com> writes:

> ... and just for amusement: unique(do.call(c,l))
>
> The do.call and unlist approaches should be faster than Reduce; do.call
> _may_ be marginally faster than unlist. Here's a timing comparison:

For large named lists, unlist(l, use.names=FALSE) can have important
performance consequences. By default, individual names are created for
each element, then immediately discarded by unique():

  > unlist(list(a=1:3, b=1:4))
  a1 a2 a3 b1 b2 b3 b4
   1  2  3  1  2  3  4
  > unique(unlist(list(a=1:3, b=1:4)))
  [1] 1 2 3 4
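
A rough way to see the effect, as a sketch only: the data below are
made up (smaller than in the quoted timing), and the timings will vary
by machine and R version, so none are shown here.

```r
## Hypothetical test data: a named list of 10,000 integer vectors.
## split() names the elements "1", "2", ... after the grouping values.
set.seed(1)
z <- split(sample(1000, 1e5, replace = TRUE), rep(seq_len(1e4), 10))

## Default: unlist() builds a name for every element, unique() drops them.
u1 <- unique(unlist(z))

## use.names=FALSE: the names are never built in the first place.
u2 <- unique(unlist(z, use.names = FALSE))

## Same answer either way; only the work done differs.
stopifnot(identical(u1, u2))

## Compare the elapsed times on your own machine.
t_named   <- system.time(unique(unlist(z)))
t_unnamed <- system.time(unique(unlist(z, use.names = FALSE)))
print(rbind(named = t_named, unnamed = t_unnamed)[, 1:3])
```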


>
>> z <- split(sample(1000,1e6,rep=TRUE),rep(1:1e5,10))
>> length(z)
> [1] 100000
>
> ## the comparisons:
>
>> system.time(y1 <- Reduce(union,z))
>    user  system elapsed 
>    5.02    0.00    5.03 
>
>> system.time(y2 <- unique(unlist(z)))
>    user  system elapsed 
>    1.92    0.00    1.92 
>
>> system.time(y3 <- unique(do.call(c,z)))
>    user  system elapsed 
>    1.75    0.00    1.75  
>
>> identical(y1,y2)
> [1] TRUE
>> identical(y2,y3)
> [1] TRUE
>
> Obviously, this is unlikely to matter for any reasonable size dataset, but
> maybe it's instructive. 
>
> Of course, Reduce wins the RGolf contest  ;-)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>  
>  
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Ben Bolker
> Sent: Wednesday, October 28, 2009 12:27 PM
> To: r-help at r-project.org
> Subject: Re: [R] How to union the elements in a list?
>
>
>
>
> Peng Yu wrote:
>> 
>> Suppose that I have a list of vectors. I want to compute the union of
>> all the vectors in the list. I could use 'for' loop to do so. But I'm
>> wondering what would be a better solution that does not need a 'for'
>> loop.
>> 
>> l=list(a=c(1,3,4), b=c(1,3,6), c=c(1,3,7), ....)
>> 
>> 
>
> Reduce(union,l)
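
For a quick check that the approaches in this thread agree, here is a
small sketch. It uses only the three vectors written out in the
original question; whatever the "...." stood for is left out.

```r
## Only the three vectors shown in the question (the rest was elided).
l <- list(a = c(1, 3, 4), b = c(1, 3, 6), c = c(1, 3, 7))

## Reduce folds union() pairwise: union(union(l$a, l$b), l$c)
r1 <- Reduce(union, l)

## Flatten once, then de-duplicate; use.names=FALSE skips name creation.
r2 <- unique(unlist(l, use.names = FALSE))

## do.call(c, l) is equivalent to c(a = l$a, b = l$b, c = l$c)
r3 <- unique(do.call(c, l))

print(r1)   # [1] 1 3 4 6 7
stopifnot(identical(r1, r2), identical(r2, r3))
```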

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



