[R] Very Slow Gower Similarity Function

Jari Oksanen jari.oksanen at oulu.fi
Mon Apr 18 18:58:29 CEST 2005


On 18 Apr 2005, at 19:10, Tyler Smith wrote:

> Hello,
>
> I am a relatively new user of R. I have written a basic function to
> calculate the Gower similarity function. I was motivated to do so
> partly as an exercise in learning R, and partly because the existing
> option (vegdist in the vegan package) does not accept missing values.
>
Speed is the reason to use C instead of R. It should be easy, almost 
trivial, to modify vegdist.c so that it handles missing values. I 
guess this handling means ignoring a value pair if either of its values 
is missing -- which is not so gentle to the metric properties so dear 
to Gower. Package vegan is designed for ecological community data, which 
generally do not have missing values (except in environmental data), 
but contributions are welcome.
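
To make the "ignore the pair" idea concrete, here is a minimal sketch in
plain C. It is not the code in vegdist.c: the function names
(column_ranges, gower_pair) are made up for illustration, missing values
are assumed to be coded as NaN, the matrix is assumed to be stored column
by column as R does, and columns with zero range are simply skipped,
which is one possible choice among several. Inside R's own C code one
would normally test with the ISNAN macro from the R headers instead of
isnan.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Range (max - min) of every column of a column-major nrow x ncol matrix,
 * computed over the non-missing (non-NaN) entries only.
 * A column with no usable values gets range 0. */
void column_ranges(const double *x, size_t nrow, size_t ncol, double *range)
{
    for (size_t k = 0; k < ncol; k++) {
        double lo = INFINITY, hi = -INFINITY;
        for (size_t i = 0; i < nrow; i++) {
            double v = x[i + k * nrow];
            if (isnan(v))
                continue;               /* missing value: ignore */
            if (v < lo) lo = v;
            if (v > hi) hi = v;
        }
        range[k] = (hi >= lo) ? hi - lo : 0.0;
    }
}

/* Gower dissimilarity between rows i and j for quantitative variables:
 * the mean of |x_ik - x_jk| / range_k over the columns k where both
 * values are present. Returns NaN if the rows share no usable column.
 * The similarity is 1 minus this value. */
double gower_pair(const double *x, size_t nrow, size_t ncol,
                  size_t i, size_t j, const double *range)
{
    double sum = 0.0;
    size_t used = 0;

    assert(i < nrow && j < nrow);
    for (size_t k = 0; k < ncol; k++) {
        double a = x[i + k * nrow];
        double b = x[j + k * nrow];
        /* Skip the pair if either value is missing, and skip
         * constant columns to avoid dividing by zero. */
        if (isnan(a) || isnan(b) || !(range[k] > 0.0))
            continue;
        sum += fabs(a - b) / range[k];
        used++;
    }
    return used > 0 ? sum / (double) used : NAN;
}
```

Because the divisor is the number of usable columns, which differs from
one pair of rows to the next, the resulting dissimilarities need not obey
the triangle inequality. That is the loss of metric properties mentioned
above.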

> I think I have succeeded - my function gives me the correct values.
> However, now that I'm starting to use it with real data, I realise
> it's very slow. It takes more than 45 minutes on my Windows 98 machine
> (R 2.0.1 Patched (2005-03-29)) with a 185x32 matrix with ca 100
> missing values. If anyone can suggest ways to speed up my function I
> would appreciate it. I suspect having a pair of nested for loops is
> the problem, but I couldn't figure out how to get rid of them.

cheers, jari oksanen
--
Jari Oksanen, Oulu, Finland



