[R] Transport and Earth Mover's Distance

Schuhmacher, Dominic dominic.schuhmacher at mathematik.uni-goettingen.de
Thu Mar 9 17:44:27 CET 2017

> On 08.03.2017 at 11:28, Schuhmacher, Dominic <dominic.schuhmacher at mathematik.uni-goettingen.de> wrote:
> ...
>>> If you have no particular need for binning, check out the function
>>> pppdist in the R-package spatstat, which offers a more flexible way
>>> to deal with point patterns of different size.
>> Well, this is not clear, but possibly very important for me.
>> My raw data consists of 2 univariate samples of unequal length.
>> Suppose that
>> x<-rnorm(100)
>> and
>> y<-rnorm(90)
>> Is there a way to define the Wasserstein distance between them without
>> going through the binning procedure?
> Define, yes: the 1-Wasserstein distance in one dimension is the area between the empirical cumulative distribution functions. If the samples have the same length, this can be computed directly by
> mean(abs(sort(x)-sort(y)))
> Otherwise this needs some lines of code. I will include it in the next version of the transport package (soon).
> Best regards,
> Dominic
Following up on this earlier post: transport 0.8-2, which is now on CRAN, can compute the Wasserstein distance between univariate samples of differing lengths (more precisely, between their empirical distributions).

x <- rnorm(100)
y <- rnorm(90)
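
To make explicit what is being computed, here is a minimal base-R sketch of the area between the two empirical distribution functions. The helper name wasserstein1_ecdf is made up for illustration and is not part of the transport package. As far as I know, the corresponding package function is wasserstein1d(x, y), which for p = 1 should return the same value.

```r
# Sketch only: 1-Wasserstein distance between the empirical distributions
# of two univariate samples, computed as the integral of |F_x(t) - F_y(t)|.
# 'wasserstein1_ecdf' is a hypothetical helper, not a function from 'transport'.
wasserstein1_ecdf <- function(x, y) {
  Fx <- ecdf(x)
  Fy <- ecdf(y)
  # Both ECDFs are step functions that only jump at the pooled sample points.
  knots <- sort(unique(c(x, y)))
  if (length(knots) < 2) return(0)
  widths <- diff(knots)
  # ECDFs are right-continuous, so on [t_i, t_{i+1}) they equal their value at t_i.
  left <- knots[-length(knots)]
  sum(abs(Fx(left) - Fy(left)) * widths)
}

set.seed(1)
x <- rnorm(100)
y <- rnorm(90)
d_unequal <- wasserstein1_ecdf(x, y)

# Sanity check against the equal-length formula from the earlier post.
x2 <- rnorm(100)
y2 <- rnorm(100)
d_equal <- wasserstein1_ecdf(x2, y2)
d_sorted <- mean(abs(sort(x2) - sort(y2)))
```

For samples of equal length, d_equal and d_sorted should agree up to floating-point error, since both measure the same area between the two step functions.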

Cheers, Dominic

Dominic Schuhmacher
Professor of Stochastics
University of Goettingen
