The `scipy.stats.wasserstein_distance` function returns only the minimum distance (the solution) between two input distributions, p and q. But that distance is the sum of the element-wise product of a distance matrix and an optimal transport matrix, both of which must have been computed inside the same function.

How can I extract the distance matrix and optimal transport matrix that correspond to the solution as 2nd and 3rd output arguments?

develarist

1 Answer

It does not appear possible to get the computed transport matrix from scipy's `wasserstein_distance`. Other packages do expose it, for example pyemd (https://github.com/wmayner/pyemd). I have been using that package for a while; it works well and runs quickly. Look at the function `emd_with_flow()` in the Usage section.

Note that the distance matrix is an input to the EMD calculation, not an output.
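
For the 1D case, both matrices can also be built by hand. The following is only a minimal sketch, not scipy's internals: `transport_plan_1d` is a hypothetical helper name, and it relies on the assumption that in 1D with cost |x - y| the monotone (sorted-order) coupling is optimal. It uses plain Python lists so it has no dependencies.

```python
def transport_plan_1d(xs, p, ys, q):
    """Sketch: distance matrix, transport matrix and cost for two 1D discrete
    distributions with support points xs, ys and weights p, q (equal total mass)."""
    # Distance matrix is built from the inputs alone: D[i][j] = |xs[i] - ys[j]|.
    D = [[abs(x - y) for y in ys] for x in xs]
    T = [[0.0] * len(ys) for _ in xs]

    # Walk both supports in sorted order (monotone coupling).
    ix = sorted(range(len(xs)), key=lambda i: xs[i])
    iy = sorted(range(len(ys)), key=lambda j: ys[j])
    rem_p = [float(w) for w in p]  # mass still to be shipped from each source
    rem_q = [float(w) for w in q]  # mass still to be received by each target

    a = b = 0
    while a < len(ix) and b < len(iy):
        i, j = ix[a], iy[b]
        moved = min(rem_p[i], rem_q[j])
        T[i][j] += moved
        rem_p[i] -= moved
        rem_q[j] -= moved
        # Advance past whichever side has been used up (at least one always is).
        if rem_p[i] <= 1e-12:
            a += 1
        if rem_q[j] <= 1e-12:
            b += 1

    # Total cost: sum of the element-wise product of D and T.
    cost = sum(D[i][j] * T[i][j] for i in range(len(xs)) for j in range(len(ys)))
    return D, T, cost


# Illustrative inputs (made up for this example).
D, T, cost = transport_plan_1d([0, 1], [0.5, 0.5], [0, 1, 2], [0.25, 0.5, 0.25])
print(D)
print(T)
print(cost)
```

The rows of `T` sum to `p` and the columns to `q`. If scipy is installed, `cost` can be compared against `scipy.stats.wasserstein_distance(xs, ys, p, q)` as a sanity check.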

    I'm also aware of another package, `pot`, but the question here is about scipy, which I know is intended for 1D source and target distributions, and that is all my application needs. Looking at the scipy code, I found that `wasserstein_distance` calls another, more general function that doesn't even calculate the transport matrix: https://stackoverflow.com/questions/65131513/transport-matrix-is-missing-in-the-code-behind-scipy-stats-wasserstein-distance – develarist Dec 03 '20 at 21:01