Distance measure used by phom.dist() #26

Closed
sarahsamorodnitsky opened this issue Mar 21, 2024 · 3 comments

Comments

@sarahsamorodnitsky

Hello! I am using the phom.dist() function to compute the distance between persistence diagrams. Can you clarify what distance measure is being computed by this function? Is there a reference/citation/source for the distance measure being computed? I was under the impression phom.dist() returned the Wasserstein distance based on the function naming, but looking at a previous issue (#13) I see that that isn't the case.

Thanks!

@corybrunson
Collaborator

corybrunson commented Mar 23, 2024

Digging into the code in 'inference.R', it looks like the distance is a vector with one entry per dimension. Each entry is the sum of the absolute differences between the sorted feature lifespans (rather than birth–death coordinates) in that dimension, with each difference raised to the power $q$:

$$D_q(X,Y)[d] = \sum_{k=1}^{n_d} \lvert \ell(x_k) - \ell(y_k) \rvert ^ q$$

where $d$ ranges over dimensions, $n_d$ is the larger of the two counts of $d$-dimensional features in $X$ and $Y$, $\ell(x)$ is the lifespan of feature $x$, and the features $x_k$ and $y_k$ are indexed in descending order of lifespan.
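To make the formula concrete, here is a minimal Python sketch of it. This is not the package's code: the function name `lifespan_distance` and the `(dimension, birth, death)` tuple layout are made up for illustration, and padding the shorter lifespan list with zeros up to $n_d$ is an assumption about how unmatched features are handled.

```python
def lifespan_distance(diag_x, diag_y, q=2):
    """Per-dimension sum of |lifespan difference| ** q.

    Each diagram is a list of (dimension, birth, death) tuples
    (a hypothetical layout chosen for this sketch).
    ASSUMPTION: the shorter lifespan list is padded with zeros up to n_d.
    """
    def sorted_lifespans(diagram, dim):
        # Lifespans (death - birth) of one dimension, longest first.
        return sorted(
            (death - birth for d, birth, death in diagram if d == dim),
            reverse=True,
        )

    dims = sorted({d for d, _, _ in diag_x} | {d for d, _, _ in diag_y})
    result = {}
    for dim in dims:
        lx = sorted_lifespans(diag_x, dim)
        ly = sorted_lifespans(diag_y, dim)
        n_d = max(len(lx), len(ly))
        lx += [0.0] * (n_d - len(lx))  # assumed zero-padding
        ly += [0.0] * (n_d - len(ly))
        # Pair the k-th longest lifespans and sum the powered differences.
        result[dim] = sum(abs(a - b) ** q for a, b in zip(lx, ly))
    return result


X = [(0, 0.0, 1.0), (0, 0.0, 0.5), (1, 0.25, 0.75)]
Y = [(0, 0.0, 0.75), (1, 0.5, 0.75), (1, 0.25, 0.5)]
print(lifespan_distance(X, Y, q=2))  # {0: 0.3125, 1: 0.125}
```

Note that only lifespans enter the computation, so two features with the same lifespan but different birth times contribute a difference of zero.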

@rrrlw may want to chime in. I'm not sure when the package might be upgraded, but certainly clarifying this, and hopefully providing the Wasserstein distance, will be part of that.

@sarahsamorodnitsky
Author

Any intuition as to why this distance is recommended/used over the p-Wasserstein distance to compare persistence diagrams? I haven't seen this distance measure in the literature, though my literature search has not been exhaustive.

Thanks again!

@corybrunson
Collaborator

I didn't contribute to it and I don't know a reference for it. My intuition is that it's much less complicated and expensive to compute, though it would certainly be good to provide an explicit rationale.
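As a rough illustration of the cost argument, the sketch below computes a p-Wasserstein distance between two tiny diagrams by brute force. It is not taken from any package: the name `wasserstein_bruteforce` is hypothetical, the sup norm is assumed as the ground metric, and the points are assumed to belong to a single dimension. The point is that the Wasserstein distance requires an optimal matching between points (with the diagonal as a fallback partner), whereas the lifespan-based distance only needs a sort.

```python
from itertools import permutations


def wasserstein_bruteforce(pts_x, pts_y, p=2):
    """p-Wasserstein distance between two small lists of (birth, death) points.

    ASSUMPTIONS: sup-norm ground metric, one homological dimension.
    Tries every matching, so it is only usable for a handful of points.
    """
    def diag(pt):
        # Closest point on the diagonal under the sup norm.
        mid = (pt[0] + pt[1]) / 2
        return (mid, mid)

    # Augment each side with diagonal copies of the other side's points,
    # so every point may be matched to the diagonal instead of a real point.
    a = [(pt, False) for pt in pts_x] + [(diag(pt), True) for pt in pts_y]
    b = [(pt, False) for pt in pts_y] + [(diag(pt), True) for pt in pts_x]

    def cost(u, v):
        (pu, u_diag), (pv, v_diag) = u, v
        if u_diag and v_diag:
            return 0.0  # diagonal to diagonal is free
        if u_diag:
            pu = diag(pv)  # a point sent to the diagonal uses its own projection
        if v_diag:
            pv = diag(pu)
        return max(abs(pu[0] - pv[0]), abs(pu[1] - pv[1])) ** p

    best = min(
        sum(cost(u, b[j]) for u, j in zip(a, perm))
        for perm in permutations(range(len(b)))
    )
    return best ** (1 / p)


print(wasserstein_bruteforce([(0.0, 1.0)], [(0.0, 0.5)]))  # 0.5
```

Practical implementations replace the exhaustive search with an assignment solver such as the Hungarian algorithm, which is still considerably more work than sorting lifespans.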
