Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel matching #330

Closed
wants to merge 17 commits into from
Closed

Conversation

szymon-zygula
Copy link

This PR adds a parallel matching algorithm using relaxed memory ordering atomic operations in the UnionFind, allowing for parallel path compression. Theoretically, this is overhead-free on x86 (because of total store order), other than some cache coherence magic. Some people may remember it from conversations at PLDI 2024. This makes parallel matching scale well with the number of threads.

The functionality is hidden behind parallel-matching feature. When the feature is activated, you can set parallel_matching to true in Runner to use it.

Atomics in the UnionFind are always used, regardless of the parallel-matching feature. There could be two implementations, the new one and the old one, depending on parallel-matching, although in my opinion it would add unnecessary maintenance overhead. Opinions on this topic are welcome.

Copy link
Member

@mwillsey mwillsey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the contribution!

I like the idea of merging this, but I'm not willing to accept any breaking changes for it (when you aren't using the feature). I also think that the unionfind change isn't necessary to parallel matching; you should just be able to us the existing one.

Thoughts?

src/unionfind.rs Outdated
@@ -1,50 +1,66 @@
use crate::Id;
use crate::{AtomicId, Id};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need any changes to the unionfind to enable parallel matching.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not strictly necessary. Just to check I did some more benchmarking, and although the parallel path compression does help in some scenarios, the gains are not too big (under 10%), so using current UnionFind should not affect performance. I'm reversing this change then.

src/lib.rs Outdated Show resolved Hide resolved
src/rewrite.rs Outdated Show resolved Hide resolved
src/language.rs Outdated
@@ -28,11 +28,11 @@ use thiserror::Error;
///
/// See [`SymbolLang`] for quick-and-dirty use cases.
#[allow(clippy::len_without_is_empty)]
pub trait Language: Debug + Clone + Eq + Ord + Hash {
pub trait Language: Debug + Clone + Eq + Ord + Hash + Send + Sync {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's acceptable to change the bounds of Language at this point in egg's lifecycle. Getting rid of this I think is the main challenge the merging this.

Copy link
Author

@szymon-zygula szymon-zygula Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new trait MaybePar, which is

  • automatically implemented for every type when parallel-matching is not set,
  • automatically implemented for Sync + Send types when parallel-matching is not set.

Language, Analysis, Analysis::Data, and some other traits now have bounds involving this trait, so nothing changes when parallel-matching isn't set. Let me know what you think about this solution. If trait aliases ever get stabilized, they could be used instead.
Another solution could be creating a separate hierarchy of traits (ParLanguage, ParAnalysis, etc.), but this creates a lot of boilerplate code, and after trying this out I really don't think it's a good idea.

@szymon-zygula szymon-zygula requested a review from mwillsey August 30, 2024 11:01
@mwillsey
Copy link
Member

I think this is a cool contribution, but I will close it for now in favor of this commit I just added: d014800.

It should be quite easy to add (rule-based) parallelism on the client side without any further changes to egg. I have documented an example of doing so.

@mwillsey mwillsey closed this Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants