-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel matching #330
Parallel matching #330
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the contribution!
I like the idea of merging this, but I'm not willing to accept any breaking changes for it (when you aren't using the feature). I also think that the unionfind change isn't necessary to parallel matching; you should just be able to us the existing one.
Thoughts?
src/unionfind.rs
Outdated
@@ -1,50 +1,66 @@ | |||
use crate::Id; | |||
use crate::{AtomicId, Id}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need any changes to the unionfind to enable parallel matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not strictly necessary. Just to check I did some more benchmarking, and although the parallel path compression does help in some scenarios, the gains are not too big (under 10%), so using current UnionFind
should not affect performance. I'm reversing this change then.
src/language.rs
Outdated
@@ -28,11 +28,11 @@ use thiserror::Error; | |||
/// | |||
/// See [`SymbolLang`] for quick-and-dirty use cases. | |||
#[allow(clippy::len_without_is_empty)] | |||
pub trait Language: Debug + Clone + Eq + Ord + Hash { | |||
pub trait Language: Debug + Clone + Eq + Ord + Hash + Send + Sync { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's acceptable to change the bounds of Language
at this point in egg
's lifecycle. Getting rid of this I think is the main challenge the merging this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new trait MaybePar
, which is
- automatically implemented for every type when
parallel-matching
is not set, - automatically implemented for
Sync + Send
types whenparallel-matching
is not set.
Language
, Analysis
, Analysis::Data
, and some other traits now have bounds involving this trait, so nothing changes when parallel-matching
isn't set. Let me know what you think about this solution. If trait aliases ever get stabilized, they could be used instead.
Another solution could be creating a separate hierarchy of traits (ParLanguage
, ParAnalysis
, etc.), but this creates a lot of boilerplate code, and after trying this out I really don't think it's a good idea.
I think this is a cool contribution, but I will close it for now in favor of this commit I just added: d014800. It should be quite easy to add (rule-based) parallelism on the client side without any further changes to egg. I have documented an example of doing so. |
This PR adds a parallel matching algorithm using relaxed memory ordering atomic operations in the
UnionFind
, allowing for parallel path compression. Theoretically, this is overhead-free on x86 (because of total store order), other than some cache coherence magic. Some people may remember it from conversations at PLDI 2024. This makes parallel matching scale well with the number of threads.The functionality is hidden behind
parallel-matching
feature. When the feature is activated, you can setparallel_matching
totrue
inRunner
to use it.Atomics in the
UnionFind
are always used, regardless of theparallel-matching
feature. There could be two implementations, the new one and the old one, depending onparallel-matching
, although in my opinion it would add unnecessary maintenance overhead. Opinions on this topic are welcome.