Address Leiden numbering issue #4845
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright (c) 2022-2024, NVIDIA CORPORATION. | ||
* Copyright (c) 2022-2025, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
|
@@ -713,6 +713,57 @@ std::pair<size_t, weight_t> leiden( | |
|
||
detail::flatten_leiden_dendrogram(handle, graph_view, *dendrogram, clustering); | ||
|
||
// Get unique cluster id | ||
size_t local_num_verts = (*dendrogram).get_level_size_nocheck(0); | ||
rmm::device_uvector<vertex_t> unique_cluster_ids(local_num_verts, handle.get_stream()); | ||
|
||
thrust::copy(handle.get_thrust_policy(), | ||
clustering, | ||
clustering + local_num_verts, | ||
unique_cluster_ids.begin()); | ||
|
||
thrust::sort(handle.get_thrust_policy(), unique_cluster_ids.begin(), unique_cluster_ids.end()); | ||
|
||
unique_cluster_ids.resize(thrust::distance(unique_cluster_ids.begin(), | ||
thrust::unique(handle.get_thrust_policy(), | ||
unique_cluster_ids.begin(), | ||
unique_cluster_ids.end())), | ||
handle.get_stream()); | ||
|
||
if constexpr (multi_gpu) { | ||
auto recvcounts = cugraph::host_scalar_allgather( | ||
handle.get_comms(), unique_cluster_ids.size(), handle.get_stream()); | ||
|
||
std::vector<size_t> displacements(recvcounts.size()); | ||
std::exclusive_scan(recvcounts.begin(), recvcounts.end(), displacements.begin(), size_t{0}); | ||
rmm::device_uvector<vertex_t> allgathered_unique_cluster_ids( | ||
displacements.back() + recvcounts.back(), handle.get_stream()); | ||
cugraph::device_allgatherv(handle.get_comms(), | ||
Review comment: Doing an allgatherv assumes that the entire list of cluster ids fits in the available memory of every GPU. It's not clear to me that this is a safe assumption for a large graph on thousands of GPUs that doesn't cluster well. It's probably safer (for scalability purposes) to shuffle the ids to different GPUs and have each GPU generate its own unique subset. So I'd suggest:
|
||
unique_cluster_ids.begin(), | ||
allgathered_unique_cluster_ids.begin(), | ||
recvcounts, | ||
displacements, | ||
handle.get_stream()); | ||
|
||
thrust::sort(handle.get_thrust_policy(), | ||
allgathered_unique_cluster_ids.begin(), | ||
allgathered_unique_cluster_ids.end()); | ||
|
||
allgathered_unique_cluster_ids.resize( | ||
thrust::distance(allgathered_unique_cluster_ids.begin(), | ||
thrust::unique(handle.get_thrust_policy(), | ||
allgathered_unique_cluster_ids.begin(), | ||
allgathered_unique_cluster_ids.end())), | ||
handle.get_stream()); | ||
|
||
detail::relabel_cluster_ids<vertex_t, multi_gpu>( | ||
handle, allgathered_unique_cluster_ids, clustering, local_num_verts); | ||
|
||
} else { | ||
detail::relabel_cluster_ids<vertex_t, multi_gpu>( | ||
handle, unique_cluster_ids, clustering, local_num_verts); | ||
} | ||
|
||
Review comment: LGTM, but please test it rigorously.
Review comment: You need to comment out the lines
||
return std::make_pair(dendrogram->num_levels(), modularity); | ||
} | ||
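The added hunk collects the cluster ids, sorts and deduplicates them, and then hands the unique list to `detail::relabel_cluster_ids`. The body of that helper is not part of this diff, so the following is only a host-side sketch under the assumption that it maps each cluster id to its position in the sorted unique list. `relabel_consecutive` is a hypothetical name, and standard-library algorithms stand in for the thrust calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical host-side stand-in for detail::relabel_cluster_ids (assumption:
// each cluster id is replaced by its index in the sorted unique id list).
std::vector<int> relabel_consecutive(std::vector<int> const& clustering)
{
  // Copy, sort, and deduplicate, mirroring the copy / sort / unique sequence in the diff.
  std::vector<int> unique_ids(clustering);
  std::sort(unique_ids.begin(), unique_ids.end());
  unique_ids.erase(std::unique(unique_ids.begin(), unique_ids.end()), unique_ids.end());

  // Binary-search each original id in the unique list; its position is the new label.
  std::vector<int> relabeled(clustering.size());
  std::transform(clustering.begin(), clustering.end(), relabeled.begin(), [&](int id) {
    auto it = std::lower_bound(unique_ids.begin(), unique_ids.end(), id);
    assert(it != unique_ids.end() && *it == id);
    return static_cast<int>(std::distance(unique_ids.begin(), it));
  });
  return relabeled;
}
```

With this mapping, sparse ids such as 3, 7, and 42 become the consecutive labels 0, 1, and 2, which is the kind of numbering gap the PR title refers to.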
|
||
|
dendrogram->get_level_size_nocheck(0);
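The review comment on the allgatherv call suggests shuffling cluster ids so that each GPU deduplicates only a subset. The suggested code itself is not shown in the thread, so the following is a single-process sketch of one way that idea could work, with ranks simulated as vectors. The names `owner_of`, `shuffle_and_unique`, `label_offsets`, and `new_label` are hypothetical, the modulo owner rule is an assumption, and no cuGraph communication primitive is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical owner rule: the rank responsible for a (non-negative) cluster id.
inline std::size_t owner_of(int cluster_id, std::size_t num_ranks)
{
  assert(cluster_id >= 0 && num_ranks > 0);
  return static_cast<std::size_t>(cluster_id) % num_ranks;
}

// Simulated shuffle: per_rank_ids[r] holds the cluster ids seen on rank r.
// Returns, for each rank, the sorted unique ids that rank owns.
std::vector<std::vector<int>> shuffle_and_unique(
  std::vector<std::vector<int>> const& per_rank_ids)
{
  std::size_t const num_ranks = per_rank_ids.size();
  std::vector<std::vector<int>> owned(num_ranks);
  for (auto const& ids : per_rank_ids) {
    for (int id : ids) { owned[owner_of(id, num_ranks)].push_back(id); }
  }
  for (auto& ids : owned) {
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
  }
  return owned;
}

// Exclusive scan over the per-rank unique counts: the first new label of each rank.
std::vector<std::size_t> label_offsets(std::vector<std::vector<int>> const& owned)
{
  std::vector<std::size_t> offsets(owned.size());
  std::size_t running = 0;
  for (std::size_t r = 0; r < owned.size(); ++r) {
    offsets[r] = running;
    running += owned[r].size();
  }
  return offsets;
}

// New label of an id: its owner's offset plus its position in the owner's sorted list.
std::size_t new_label(int id,
                      std::vector<std::vector<int>> const& owned,
                      std::vector<std::size_t> const& offsets)
{
  std::size_t const rank = owner_of(id, owned.size());
  auto const& ids        = owned[rank];
  auto it                = std::lower_bound(ids.begin(), ids.end(), id);
  assert(it != ids.end() && *it == id);
  return offsets[rank] + static_cast<std::size_t>(std::distance(ids.begin(), it));
}
```

In this sketch only the per-rank counts need to be known everywhere, so no rank holds the full id list. The resulting labels are consecutive but do not preserve the original ordering of the ids across ranks. Whether that matters for this PR is not stated in the thread.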