Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(nns): Add pagination to list_neurons API #3358

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

max-dfinity
Copy link
Contributor

No description provided.

@github-actions github-actions bot added the feat label Jan 7, 2025
@max-dfinity max-dfinity force-pushed the msum/list-neurons-pagination branch 5 times, most recently from a654324 to 2d22844 Compare January 8, 2025 00:32
@max-dfinity max-dfinity marked this pull request as ready for review January 8, 2025 00:44
@max-dfinity max-dfinity requested review from a team as code owners January 8, 2025 00:44
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this pull request affects the behavior of any canister owned by
the Governance team, remember to update the corresponding
unreleased_changes.md file(s).

To acknowldge this reminder (and unblock the PR), dismiss this
code review by going to the bottom of the pull request page, and
supply one of the following reasons:

  1. Done.

  2. No canister behavior changes.

@@ -435,6 +435,7 @@ fn handle_list_neurons(
include_neurons_readable_by_caller: true,
include_empty_neurons_readable_by_caller: None,
include_public_neurons_in_full_neurons: None,
start_from_neuron_id: None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dfinity/finint - I am wondering if the API needs to be updated to take advantage of the paging. It's likely going to start at a limit of 500, so inactive neurons won't be returned, which means in practice no users will use the paging. However, if a user had over 500 active neurons, they would not get them all in a single request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that active neurons will always be in earlier pages than inactive neurons? How does that work with start_from_neuron_id? It can't be that inactive neurons always have larger neuron IDs, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@max-dfinity IIUC, this ListNeurons request type would need to be modified to also take a start_from_neuron_id value. Again IIUC, what is returned to the Rosetta client is an OperationOutput::ListNeuronsResponse enum variant, which contains the (in this PR) updated ic_nns_governance_api::pb::v1::ListNeuronsResponse with the next_start_from_neuron_id field. It would then be up to the user calling Rosetta to pass this value in a subsequent call to the list neurons construction endpoint. @fsodre probably has a better understanding!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mbjorkqvist

@fsodre would that be a good idea - to add the parameter to the Rosetta ListNeurons type? Alternately, we could have the Rosetta API do the subsequent retrievals. This would largely only affect users with lots of old inactive neurons. I'm not aware of any user with more than 500 active neurons, and I don't really understand the use case for the Rosetta API.

@max-dfinity max-dfinity force-pushed the msum/list-neurons-pagination branch from 6804e43 to 4fb4103 Compare January 8, 2025 00:48
@@ -3350,6 +3350,12 @@ pub struct ListNeurons {
/// existing (unmigrated) callers.
#[prost(bool, optional, tag = "4")]
pub include_public_neurons_in_full_neurons: Option<bool>,
/// If this is set, it skips all neuron IDs until this value is reached, then returns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the neuron ID have to be an ID that exists?
Or is it allowed to pass lastNeuronIdFromPreviousPage + 1 in order to continue after the previous page?

Also it could be clearer whether the skipped neurons have higher IDs or lower IDs than the passed one. For example when fetching transactions, the newest transactions are in the first page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dskloetd it doesn't have to exist, but it would need to be what's returned from the previous request. If the neuron didn't exist, it would have been in the list that was being queried.

Typically the skipped neurons will have lower IDs b/c of the BTreeMap, but that's not a guarantee of the API by design - it's only that if you call it with the same request but with 'start_from_neuron_id' that you got from the last response, it will give you another set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the implementation correctly, providing a non-existing ID will make it return an empty result (since it'll skip results until it sees an id that doesn't exist). Would it be possible that the id returned by next_start_from_neuron_id is deleted between getting that value and requesting the new page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fsodre Neurons aren't deleted, so that's not possible.

The only way for the neuron to be non-existing is if the user supplies it, but in that case, we will still see it in the list of neuron ids that are skipped b/c it was user provided. Typically the user should only use a value that was in the reply, b/c that will guarantee it exists in the list (assuming they don't change other request parameters).

Copy link
Member

@mbjorkqvist mbjorkqvist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this, @max-dfinity!

If a client requests multiple pages of list neurons, I suppose it is possible that the number of neurons matching the query may change in the meantime. Is this something that should be taken into account, or is the current "best effort" approach deemed good enough? I could imagine that if a new neuron is added during the requests, it would have a high ID and end up at the end of the list (in the last request)? I tried to look for more details of the approach, but neither this PR, nor the related JIRA ticket has any description, and the design linked in the epic also doesn't seem to cover this work.

@@ -435,6 +435,7 @@ fn handle_list_neurons(
include_neurons_readable_by_caller: true,
include_empty_neurons_readable_by_caller: None,
include_public_neurons_in_full_neurons: None,
start_from_neuron_id: None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@max-dfinity IIUC, this ListNeurons request type would need to be modified to also take a start_from_neuron_id value. Again IIUC, what is returned to the Rosetta client is an OperationOutput::ListNeuronsResponse enum variant, which contains the (in this PR) updated ic_nns_governance_api::pb::v1::ListNeuronsResponse with the next_start_from_neuron_id field. It would then be up to the user calling Rosetta to pass this value in a subsequent call to the list neurons construction endpoint. @fsodre probably has a better understanding!

@@ -3350,6 +3350,12 @@ pub struct ListNeurons {
/// existing (unmigrated) callers.
#[prost(bool, optional, tag = "4")]
pub include_public_neurons_in_full_neurons: Option<bool>,
/// If this is set, it skips all neuron IDs until this value is reached, then returns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the implementation correctly, providing a non-existing ID will make it return an empty result (since it'll skip results until it sees an id that doesn't exist). Would it be possible that the id returned by next_start_from_neuron_id is deleted between getting that value and requesting the new page?

Comment on lines +2474 to +2481
if count >= MAX_LIST_NEURONS_RESULTS {
next_start_from_neuron_id = Some(neuron_id.id);
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but current users of this endpoint currently expect all neuron ids to be returned, but here we're breaking that assumption, which can silently cause problems for the users working under that assumption.

Could we consider not limiting the number of results if start_from_neuron_id is not provided? Or introducing pagination in a new endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fsodre The reason for the pagination is that without it, some users cannot access their neurons at all b/c the endpoint runs out of instructions or the request is too big (i.e. they have many many neurons).

There is only a small handful of users who this will affect.

The question I'm hoping to get feedback about is whether or not the rosetta api should hide this detail, or pass it along.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add pagination data to Rosetta's request and sequentially fetch all pages if that's not provided.

@max-dfinity max-dfinity force-pushed the msum/list-neurons-pagination branch from 539e0b2 to 282aa55 Compare January 15, 2025 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants