-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(nns): Add pagination to list_neurons API #3358
base: master
Are you sure you want to change the base?
Conversation
a654324
to
2d22844
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this pull request affects the behavior of any canister owned by
the Governance team, remember to update the corresponding
unreleased_changes.md file(s).
To acknowldge this reminder (and unblock the PR), dismiss this
code review by going to the bottom of the pull request page, and
supply one of the following reasons:
-
Done.
-
No canister behavior changes.
@@ -435,6 +435,7 @@ fn handle_list_neurons( | |||
include_neurons_readable_by_caller: true, | |||
include_empty_neurons_readable_by_caller: None, | |||
include_public_neurons_in_full_neurons: None, | |||
start_from_neuron_id: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dfinity/finint - I am wondering if the API needs to be updated to take advantage of the paging. It's likely going to start at a limit of 500, so inactive neurons won't be returned, which means in practice no users will use the paging. However, if a user had over 500 active neurons, they would not get them all in a single request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that active neurons will always be in earlier pages than inactive neurons? How does that work with start_from_neuron_id
? It can't be that inactive neurons always have larger neuron IDs, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@max-dfinity IIUC, this ListNeurons request type would need to be modified to also take a start_from_neuron_id
value. Again IIUC, what is returned to the Rosetta client is an OperationOutput::ListNeuronsResponse enum variant, which contains the (in this PR) updated ic_nns_governance_api::pb::v1::ListNeuronsResponse
with the next_start_from_neuron_id
field. It would then be up to the user calling Rosetta to pass this value in a subsequent call to the list neurons construction endpoint. @fsodre probably has a better understanding!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mbjorkqvist
@fsodre would that be a good idea - to add the parameter to the Rosetta ListNeurons type? Alternately, we could have the Rosetta API do the subsequent retrievals. This would largely only affect users with lots of old inactive neurons. I'm not aware of any user with more than 500 active neurons, and I don't really understand the use case for the Rosetta API.
6804e43
to
4fb4103
Compare
@@ -3350,6 +3350,12 @@ pub struct ListNeurons { | |||
/// existing (unmigrated) callers. | |||
#[prost(bool, optional, tag = "4")] | |||
pub include_public_neurons_in_full_neurons: Option<bool>, | |||
/// If this is set, it skips all neuron IDs until this value is reached, then returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the neuron ID have to be an ID that exists?
Or is it allowed to pass lastNeuronIdFromPreviousPage + 1
in order to continue after the previous page?
Also it could be clearer whether the skipped neurons have higher IDs or lower IDs than the passed one. For example when fetching transactions, the newest transactions are in the first page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dskloetd it doesn't have to exist, but it would need to be what's returned from the previous request. If the neuron didn't exist, it would have been in the list that was being queried.
Typically the skipped neurons will have lower IDs b/c of the BTreeMap, but that's not a guarantee of the API by design - it's only that if you call it with the same request but with 'start_from_neuron_id' that you got from the last response, it will give you another set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the implementation correctly, providing a non-existing ID will make it return an empty result (since it'll skip results until it sees an id that doesn't exist). Would it be possible that the id returned by next_start_from_neuron_id
is deleted between getting that value and requesting the new page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fsodre Neurons aren't deleted, so that's not possible.
The only way for the neuron to be non-existing is if the user supplies it, but in that case, we will still see it in the list of neuron ids that are skipped b/c it was user provided. Typically the user should only use a value that was in the reply, b/c that will guarantee it exists in the list (assuming they don't change other request parameters).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this, @max-dfinity!
If a client requests multiple pages of list neurons, I suppose it is possible that the number of neurons matching the query may change in the meantime. Is this something that should be taken into account, or is the current "best effort" approach deemed good enough? I could imagine that if a new neuron is added during the requests, it would have a high ID and end up at the end of the list (in the last request)? I tried to look for more details of the approach, but neither this PR, nor the related JIRA ticket has any description, and the design linked in the epic also doesn't seem to cover this work.
@@ -435,6 +435,7 @@ fn handle_list_neurons( | |||
include_neurons_readable_by_caller: true, | |||
include_empty_neurons_readable_by_caller: None, | |||
include_public_neurons_in_full_neurons: None, | |||
start_from_neuron_id: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@max-dfinity IIUC, this ListNeurons request type would need to be modified to also take a start_from_neuron_id
value. Again IIUC, what is returned to the Rosetta client is an OperationOutput::ListNeuronsResponse enum variant, which contains the (in this PR) updated ic_nns_governance_api::pb::v1::ListNeuronsResponse
with the next_start_from_neuron_id
field. It would then be up to the user calling Rosetta to pass this value in a subsequent call to the list neurons construction endpoint. @fsodre probably has a better understanding!
4fb4103
to
8e8c4c1
Compare
@@ -3350,6 +3350,12 @@ pub struct ListNeurons { | |||
/// existing (unmigrated) callers. | |||
#[prost(bool, optional, tag = "4")] | |||
pub include_public_neurons_in_full_neurons: Option<bool>, | |||
/// If this is set, it skips all neuron IDs until this value is reached, then returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the implementation correctly, providing a non-existing ID will make it return an empty result (since it'll skip results until it sees an id that doesn't exist). Would it be possible that the id returned by next_start_from_neuron_id
is deleted between getting that value and requesting the new page?
if count >= MAX_LIST_NEURONS_RESULTS { | ||
next_start_from_neuron_id = Some(neuron_id.id); | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but current users of this endpoint currently expect all neuron ids to be returned, but here we're breaking that assumption, which can silently cause problems for the users working under that assumption.
Could we consider not limiting the number of results if start_from_neuron_id
is not provided? Or introducing pagination in a new endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fsodre The reason for the pagination is that without it, some users cannot access their neurons at all b/c the endpoint runs out of instructions or the request is too big (i.e. they have many many neurons).
There is only a small handful of users who this will affect.
The question I'm hoping to get feedback about is whether or not the rosetta api should hide this detail, or pass it along.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add pagination data to Rosetta's request and sequentially fetch all pages if that's not provided.
539e0b2
to
282aa55
Compare
No description provided.