Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: Render clear the distinction of a Proposal for v and the full value v #365

Open
Tracked by #367
cason opened this issue Sep 5, 2024 · 8 comments
Open
Tracked by #367
Labels
spec Related to specifications tech-debt Not high priority, to eventually be addressed.

Comments

@cason
Copy link
Contributor

cason commented Sep 5, 2024

The arXiv paper considers that values are included in Proposal messages, which have the format ⟨PROPOSAL, hp, r, v, vr⟩, while vote messages (PREVOTE and PRECOMMIT) carry only id(v), which is the ID or short representation of v.

In practical implementations, like in CometBFT, a Proposal message carries id(v) while the full value v is transmitted in a different way (e.g., split into multiple block parts). So Proposal messages are fixed-size, as vote messages.

Also in the arXiv paper, we require in lines 26 (prevote to precommit step transition) and 49 (precommit step to decision), the reception of ⟨PROPOSAL, hp, r, v, ∗⟩ from proposer(hp, r). For all effects, however, we only need to known v, which is the full value referenced by the 2f + 1 votes for id(v). We don't need to check if v was proposed by proposer(hp, r) because this check is part of the propose to prevote step transition. We cannot prevote a value v if we don't receive ⟨PROPOSAL, hp, r, v, ∗⟩ from proposer(hp, r), but we can issue a precommit for and decide v without checking this assertion.

Moreover, notice that in the Quint model for consensus there is no distinction between v and id(v).


An elegant way to address this confusion between v and id(v) is to assume, as we do in the Quint model, that they are the same, i.e., id(v) == v. This means that consensus actual orders and decides on value IDs.

But in this case, how we check that base on the value ID received in consensus we have the full associated value? We can, using the nomenclature of the paper, use the valid(v) function to perform this check. Notice that valid(v) is a requirement for all state machine transitions that lead to progress, lines 22, 28, 36, and 49.

The valid(v) primitive should therefore verify whether we know the full value proposed by the Proposal message, and whether once receiving a 2f + 1 quorum of prevotes or precommits for id(v) == v we can precommit or decide v.

Proposal

  1. Remove the ⟨PROPOSAL, hp, r, v, ∗⟩ from proposer(hp, r) requirement for actions in lines 36 and 49 of the algorithm.
  2. Make sure that the valid(v) implementation checks for the full value referenced in the paper as v or id(v)
  3. Update the model, and possibly the implementation, accordingly.
@cason cason added the spec Related to specifications label Sep 5, 2024
@cason cason changed the title spec: render clear the distinction of a Proposal for v and the actual (full) value v spec: render clear the distinction of a Proposal for v and the full value v Sep 5, 2024
@josef-widder
Copy link
Member

Are you proposing an optimization to the following scenario?

  • I receive the proposal value (block) from the proposer of round 0 but no-one decides
  • In round 1, the new proposer reproposes the same value
  • Then in line 36 and 49, I actually don't need the proposal (block) from the round proposer 1, and e.g., I could decide in line 49 based on the commits.

I guess it makes sense to address this scenario, as it will reduce the traffic in rounds greater than 0. I wonder whether we need to change the consensus algorithm for that.

Did you think about the following option:

  • we define upon ⟨PROPOSAL, h, r, v, ∗⟩ to mean that
    • I have received all the block parts
    • I have received ⟨PROPOSAL, h, r, id(v), ∗⟩, that is, the small message.

The idea would be that in round 1, we use a fresh small message from the proposer of round 1, and we re-use the block parts of round 0. If the received id(v) matches the hash of the block parts that might work.

Data-flow-wise, the idea would be that if I receive the small message, I ask the block propagation module whether it has a complete block that matches the id(v), if yes I can issue the apropriate consenusInput.

@cason
Copy link
Contributor Author

cason commented Sep 25, 2024

Yes, I agree with your proposal for lines 22 and 28.

For 36 and 49 we do not need the "small" Proposal message, only the full value voted for by 2f + 1 processes.

@romac
Copy link
Member

romac commented Sep 25, 2024

If I understand correctly, in the algorithm we only need v to compute valid(v), right?

I am asking because in the code, we compute valid(v) outside of the state machine (it's the responsibility of the Host to do so).

So if only valid(v) and id(v) are needed in the state machine, then we could keep the full value v outside of the driver and only pass valid(v) and id(v) to the driver/state machine. Does that make sense?

If so, then we could potentially remove the Value type definition (which corresponds to v) from the Context trait and change core consensus to only ever deal with ValueId (which corresponds to id(v) which would simplify the code a little bit.

@cason
Copy link
Contributor Author

cason commented Sep 25, 2024

That is my original proposal, essentially. Also, valid(id(v)) should also be responsible for making sure that v was received and properly stored.

@josef-widder
Copy link
Member

Also, valid(id(v)) should also be responsible for making sure that v was received and properly stored.

We have to be a bit careful here. Because validity is checked outside the consensus statemachine. Consider the case in Line 28, when I don't have the full value, then this shouldn't mean that valid(id(v)) evaluates to false, because then we would prevote nil in line 32 (ProposalAndPolkaAndInvalidCInput). We shouldn't do that. In this case the algorithm requires us to neither have ProposalAndPolkaAndInvalidCInput nor ProposalAndPolkaPreviousAndValidCInput, at this point.

We should do ProposalAndPolkaAndInvalidCInput only when the value is bad, but not when the value is missing.

@cason
Copy link
Contributor Author

cason commented Sep 26, 2024

This is true. We would need to somehow adapt the block from lines 22 and 28 to account for that.

@cason
Copy link
Contributor Author

cason commented Sep 26, 2024

So, in general we can have either:

  • id(v) is known but v is missing (the new concept of valid(v) which is not an invariant, but a state check)
  • v is known but it is invalid (from the application point of view, namely, the original concept of valid(v))

that must be distinguished.

@josef-widder is probably convincing me that his approach is better or maybe we can find a compromise.

@ancazamfir ancazamfir mentioned this issue Oct 3, 2024
6 tasks
@cason cason added the tech-debt Not high priority, to eventually be addressed. label Nov 22, 2024
@cason
Copy link
Contributor Author

cason commented Nov 25, 2024

Reporting here what @matan-starkware has written to me:

Also, in Starknet the current plan is that the ID is produced by the valid(v) function. So we wouldn't actually see the conflict as the invalid proposal wouldn't actually be identified.

The broader context was the discussion regarding the implementation of the valid(v) primitive, discussed in #510.

@romac romac changed the title spec: render clear the distinction of a Proposal for v and the full value v spec: Render clear the distinction of a Proposal for v and the full value v Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec Related to specifications tech-debt Not high priority, to eventually be addressed.
Projects
None yet
Development

No branches or pull requests

3 participants