Support full spork history across HCU version boundaries (PoC) #6540

Open
7 tasks
franklywatson opened this issue Oct 7, 2024 · 2 comments
Labels
Epic Preserve Stale Bot repellent Protocol Team: Issues assigned to the Protocol Pillar.

Comments

@franklywatson
Contributor

franklywatson commented Oct 7, 2024

Product relevance and impact

When Flow performs a Height Coordinated Update (HCU), the current Access Node design prevents the node from serving history from before that version, because of the risk of returning different results for older blocks. More specifically, scripts cannot be executed on blocks below the height at which the HCU took effect, since the Cadence version might have changed during the HCU. This limit on access to earlier spork history is a major issue for EVM users/builders, who expect full-fidelity access to the spork history regardless of any HCUs.

Please see the comment on product relevance and impact below for more details.

Problem definition

EVM builders and users cannot tolerate segmented or missing blockchain history, and our EVM equivalence credentials require us to hide Flow-specific version details from the EVM experience.

How will we measure success

  • Access Nodes can serve script execution on blocks as far back as the spork start, while ensuring that script execution results are consistent with the Cadence version at that block height.
  • EVM GW history support works as builders expect

Breakdown

This work can be split into two chunks with very limited overlap.

KR1: AN supports script execution across breaking HCU version boundaries

Here, we would use the "old version beacon" (currently still used for mainnet) or some simple stopgap solution in case KR2 is not ready yet.
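To make the KR1 goal concrete, here is a minimal sketch of how a node could map a block height to the execution-stack version in effect at that height. It assumes the version boundaries are available as a list sorted by height. The `VersionBoundary` type, the `VersionAt` function, and the heights and version strings are all hypothetical illustrations, not the actual version beacon API.

```go
package main

import (
	"fmt"
	"sort"
)

// VersionBoundary is a hypothetical record: from StartHeight onwards,
// blocks are executed with the given execution-stack version.
type VersionBoundary struct {
	StartHeight uint64
	Version     string
}

// VersionAt returns the version in effect at the given block height.
// boundaries must be sorted by ascending StartHeight, and the first entry
// is assumed to describe the spork root.
func VersionAt(boundaries []VersionBoundary, height uint64) (string, error) {
	// Find the index of the first boundary that starts strictly above height.
	i := sort.Search(len(boundaries), func(i int) bool {
		return boundaries[i].StartHeight > height
	})
	if i == 0 {
		return "", fmt.Errorf("no version known for height %d", height)
	}
	// The boundary just before that index is the one covering height.
	return boundaries[i-1].Version, nil
}

func main() {
	// Made-up heights and versions, for illustration only.
	boundaries := []VersionBoundary{
		{StartHeight: 1000, Version: "v1.0.0"},
		{StartHeight: 5000, Version: "v1.1.0"},
	}
	for _, h := range []uint64{1000, 4999, 5000} {
		v, err := VersionAt(boundaries, h)
		fmt.Println(h, v, err)
	}
}
```

A script request for a given block would first resolve the version this way and only then pick the matching script execution environment.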

DACI

❓DACI tbd ❓ ... probably mostly Peter and Krok as D and C. As Informed, we should have Dete, Vishal and Bastian.

Driver | Approver | Consulted | Informed
KROK team | @peterargue | Flow Team | Dete, Vishal, Bastian

KR2: Evolve Versioning of Execution Stack to use dynamic protocol state

The current mechanism to signal a change in the node software to all nodes has several limitations. There is a plan to improve and update this mechanism detailed in Flip 298. Access nodes can use this new mechanism to more robustly serve full spork history across HCU version boundaries and achieve the objective of this OKR.

This is hopefully a lighter-weight engineering task (👉 issue #6788), though it requires broader alignment across the protocol team: this work would be one puzzle piece of the much broader vision in Flip 298, and we need to make sure the refactored EN version beacon fits into the longer-term vision outlined in the Flip.

DACI

❓we might need to update the Drivers on this: I think Janez might be one of the main technical drivers of this work on the EN side (and Jordan on the Protocol side) -- we could potentially remove Yurii❓

D: Jordan, Yurii, SCE
A: Alex, Peter
C: Dete
I: Vishal

Driver | Approver | Consulted | Informed
Jordan, Yurii, SCE | Alex, Peter | Dete | Vishal

Task breakdown

  1. Preserve
  2. Preserve
  3. Preserve Stale
  4. janezpodhostnik
franklywatson added the labels Preserve Stale Bot repellent Epic on Oct 7, 2024
franklywatson changed the title from "[Epic] Support full spork history across across HCU version boundaries" to "Support full spork history across across HCU version boundaries" on Oct 10, 2024
franklywatson changed the title from "Support full spork history across across HCU version boundaries" to "Support full spork history across across HCU version boundaries (PoC)" on Oct 10, 2024
franklywatson added the label Protocol Team: Issues assigned to the Protocol Pillar. on Dec 19, 2024
AlexHentschel changed the title from "Support full spork history across across HCU version boundaries (PoC)" to "Support full spork history across HCU version boundaries (PoC)" on Dec 19, 2024
@AlexHentschel
Member

Context on product relevance and impact

status quo

  • At the moment, we plan major network upgrades (aka sporks) to be at least 12 months apart.

  • Security fixes require updating the transaction execution environment in between (typically via HCUs).

  • Optimistically, we could assume that all updates of the transaction execution environment via HCUs are fully downwards compatible. Nevertheless, there are a number of problems and potentially high costs associated with this assumption:

    • Many security fixes must introduce a breaking behaviour change in the area that was previously vulnerable. Only when a fix addresses a security edge case that has not yet been exploited does it leave previous results unchanged.
    • In some cases, it might be dramatically simpler to implement a security fix that has a marginally broader change surface than what is minimally required to mitigate some attack. If we limit ourselves to full/maximal downwards compatibility, we might incur significantly larger engineering cost, even though from a product perspective the broader change might have been just fine (e.g. only a few close partners would be affected by the broader change, and they are happy to update their code minimally).
    • Some security fixes might necessitate a broader behaviour change.

    So for a limited time, it should be fine to assume that every HCU can be made fully downwards compatible. However, in the long term, this assumption poses a notable strategic risk (for our ability to quickly fix attack vectors as well as resource commitments).

  • Even if we manage to make all HCUs fully downwards compatible, we will likely still have major behaviour changes at spork boundaries for the foreseeable future.

    • We envision Archive Nodes to provide script execution service across spork boundaries.

Access Nodes providing script execution across breaking HCUs is a technical precursor to Archive Nodes spanning spork boundaries.

Benefits of ANs supporting script execution across breaking HCUs

  • reduction of major strategic complexity and resourcing risks
  • significantly more options for us to evolve Cadence, including breaking changes where beneficial for the overall platform

Tradeoffs:

  • Containerization provides a means to run different software versions in parallel, should this be desired. This shifts the complexity away from the protocol implementation to the operational layer, where sophisticated, widely adopted, off-the-shelf solutions already exist.
  • For scalability of script execution, it is highly beneficial for Access Nodes to have a single local replica of the execution state that can be accessed by multiple script execution micro-services within that Access Node. We intend to update ANs to this design for scalability anyway. Having different containers run different versions of the script execution environment would be only a marginal generalization.
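As a rough illustration of the design described in the list above, the sketch below routes a script request to an executor pinned to the version in effect at the requested height, while all executors read from one shared state replica. Every name here (`StateReader`, `ScriptExecutor`, `Router`, the toy `mapState` and `echoExecutor`, and the version strings) is a hypothetical stand-in. In a real deployment each executor would presumably be a separate containerized service rather than an in-process value.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoExecutor signals that no executor is registered for a version.
var ErrNoExecutor = errors.New("no executor for version")

// StateReader is a hypothetical read-only view of the node's single
// local execution-state replica.
type StateReader interface {
	Get(height uint64, key string) (string, bool)
}

// ScriptExecutor stands in for one script-execution service that is
// pinned to a single version of the execution environment.
type ScriptExecutor interface {
	Execute(state StateReader, height uint64, script string) (string, error)
}

// Router picks the executor matching the version in effect at a height.
type Router struct {
	State     StateReader
	VersionAt func(height uint64) (string, error)
	Executors map[string]ScriptExecutor
}

// Execute resolves the version for height and delegates to its executor.
func (r *Router) Execute(height uint64, script string) (string, error) {
	version, err := r.VersionAt(height)
	if err != nil {
		return "", fmt.Errorf("resolving version at height %d: %w", height, err)
	}
	exec, ok := r.Executors[version]
	if !ok {
		return "", fmt.Errorf("%w %q", ErrNoExecutor, version)
	}
	return exec.Execute(r.State, height, script)
}

// mapState is a toy state replica keyed by height, then by register name.
type mapState map[uint64]map[string]string

func (s mapState) Get(height uint64, key string) (string, bool) {
	v, ok := s[height][key]
	return v, ok
}

// echoExecutor is a toy executor: it treats the script as a register name
// and tags the value it reads with its own version.
type echoExecutor struct{ version string }

func (e echoExecutor) Execute(state StateReader, height uint64, script string) (string, error) {
	value, ok := state.Get(height, script)
	if !ok {
		return "", fmt.Errorf("register %q not found at height %d", script, height)
	}
	return e.version + ":" + value, nil
}

// newDemoRouter wires two executors to one shared state, with a made-up
// version boundary at height 150.
func newDemoRouter() *Router {
	return &Router{
		State: mapState{
			100: {"balance": "7"},
			200: {"balance": "9"},
		},
		VersionAt: func(height uint64) (string, error) {
			if height < 150 {
				return "v1", nil
			}
			return "v2", nil
		},
		Executors: map[string]ScriptExecutor{
			"v1": echoExecutor{version: "v1"},
			"v2": echoExecutor{version: "v2"},
		},
	}
}

func main() {
	router := newDemoRouter()
	for _, h := range []uint64{100, 200} {
		out, err := router.Execute(h, "balance")
		fmt.Println(h, out, err)
	}
}
```

Keeping the state behind one read-only interface is what makes adding another version cheap in this sketch: a new executor is just one more entry in the map.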

Hence, we concluded that supporting full history across HCU version boundaries is low-hanging fruit in the context of our current development roadmap.

@AlexHentschel
Member

Resourcing conditions are good at the moment for this particular work: it is relatively well-understood and straightforward engineering work, and the majority of it can be done by the Krok team, which has some capacity available.
