-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Head tracking batch #417
base: main
Are you sure you want to change the base?
Head tracking batch #417
Conversation
@dipkakwani Maybe I should have been more explicit about it. This is a sketch of the idea. I wanted to gather some input while running the benchmarks with the current setup. |
{ | ||
// the 0th page will have the properly number set to first free page | ||
_roots[0].Data.NextFreePage = DbAddress.Page(_historyDepth); | ||
// The start root must have the properly number set to first free page |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: requires grammar fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you suggest a nice way of stating this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly sure, but do you mean to state:
"The start root's NextFreePage must point to the very first free page"?
@@ -115,13 +112,14 @@ private PagedDb(IPageManager manager, byte historyDepth) | |||
_lastWriteTxBatch = _meter.CreateAtomicObservableGauge($"Last written {BatchIdName}", BatchIdName, | |||
"The last "); | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue in the existing code (one line above): _lastWriteTxBatch description is incomplete.
_pageTable.Remove(at); | ||
_pageTableReversed.Remove(actual); | ||
|
||
ref var cached = ref GetPageTableCacheSlot(at); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this cache be checked before the _pageTable?
If it's present in cache, we can be sure it exists in _pageTable and directly remove it from both the places? If it's not available, we will have to read the _pageTable to find out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it easier to reason about as it's structured now, because in the current take, we treat _pageTable
as a source of truth and a cache only as an optimization. The time spent in this part is small when comparing to, for example, the application time. Also, most likely all the pages that are tried to be found will be found, so not sure if worth to do it the other way. Maybe we could optimize and .Remove
always and try to fix when removed incorrectly? Still, the measured overhead was terribly small.
private readonly ReaderWriterLockSlim _readerLock = new(); | ||
|
||
// Proposed batches that are finalized | ||
private readonly HashSet<Keccak> _beingFinalized = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this new approach, we are storing a lot of data in-memory and this is not fetched from buffer pool/existing set of DB pages. Couple of general questions:
- How big can this grow? What does benchmarks show?
- What possible worst-case scenario can arise in case of a crash, since we would lose everything in-memory which had not been checkpointed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We copy/store only pages that are written. This means roughly ~2000 pages per block which gives 8MB per block. This, multiplied the estimated number of not finalized blocks set to ~64 gives us ~500MB of memory. From what I recall when running a node with it, it was much lower though.
- We lose what was not finalized. The worst scenario is that once the node is restarted it gets the last remembered hash and processes what is missing. This is not different from the current one.
} | ||
} | ||
|
||
private class MultiHeadChain : IMultiHeadChain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add some descriptive comment here to summarize this class/approach?
In general, some of the classes/interfaces in this file have their description missing.
This PR introduces a notion of a
IMultiHeadChain
. The multi head approach enables creating multiple heads of the blockchain that track the recent state properly. If no reorganization happens, the sameIHead
is used over and over for a prolonged period of time while ensuring that all the reads are served in a speed of reading the raw underlying database. This is done by applying COW not only on the underlying database level but also in the head with a help of aPageTable
. With this we have:ancestors
' notionThe reusability is addressed by reusing the head tracking branch through multiple blocks. Every time a block is committed, it's state is applied to internal state of
HeadTrackingBatch
in a specific manner, allowing tracking of blocks.HeadTrackingBatch
has no notion of ancestors as it keeps all the modified pages in a simple lookup, that allows to perceive a squashed view of the world.No
BitFilter
is requires as data are always queried like there would be from the database. The only difference is that some pages are represented as modified in-memory copies. These copies eventually are applied to the disk, but for sake of lookups, are treated like there were applied already.IHead
The idea is implemented by introduction of one more level of abstraction over a batch. Before this PR, only a single
PageDb.Batch
could exist. This PR introduces a new type of the batch, calledIMultiHeadChain
. From the chain you can getIHead
, that accumulates changes in memory, meaning, that at a given batch number, multiple propositions can exist before they are committed/discarded. To make it work, the head introduces two dictionaries that allow resolvingPage
<->DbAddress
mapping in both sides. There are used to track pages that were already overwritten. It's possible that a given page will be overwritten multiple times though, so that the mappings keep only the last version. The mapping using dictionary seems to be costly (it was shown in the profiling). To address this, a dummy array-based lookup is provided in front of it.ProposedBatch
A simple construct of a
ProposedBatch
is used to track:(DbAddress,Page)
tuplesRootPage
These proposed blocks are created every time
HeadTrackingBatch
is committed, by selecting pages that were updated during the given batch. Once a finality is reached, proposed batches that have block numbers lower than the finalized one, are applied on the database. The application should be fast and almost instantaneous as it copies payloads to proper pages, updates theRootPage
and decreases the counter on theProposedBatch
(counter based managed similar to blockchain).Readers
Readers manage the read access. They are created whenever a block is committed/finalized and are just retrieved by the client code. No readers should be created in the hot paths!
Benchmarks
Some areas of interest
BeforeCommit
that represents MerkleizationCommitImpl
that applies the changes to the in-memory representation. Potentially can be offloadedCommit
that registers the proposal