-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mvccadapter: check if the last TID changed without invalidation #307
Conversation
This is a follow-up of #291. I submit this PR because it solves an issue that has to be fixed quickly. However, this is extra code that's useless most of the time so I prefer we change the interface, which may require a major version bump. |
a9ecb96
to
8d20f20
Compare
I'll merge on Friday. |
Can we get CI green, please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand what's going on here and I think this looks OK, at least in isolation, once tests go green.
It needs a good change note (I think there's already something about #290 in there, so maybe just an updated change note mentioning this PR?).
Oops. Now fixed, with: --- a/src/ZODB/tests/testmvcc.py
+++ b/src/ZODB/tests/testmvcc.py
@@ -87,10 +87,10 @@
>>> cn._storage._start == p64(u64(st.lastTransaction()) + 1)
True
->>> cn.db()._mvcc_storage.invalidate(100, dict.fromkeys([1, 2]))
+>>> cn.db()._mvcc_storage.invalidate(p64(100), dict.fromkeys([1, 2]))
>>> cn._storage._start == p64(u64(st.lastTransaction()) + 1)
True
->>> cn.db()._mvcc_storage.invalidate(200, dict.fromkeys([1, 2]))
+>>> cn.db()._mvcc_storage.invalidate(p64(200), dict.fromkeys([1, 2]))
>>> cn._storage._start == p64(u64(st.lastTransaction()) + 1)
True
I also needed #310 to be merged. |
Awesome! |
# so we still have to poll. | ||
ltid = self._storage.lastTransaction() | ||
# But at this precise moment, a transaction may be committed and | ||
# we have already received the new tid, along with invalidations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmuchemb, @jamadden isn't the above true only if storage, upon receiving invalidation messages from server, first notifies DB.invalidate()
and only after updates zstor.last_tid
? For the reference this is indeed true for ZEO4:
but not true for NEO:
https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124
If zstor.last_tid
is updated first before zstor
sends invalidations to DB (to MVCCAdapter here), what could happen is:
- poll_invalidations calls
zstor.lastTransaction
-> gets just committed tid; - poll invalidations proceeds and could retrieve
._invalidations
beforezstor
could actually update them with invalidation data corresponding totid
; - tid, and incomplete set of objects to invalidate is returned.
I.e. it will be a data corruption, because objects might remain stale in live cache with similar data corruption scenario as demonstrated in #290.
P.S. I could not quickly see which way ZEO5 implements. Maybe it is something different, or actually related to zopefoundation/ZEO#155.
P.P.S RelStorage is unaffected as it implements IMVCCStorage natively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's indeed a problem between this MR & NEO :(
For ZEO4, the 2 operations are wrapped around the same lock so the order does not matter.
For ZEO5, it's first DB.invalidate()
then lastTransaction() updated.
So I have to answer 2 questions:
- Is ZEO5 behaviour enough ?
- If it is, can I do the same in NEO ?
zopefoundation/ZEO#155 is unrelated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmuchemb, thanks for feedback.
Yes, I believe if we change DB.invalidate()
call to always happen before zstor.last_tid
update that would be enough. If we go this way this change also requires corresponding change in IStorage interface definiton. Something like below:
--- a/src/ZODB/interfaces.py
+++ b/src/ZODB/interfaces.py
@@ -685,6 +685,15 @@ def isReadOnly():
def lastTransaction():
"""Return the id of the last committed transaction.
+ Returned tid is ID of last committed transaction as observed from
+ some time _before_ lastTransaction call returns. In particular for
+ client-sever case, lastTransaction can return cached view of storage
+ that was learned some time ago.
+
+ It is guaranteed that for all IStorageWrappers, that wrap the storage,
+ invalidation notifications have been completed for transactions
+ with ID ≤ returned tid.
+
If no transactions have been committed, return a string of 8
null (0) characters.
"""
EDIT: corrected to "... some time before lastTransaction call was madereturns." due to @jmuchemb notice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I found no way to cause any bug if lastTransaction() is updated just after invalidations are notified. I feared issues because NEO also uses lastTransaction() internally: note however it must do it before releasing its cache lock, otherwise "load current" fails on ZODB<5.
Updating lastTransaction() at the right moment also matters for "internal" invalidations: I mean that it must be done just after calling the tpc_finish callback.
https://lab.nexedi.com/nexedi/neoppod/commit/5870c5d79f68dce00267f3ae9c4e12d56d19dede should make NEO work with this MR. It includes tests for the 4 cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmuchemb, thanks for feedback and adapting things on NEO side. I see about the need to update app.last_tid
while still under app._cache_lock
- as app._cache
and app.last_tid
need to be kept coherent on NEO side internally.
Updating lastTransaction() at the right moment also matters for "internal" invalidations: I mean that it must be done just after calling the tpc_finish callback.
Yes, thanks for pointing this out. There app.last_tid
has to be also set while still under app._cache_lock
and after call to app._cache.invalidate()
. As you say this matters for "load current" on ZODB<5. I thus suggest the following amendment for ZODB4 version of proposed IStorage.lastTransaction interface change (ZODB4 specifics in bold):
def lastTransaction():
"""Return the id of the last committed transaction.Returned tid is ID of last committed transaction as observed from
some time before lastTransaction call returns. In particular for
client-sever case, lastTransaction can return cached view of storage
that was learned some time ago.It is guaranteed that for all IStorageWrappers, that wrap the storage,
invalidation notifications have been completed for transactions
with ID ≤ returned tid.It is guaranteed that after lastTransaction returns, "current" view of
the storage as observed by load() is ≥ returned tid.If no transactions have been committed, return a string of 8
null (0) characters.
I believe it is very important we don't just fixup NEO and merge this patch without fixing IStorage interface. There are application-level cases that rely on exactly the same guarantees that MVCCAdapter needs here and got wrong in the first version of this patch. Also it has to be clear for all other potential storages and their authors - just from reading the interfaces - which semantic must be provided.
I created the following pull-requests to adjust the IStorage interface:
/cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gidzit, @jimfulton
EDIT: corrected to "... some time before lastTransaction call was madereturns." due to @jmuchemb notice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it is very important we don't just fixup NEO and merge this patch without fixing IStorage interface.
Yes. First, I wanted to be sure this MR is correct.
as observed from some time before lastTransaction call was made.
Strictly speaking, it's wrong because the value that lastTransaction reads could be updated between the call to lastTransaction start and the instruction that actually reads the value in question.
I suggest the following change:
--- a/src/ZODB/interfaces.py
+++ b/src/ZODB/interfaces.py
@@ -685,6 +685,14 @@ def isReadOnly():
def lastTransaction():
"""Return the id of the last committed transaction.
+ For proper MVCC operation, the return value is the id of the last
+ transaction for which invalidation notifications are completed.
+
+ In particular for client-server implementations, lastTransaction
+ should return a cached value (rather than querying the server).
+ A preliminary call to sync() can be done to get the actual last
+ TID at the wanted time.
+
If no transactions have been committed, return a string of 8
null (0) characters.
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmuchemb, thanks for feedback.
Strictly speaking, it's wrong ...
You are right, thanks for noticing it. It was a thinko on my side that I copied from ZODB/go. The correction is to require "as observed from some time before lastTransaction call was madereturns." I've amended ZODB/go and my two pull requests accordingly.
I suggest the following change:
I prefer my version, but am ok if ZODB/interfaces.py
is updated with your text, or any other text that clearly documents required semantic.
Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/5870c5d7 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
… ... ... and to provide correct "current" view for load(). This is ZODB4 backport of commit 9a1d29f (zopefoundation#313) It has been amended with ZODB4-specific It is guaranteed that after lastTransaction returns, "current" view of the storage as observed by load() is ≥ returned tid. because on ZODB<5 - contrary to ZODB5 - there is "load current" for which correct semantic must be also provided: zopefoundation#307 (comment) Original description follows: ---- 8< ---- Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/5870c5d7 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
Since commit b5895a5 ("mvccadapter: fix race with invalidations when starting a new transaction"), a ZEO test fails as follows: File "src/ZEO/tests/drop_cache_rather_than_verify.txt", line 114, in drop_cache_rather_than_verify.txt Failed example: conn.root()[1].x Expected: 6 Got: 1 Earlier in the test, the ZEO server is restarted and then another client commits. When disconnected, the first client does not receive invalidations anymore and the connection gets stuck in the past until there's a new commit after it reconnected. It was possible to make the test pass with the following patch: --- a/src/ZEO/ClientStorage.py +++ b/src/ZEO/ClientStorage.py @@ -357,6 +357,7 @@ def notify_connected(self, conn, info): # invalidate our db cache if self._db is not None: + self._db.invalidate(self.lastTransaction(), ()) self._db.invalidateCache() logger.info("%s %s to storage: %s", Other implementations like NEO are probably affected the same way. Rather than changing interfaces in a backward-incompatible way, this commit revert to the original behaviour, and all the changes that were done in existing tests are reverted. However, the interfaces are clarified about the fact that storage implementations must update at a precise moment the value that is returned by lastTransaction(): just after invalidate() or tpc_finish callback.
Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/5870c5d7 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
… ... ... and to provide correct "current" view for load(). This is ZODB4 backport of commit b32d52f (zopefoundation#313) It has been amended with ZODB4-specific It is guaranteed that after lastTransaction returns, "current" view of the storage as observed by load() is ≥ returned tid. because on ZODB<5 - contrary to ZODB5 - there is "load current" for which correct semantic must be also provided: zopefoundation#307 (comment) Original description follows: ---- 8< ---- Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/5870c5d7 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
…eturning ... ... and to provide correct "current" view for load(). This is ZODB4 backport of zopefoundation#313, which itself is just a more explicit language of zopefoundation@4a6b0283#diff-881ceb274f9e538d4144950eefce8682R685-R695 It has been amended with ZODB4-specific It is guaranteed that after lastTransaction returns, "current" view of the storage as observed by load() is ≥ returned tid. because on ZODB<5 - contrary to ZODB5 - there is "load current" for which correct semantic must be also provided: zopefoundation#307 (comment) Original description follows: ---- 8< ---- Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec#note_112238 https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec#note_113331 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
…eturning ... ... and to provide correct "current" view for load(). This is ZODB4 backport of zopefoundation#313, which itself is just a more explicit language of zopefoundation@4a6b0283#diff-881ceb274f9e538d4144950eefce8682R685-R695 It has been amended with ZODB4-specific It is guaranteed that after lastTransaction returns, "current" view of the storage as observed by load() is ≥ returned tid. because on ZODB<5 - contrary to ZODB5 - there is "load current" for which correct semantic must be also provided: zopefoundation#307 (comment) Original description follows: ---- 8< ---- Because if lastTransaction() returns tid, for which local database handle has not yet been updated with invalidations, it could lead to data corruption due to concurrency issues similar to zopefoundation#290: - DB refreshes a Connection for new transaction; - zstor.lastTransaction() is called to obtain database view for this connection. - objects in live-cache for this Connection are invalidated with invalidations that were queued through DB.invalidate() calls from storage. - if lastTransaction does not guarantee that all DB invalidations for transactions with ID ≤ returned tid have been completed, it can be that: incomplete set of objects are invalidated in live cache i.e. data corruption. This particular data corruption has been hit when working on core of ZODB and was not immediately noticed: zopefoundation#307 (review) this fact justifies the importance of explicitly stating what IStorage guarantees are / must be in the interface. This guarantee - already holds for FileStorage (no database mutations from outside of single process); - is already true for ZEO4 and ZEO5 zopefoundation#307 (review) zopefoundation#307 (comment) - holds for RelStorage because it implements IMVCCStorage natively; - is *not* currently true for NEO because NEO sets zstor.last_tid before calling DB.invalidate: https://lab.nexedi.com/nexedi/neoppod/blob/fc58c089/neo/client/handlers/master.py#L109-124 However NEO is willing to change and already prepared the fix to provide this guarantee because of data corruption scenario that can happen without it: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec#note_112238 https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec#note_113331 In other words all storages that, to my knowledge, are in current use are either already providing specified semantic, or will be shortly fixed to provide it. This way we can fix up the interface and make the semantic clear. /cc @jamadden, @vpelletier, @arnaud-fontaine, @jwolf083, @klawlf82, @gitzit, @jimfulton
This backports to ZODB4 Connection ZODB5's approach to handle MVCC via always calling storage.loadBefore() instead of "load for latest version + loadBefore if we were notified of database changes" approach. Why? ---- Short answer: because Wendelin.core 2 needs to know at which particular database state application-level ZODB connection is viewing the database, and it is hard to implement such functionality correctly without this backport. Please see appendix for the explanation. What ---- This backports to ZODB4 the minimum necessary part of upstream commit 227953b (Simplify MVCC by determining transaction start time using lastTransaction) + follow-up correctness fixes: zopefoundation#50 zopefoundation#56 zopefoundation#291 zopefoundation#307 In short: - a Connection is always opened with explicitly corresponding to a particular database revision - Connection uses only loadBefore with that revision to load objects - every time a Connection is (re)opened, the result of queued invalidations and explicit query to storage.lastTransaction is carefully merged to refresh Connection's idea about which database state it corresponds to. The "careful" in last point is important. Historically ZODB5 was first reworked in commit 227953b (zopefoundation#56) to always call lastTransaction to refresh state of Connection view. Since there was no proper synchronisation with respect to process of handling invalidations, that lead to data corruption issue due to race in Connection.open() vs invalidations: zopefoundation#290 That race and data corruption was fixed in commit b5895a5 (zopefoundation#291) by way of avoiding lastTransaction call and relying only on invalidations channel when refreshing Connection view. This fix in turn led to another data corruption issue because in presence of client-server reconnections, ZODB storage drivers can partly skip notifying client with detailed invalidation messages: zopefoundation#291 (comment) A fix to that issue (zopefoundation#307) proposed to change back to query storage for lastTransaction on every Connection refresh, but to implement careful merging of lastTransaction result and data from invalidation channel. However it was found that the "careful merging" can work correctly only if we require from storage drivers a particular ordering of invalidation events wrt lastTransaction return and result: zopefoundation#307 (comment) While ZEO was already complying with that requirements, NEO had to be fixed to support that: zopefoundation#307 (comment) https://lab.nexedi.com/nexedi/neoppod/commit/a7d101ec https://lab.nexedi.com/nexedi/neoppod/commit/96a5c01f Patch details ------------- We change Connection._txn_time to be a "before" for the database state to which Connection view corresponds. This state is hooked to be initialized and updated in Connection._flush_invalidations - the function that is called from both explicit Connection (re)open and at transaction boundaries via Connection.afterCompletion hook. Objects loading is factored into Connection._load which replaces old "load + check invalidated + fallback to loadBefore" game in Connection._setstate. Connection.open now calls Connection._flush_invalidations unconditionally - even if it was global cache reset event - because besides invalidation flushes the latter is now responsible for querying storage lastTransaction. TmpStore - a "storage" that provides runtime support for savepoints - is refactored correspondingly to delegate loading of original objects back to underlying Connection. DB.close is modified - similarly to ZODB5 - to release DB's Connections carefully with preventing connections from DB poll from implicitly starting new transactions via afterCompletion hook. ZODB.nxd_patches is introduced to indicate to client software that this particular patch is present and can be relied upon. Tests are updated correspondingly. In 227953b Jim talks about converting many tests - because "Lots of tests didn't clean up databases and connections properly" and because new MVCC approach "makes database and connection hygiene a bit more important, especially for tests, because a connection will continue to interact with storages if it isn't properly closed, which can lead to errors if the storage is closed." but finally implementing automatic cleanup at transaction boundaries because there are too many tests to fix. We backport only automatic cleanup + necessary explicit test fixes to keep the diff minimal. All tests pass. This includes tests for ZODB itself, ZEO and NEO test over hereby modified ZODB(*), my test programs from zopefoundation#290 and zopefoundation/ZEO#155 and ERP5 tests. Upcoming wendelin.core 2 also work with this change. (*) ZEO, NEO and ERP5 tests fail sometimes, but there is no regression here because ZEO, NEO and ERP5 tests are failing regularly, and in the same way, even with unmodified ZODB. Appendix. zconn_at ------------------ This appendix provides motivation for the backport: For wendelin.core v2 we need a way to know at which particular database state application-level ZODB connection is viewing the database. Knowing that state, WCFS client library interacts with WCFS filesystem server and, in simple terms, requests the server to provide data as of that particular database state. Let us call the function that for a client ZODB connection returns database state corresponding to its database view zconn_at. Now here is the problem: zconn_at is relatively easy to implement for ZODB5 - see e.g. here: https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-54-ga6a8f5b/lib/zodb.py#L142-181 https://lab.nexedi.com/nexedi/wendelin.core/commit/3bd82127 however, for ZODB4, since its operational models is not directly MVCC, it is not that straightforward. Still, even for older ZODB4, for every client connection, there _is_ such at that corresponds to that connection view of the database. We need ZODB4 support, because ZODB4 is currently the version that Nexedi uses, and my understanding is that it will stay like this for not a small time. I have the feeling that ZODB5 was reworked in better direction, but without caring enough about quality which resulted in concurrency bugs with data corruption effects like zopefoundation#290 zopefoundation/ZEO#155 etc. Even though the first one is now fixed (but it broke other parts and so both ZODB had to be fixed again _and_ NEO had to be fixed for that ZODB fix to work currently), I feel that upgrading into ZODB5 for Nexedi will require some non-negligible amount of QA work, and thus it is better if we move step-by-step - even if we eventually upgrade to ZODB5 - it is better we first migrate wendelin.core 1 -> wendelin.core 2 with keeping current version of ZODB. Now please note what would happen if zconn_at gives, even a bit, wrong answer: wcfs client will ask wcfs server to provide array data as of different database state compared to current on-client ZODB connection. This will result in that data accessed via ZBigArray will _not_ correspond to all other data accessed via regular ZODB mechanism. It is, in other words, a data corruptions. In some scenarios it can be harmless, but generally it is the worst that can happen to a database. It is good to keep in mind ZODB issue290 when imagining corner cases that zconn_at has to deal with. Even though that issue is ZODB5 only, it shows what kind of bugs it can be in zconn_at implementation for ZODB4. Just for the reference: in Wendelin-based systems there is usually constant stream of database ingestions coming from many places simultaneously. Plus many activities crunching on the DB at the same time as well. And the more clients a system handles, the more there will be level-of-concurrency increase. This means that the problem of correctly handling concurrency issues in zconn_at is not purely theoretical, but has direct relation to our systems. -------- With this backport, zconn_at for ZODB4 becomes trivial and robust to implement: https://lab.nexedi.com/kirr/wendelin.core/blob/484071b3/lib/zodb.py#L183-195 I would like to thank Joshua Wölfel whose internship helped this topic to shape up: https://www.erp5.com/project_section/wendelin-ia-project/forum/Joshua-internship-D8b7NNhWfz /cc @Nexedi, @jwolf083 Signed-off-by: Kirill Smelkov <[email protected]>
The change has caused a problem with ZEO: "zopefoundation/ZEO#166". The reason is obvious: In "zopefoundation/ZEO#167" I tried to change this but failed: the (ZODB) @jmuchemb Do you have an idea how this problem might be fixed? |
|
Julien Muchembled wrote at 2021-4-8 07:00 -0700:
> In "[zopefoundation/ZEO#167](zopefoundation/ZEO#167)" I tried to change this but failed: the (ZODB) `check_tid_ordering_w_commit` tests are now failing. This test uses the `tpc_finish` callback used by `mvccadapter` for invalidation processing to ensure that `lastTransaction` is called at this time. As `lastTransaction` now changes after invalidation processing, the test sees a value which is deemed too old.
>
> @jmuchemb Do you have an idea how this problem might be fixed?
`check_tid_ordering_w_commit` was fixed in zopefoundation/ZODB#316 but no new release of ZODB since then.
The `check_tid_ordering_w_commit` tests fail (for ZEO#167) even
with the current ZODB master (which contains `ZODB#316`).
This does not surprise me: the test samples `lastTransaction`
at the precise place where during normal operations `MVCCAdapterInstance`
would process invalidations.
As (with ZODB>=5.6) `lastTransaction` must change after invalidation
processing, it has not yet changed at this time and the test must fail.
I believe that the original problem should be resolved in a different
way: Let `MVCCAdapterInstance` override `lastTransaction` to return
its `_ltid` (which is the tid of the last processed invalidation)
and call the base `lastTransaction` if this is still invalid.
There could still be a race condition - but only for newly created
`MVCCAdapterInstance`s or after reconnections. I hope that
this could be handle in `poll_invalidations`.
I will explore that approach.
|
You're not clear about where the bug would be. And if the code of ZODB is correct, I prefer not to complicate just to make the test pass. I don't understand what you suggest to do in Note that
In NEO, we can then do several operations without releasing the lock. And in particular, as you can see within tpc_finish, the callback call & the last_tid update are done with the same acquired lock. The test starts with a load and NEO also acquires the cache lock for that: it means that when the load is processed, the last_tid is guaranteed to be updated (NEO needed that with ZODB<5). For ZEO, I tried the following change (on top of zopefoundation/ZEO#167): --- a/src/ZEO/asyncio/client.py
+++ b/src/ZEO/asyncio/client.py
@@ -661,6 +661,7 @@ def tpc_finish_threadsafe(self, future, wait_ready, tid, updates, f):
self.protocol.close()
self.disconnected(self.protocol)
else:
+ with cache._lock:
# typically performs connection cache invalidations
f(tid)
# ZODB>=5.6.0 requires that ``LastTid`` and the test passes. But what I said about ZODB is also true for ZEO: if ZEO is already correct about how it locks the cache, then we should rather change the test. |
Julien Muchembled wrote at 2021-4-9 11:09 -0700:
You're not clear about where the bug would be. And if the code of ZODB is correct, I prefer not to complicate just to make the test pass. I don't understand what you suggest to do in `MVCCAdapterInstance` because `check_tid_ordering_w_commit` works directly with storages.
`MVCCAdapterInstance` puts the invalidations associated with
`tpc_finish` into the callback to the storage's `tpc_finish`.
The storage interface now (ZODB>=5.6) requests that `lastTransaction`
changes **after** invalidation processing. For `ClientStorage`,
this implies that `cache.setLastTID` (the source of `lastTransaction`)
must be called after the callback call.
In the `check_tid_ordering_w_commit`, this same callback is used
to coordinate `lastTransaction` sampling. Therefore, it necessarily
gets the wrong `lastTransaction`: the one before the `tpc_finish`
not the one set up by `tpc_finish`.
Do you now see the problem?
If `lastTransaction` must change after invalidation processing,
`check_tid_ordering_w_commit` cannot see the correct
`lastTransaction` value because it monitors this value at the
place where during normal operations the invalidation takes place.
Note that `check_tid_ordering_w_commit` passes with NEO. The difference between NEO & ZEO is about the cache lock:
- NEO: the lock is outside the cache instance and the caller must acquire it before accessing the cache
I perfectly understand why the test fails with `ZEO`.
I do not understand why it succeeds with any storage which
is used via `MVCCAdapter`. Or more precisely: if the test
succeeds, the storage would need to change `lastTransaction`
**before** invalidation processing -- contrary to the new requirement
of ZODB>=5.6.
...
For ZEO, I tried the following change (on top of zopefoundation/ZEO#167):
```diff
--- a/src/ZEO/asyncio/client.py
+++ b/src/ZEO/asyncio/client.py
@@ -661,6 +661,7 @@ def tpc_finish_threadsafe(self, future, wait_ready, tid, updates, f):
self.protocol.close()
self.disconnected(self.protocol)
else:
+ with cache._lock:
# typically performs connection cache invalidations
f(tid)
# ZODB>=5.6.0 requires that ``LastTid``
```
and the test passes.
Really surprising: how can the test pass when the test setup
tries hard to ensure that `lastTransaction` is sampled during the `f(tid)` call
and `setLastTID` (which provides the value for `lastTransaction`)
is called only thereafter?
But what I said about ZODB is also true for ZEO: if ZEO is already correct about how it locks the cache, then we should rather change the test.
As I read the test, it depends on the fact that
`lastTransaction` changes before invalidation processing -- at least
for all storages which are used with `MVCCAdapter` -- contrary to
the new (ZODB>=5.6) ordering requirement.
|
@d-maurer, I quickly checked, and while I might be missing something, to me it looks like there should not be the problem you describe with ZODB/src/ZODB/tests/BasicStorage.py Lines 321 to 379 in dad7780
Existing storages already confirm to the requiredment to first call the tpc_finish callback, and only after that to set the last tid. And they indeed do so from under holding the "storage lock". Here are some examples: BaseStorage: Lines 230 to 246 in dad7780
FileStorage: ZODB/src/ZODB/FileStorage/FileStorage.py Lines 810 to 821 in dad7780
MappingStorage ZODB/src/ZODB/MappingStorage.py Lines 286 to 321 in dad7780
NEO: https://lab.nexedi.com/nexedi/neoppod/blob/de0feb4e/neo/client/handlers/master.py#L69-86 Doing last tid update from under lock (and retrieving last tid with also taking the lock) is likely what @jimfulton was talking about "some locking strategy" in c3091e2. Does that help? Did I missed something? Thanks again for your work on to bring ZEO5 into a better shape, |
Kirill Smelkov wrote at 2021-4-13 10:48 -0700:
***@***.***, I quickly checked, and while I might be missing something, to me it looks like there should not be the problem you describe with `check_tid_ordering_w_commit`: there both `commit` (that calls storage.tpc_finish with callback to sample new tid) and `load` (that samples new serial and `lastTransaction`) are run in their own threads in parallel. The logic here is that while tpc_finish callback is run - it is holding a kind of "storage lock". This way thread that calls lastTransaction must be blocked to wait for the tpc_finish thread to first release that lock. And after the storage lock is released tpc_finish is complete and so the notion of last tid is already updated too.
I am convinced: I have misread the test that `finish` is set
(and therefore `tpc_finish` can complete) after
the sampling threads have run; but in fact, `finish` is set after
all sampling threads are started. The "storage" lock (not visible
by looking at the test only) apparently prevents the sampling
threads to complete before `tpc_finish` ends.
But for ZEO5, `ClientStorage` seems to lack such a lock.
...
Does that help? Did I missed something?
It does. We still need to find a way to implement the lock
for `ClientStorage` in ZEO5. There, the test fails when I move
the `setLastTID` call after the callback.
|
Thanks, @d-maurer. |
Since commit b5895a5, a ZEO test fails as follows:
Earlier in the test, the ZEO server is restarted and then another client commits. When disconnected, the first client does not receive invalidations anymore and the connection gets stuck in the past until there's a new commit after it reconnected. It was possible to make the test pass with the following patch:
Other implementations like NEO are probably affected the same way.
Rather than changing interfaces in a backward-incompatible way, this commit revert to the original behaviour, and all the changes that were done in existing tests are reverted.