Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CXXCBC-606: Fix detection of disfunctional node #673

Conversation

avsej
Copy link
Member

@avsej avsej commented Oct 9, 2024

  • remove reopen() API of the stream, the close is asynchronous anyway, so it is not correct to assume that the socket will be ready immediately.
  • remove open_ flag for stream, and instead proxy it directly to the socket to avoid "operation is already in progress" type of errors when trying to open the same socket twice.
  • re-arm connection_deadline_ when the socket is ready for data, use KV timeout for whole handshake with the single node. Re-initiate bootstrap using next bootstrap address in case of single node timeout.
  • reset reading_ flag in case of error. Ensure that the flag is reset if the session is not reading.
  • handle socket_closed_while_in_flight error, and if it happens during bootstrap, try next node in the list, otherwise preserve old behavior (close the connection, and let bucket to reopen it).

@avsej avsej requested review from thejcfactor and a team October 9, 2024 05:36
@avsej avsej force-pushed the CXXCBC-606-fix-bootstrap-with-broken-but-responsive-node branch 4 times, most recently from f794095 to a5cae8c Compare October 10, 2024 16:12
* remove "reopen()" API of the stream, the close is asynchronous
  anyway, so it is not correct to assume that the socket will be ready
  immediately.

* remove open_ flag for stream, and instead proxy it directly to the
  socket to avoid "operation is already in progress" type of errors when
  trying to open the same socket twice.

* re-arm connection_deadline_ when the socket is ready for data, use KV
  timeout for whole handshake with the single node. Re-initiate
  bootstrap using next bootstrap address in case of single node timeout.

* reset reading_ flag in case of error. Ensure that the flag is reset if
  the session is not reading.

* handle socket_closed_while_in_flight error, and if it happens during
  bootstrap, try next node in the list, otherwise preserve old behavior
  (close the connection, and let bucket to reopen it).
@avsej avsej force-pushed the CXXCBC-606-fix-bootstrap-with-broken-but-responsive-node branch from a5cae8c to 9cac3be Compare October 11, 2024 18:11
Copy link
Contributor

@thejcfactor thejcfactor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Tested changes in Node.js client as well.

@avsej avsej merged commit 7af2c94 into couchbase:main Oct 11, 2024
17 of 27 checks passed
@avsej avsej deleted the CXXCBC-606-fix-bootstrap-with-broken-but-responsive-node branch October 11, 2024 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants