Skip to content

Commit

Permalink
[#21229] YSQL: Validate pggate capability before sending scanned rows.
Browse files Browse the repository at this point in the history
Summary:
It has been reported that upgrades from 2.14/2.16 to 2.20 (and beyond) fail due to pggate going into a crash loop
upon unpacking `PgsqlResponsePB`. This is caused due to the introduction of the 'Scanned Rows' field in 2.20+ (D31111)
which is sent in its own RPC metrics sidecar.
Versions of pggate lower than 2.17.1 are not capable of unpacking response protos that contain RPC
sidecars holding data other than the rows returned by DocDB. During an upgrade, while an un-upgraded pggate
may send a request only to its local un-upgraded tserver, it may have responses proxied back from upgraded
tservers on other nodes. Thus, the RPC infrastructure needs to be forward compatible in order to ensure that
pggate is not broken during upgrades.

This revision introduces a guardrail to check that the receiving pggate is capable of unpacking the RPC metrics sidecar
before sending the 'Scanned Rows' count.

Required backports: 2.20 (original diff + this fix), 2024.1 (only this fix), 2.21 (if needed, depending on branching)
Jira: DB-10156

Test Plan: Run rolling upgrade itest from 2.14/2.16 to master, 2.20.x, 2.21.x, 2024.1

Reviewers: hsunder, esheng, sergei

Reviewed By: hsunder

Subscribers: ybase, yql, mihnea, smishra

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33503
  • Loading branch information
karthik-ramanathan-3006 committed Mar 26, 2024
1 parent bbfd0af commit 80dc997
Showing 1 changed file with 12 additions and 3 deletions.
15 changes: 12 additions & 3 deletions src/yb/docdb/pgsql_operation.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1639,10 +1639,19 @@ Result<size_t> PgsqlReadOperation::Execute(
restart_read_ht->MakeAtLeast(VERIFY_RESULT(index_iter_->RestartReadHt()));
}

response_.mutable_metrics()->set_scanned_table_rows(scanned_table_rows_);
if (scanned_index_rows_ > 0) {
response_.mutable_metrics()->set_scanned_index_rows(scanned_index_rows_);
// Versions of pggate < 2.17.1 are not capable of processing response protos that contain RPC
// sidecars holding data other than the rows returned by DocDB. During an upgrade, it is possible
// that a pggate of version < 2.17.1 may receive a response from a tserver containing the metrics
// sidecar, causing a crash. To prevent this, send the scanned row count only if the incoming
// request object contains the metrics capture field. This serves to validate that the requesting
// pggate has indeed upgraded to a version that is capable of unpacking the metric sidecar.
if (request_.has_metrics_capture()) {
response_.mutable_metrics()->set_scanned_table_rows(scanned_table_rows_);
if (scanned_index_rows_ > 0) {
response_.mutable_metrics()->set_scanned_index_rows(scanned_index_rows_);
}
}

return fetched_rows;
}

Expand Down

0 comments on commit 80dc997

Please sign in to comment.