Rework on serialization and deserialization in pagination #1498

Merged

Yury-Fridlyand
Collaborator

Supersedes Bit-Quill#252

Description

Rework PhysicalPlan tree serialization and deserialization to make it scalable and extensible.
When a new plan needs to be serialized (because pagination starts supporting a new feature), it must implement the SerializablePlan methods.
Deserialization restores the plan tree to the same shape it had before serialization.
The new mechanism can also exclude plans from serialization; for example, it skips ResourceMonitorPlan and PaginateOperator.
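
As a rough illustration of the skipping behaviour described above, here is a minimal sketch. All names in it (`Plan`, `ScanPlan`, `MonitorPlan`, `planForSerialization`) are hypothetical stand-ins and not the actual SerializablePlan API from this PR. A wrapper that should not be serialized hands over its child instead of itself, so only the stateful part of the tree reaches the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SkipPlanSketch {
  /** Hypothetical plan node; an assumption, not the interface from this PR. */
  interface Plan {
    /** Returns the node that should be written in place of this one. */
    Plan planForSerialization();
  }

  /** A leaf that carries state worth restoring on the next page. */
  static final class ScanPlan implements Plan, Serializable {
    private static final long serialVersionUID = 1L;
    final String index;

    ScanPlan(String index) {
      this.index = index;
    }

    @Override
    public Plan planForSerialization() {
      return this;
    }
  }

  /** A wrapper with no state of its own: it delegates to its child and is skipped. */
  static final class MonitorPlan implements Plan {
    final Plan child;

    MonitorPlan(Plan child) {
      this.child = child;
    }

    @Override
    public Plan planForSerialization() {
      return child.planForSerialization();
    }
  }

  static byte[] serialize(Plan root) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      // Only the node chosen by planForSerialization() is written.
      out.writeObject(root.planForSerialization());
    }
    return bytes.toByteArray();
  }

  static Plan deserialize(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (Plan) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Plan restored = deserialize(serialize(new MonitorPlan(new ScanPlan("accounts"))));
    // The wrapper is gone after the round trip; the scan and its state survive.
    System.out.println(restored.getClass().getSimpleName());
    System.out.println(((ScanPlan) restored).index);
  }
}
```

In this sketch the wrapper is simply dropped, so whoever rebuilds the tree after deserialization would have to re-add it if it is still needed.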

This change is required for Pagination Phase 2 features.

Code in this PR is based on #1497/#1483.

Issues Resolved

N/A

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Yury-Fridlyand Yury-Fridlyand force-pushed the feature/pagination/rework-serialization branch from c22d8c2 to 117e889 Compare April 5, 2023 20:34
codecov-commenter commented Apr 5, 2023

Codecov Report

Merging #1498 (9f04686) into feature/pagination/P1 (b9cb0d0) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@                     Coverage Diff                     @@
##             feature/pagination/P1    #1498      +/-   ##
===========================================================
- Coverage                    98.51%   98.51%   -0.01%     
+ Complexity                    4013     3995      -18     
===========================================================
  Files                          361      363       +2     
  Lines                         9905     9879      -26     
  Branches                       643      633      -10     
===========================================================
- Hits                          9758     9732      -26     
  Misses                         142      142              
  Partials                         5        5              
Flag Coverage Δ
sql-engine 98.51% <100.00%> (-0.01%) ⬇️

Impacted Files Coverage Δ
...java/org/opensearch/sql/storage/StorageEngine.java 100.00% <ø> (ø)
...nsearch/data/value/OpenSearchExprValueFactory.java 100.00% <ø> (ø)
...opensearch/request/ContinuePageRequestBuilder.java 100.00% <ø> (ø)
.../opensearch/request/InitialPageRequestBuilder.java 100.00% <ø> (ø)
...rch/sql/opensearch/setting/OpenSearchSettings.java 100.00% <ø> (ø)
...ql/opensearch/storage/OpenSearchStorageEngine.java 100.00% <ø> (ø)
...ge/scan/OpenSearchIndexScanAggregationBuilder.java 100.00% <ø> (ø)
.../storage/scan/OpenSearchIndexScanQueryBuilder.java 100.00% <ø> (ø)
...nsearch/storage/script/ExpressionScriptEngine.java 100.00% <ø> (ø)
...ge/script/aggregation/AggregationQueryBuilder.java 100.00% <ø> (ø)
... and 14 more

assertEquals(input, decompress(compressed));
@ValueSource(strings = {"pewpew", "asdkfhashdfjkgakgfwuigfaijkb", "ajdhfgajklghadfjkhgjkadhgad"
+ "kadfhgadhjgfjklahdgqheygvskjfbvgsdklgfuirehiluANUIfgauighbahfuasdlhfnhaughsdlfhaughaggf"
+ "and_some_other_funny_stuff_which_could_be_generated_while_sleeping_on_the_keyboard"})
Contributor

😆

Comment on lines 68 to 72
GZIPOutputStream gzip = new GZIPOutputStream(out) { {
this.def.setLevel(Deflater.BEST_COMPRESSION);
} };
gzip.write(output.toByteArray());
gzip.close();
Collaborator

Use try-with-resources (try (... gzip = new ...) { .. }) here so that gzip is closed even if an exception is thrown.

The explicit call to close then becomes unnecessary.

https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
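
A minimal sketch of the suggested pattern, assuming a hypothetical `TrackedResource` that only records whether it was closed; it is not code from this PR. It shows that a resource declared in the `try` header is closed before the exception reaches the `catch` block.

```java
class TryWithResourcesSketch {
  /** Hypothetical resource that remembers whether close() ran. */
  static final class TrackedResource implements AutoCloseable {
    boolean closed = false;

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Fails inside the try block and returns the resource so the caller can inspect it. */
  static TrackedResource failWhileOpen() {
    TrackedResource resource = new TrackedResource();
    try (TrackedResource r = resource) {
      throw new IllegalStateException("write failed");
    } catch (IllegalStateException e) {
      // By the time the catch block runs, close() has already been called.
      return resource;
    }
  }

  public static void main(String[] args) {
    System.out.println(failWhileOpen().closed); // prints "true"
  }
}
```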

Collaborator Author

Doing that causes java.io.EOFException: Unexpected end of ZLIB input stream:

  protected String serialize(Serializable object) throws NotSerializableException {
    try {
      try (ByteArrayOutputStream output = new ByteArrayOutputStream();
          ObjectOutputStream objectOutput = new ObjectOutputStream(output)) {
        objectOutput.writeObject(object);
        objectOutput.flush();

        try (ByteArrayOutputStream out = new ByteArrayOutputStream();
            // GZIP provides 35-45%, lzma from apache commons-compress has few % better compression
            GZIPOutputStream gzip = new GZIPOutputStream(out) {
              {
                this.def.setLevel(Deflater.BEST_COMPRESSION);
              }
            }) {
          gzip.write(output.toByteArray());
          return HashCode.fromBytes(out.toByteArray()).toString();
        }
      }
    } catch (NotSerializableException e) {
      throw e;
    } catch (IOException e) {
      throw new IllegalStateException("Failed to serialize: " + object, e);
    }
  }
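
One likely cause, offered as a sketch and not as the fix that was eventually merged: `GZIPOutputStream` writes its final deflate block and the GZIP trailer only in `finish()` or `close()`. In the snippet above, `out.toByteArray()` is evaluated inside the try block, before the stream is closed, so the returned bytes are truncated. Calling `finish()` before reading the buffer avoids that. The class and method names below are hypothetical, and the hex encoding step is left out to keep the example free of external dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipFinishSketch {
  static byte[] compress(byte[] input) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(out) {
      {
        this.def.setLevel(Deflater.BEST_COMPRESSION);
      }
    }) {
      gzip.write(input);
      // finish() flushes the remaining compressed data and writes the GZIP trailer,
      // so the buffer is complete before it is copied.
      gzip.finish();
      return out.toByteArray();
    }
  }

  static byte[] decompress(byte[] compressed) throws IOException {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int read;
      while ((read = gzip.read(buffer)) != -1) {
        result.write(buffer, 0, read);
      }
    }
    return result.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] input = "pewpew".getBytes(StandardCharsets.UTF_8);
    byte[] restored = decompress(compress(input));
    System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints "pewpew"
  }
}
```

An alternative with the same effect is to read `out.toByteArray()` only after the try-with-resources block that owns `gzip` has ended.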

Comment on lines 61 to 63
ByteArrayOutputStream output = new ByteArrayOutputStream();
ObjectOutputStream objectOutput = new ObjectOutputStream(output);
objectOutput.writeObject(object);
Collaborator

Both of these streams are Closeable. This is a good place to use try-with-resources.

@Yury-Fridlyand Yury-Fridlyand force-pushed the feature/pagination/rework-serialization branch from 117e889 to 9f04686 Compare April 14, 2023 22:07
@Yury-Fridlyand Yury-Fridlyand merged commit 9a1a17c into feature/pagination/P1 Apr 14, 2023
@Yury-Fridlyand Yury-Fridlyand deleted the feature/pagination/rework-serialization branch April 14, 2023 22:38
Yury-Fridlyand added a commit that referenced this pull request Apr 27, 2023
* Support pagination in V2 engine, phase 1 (#226)

* Fixing integration tests broken during POC

Signed-off-by: MaxKsyunz <[email protected]>

* Comment to clarify an exception.

Signed-off-by: MaxKsyunz <[email protected]>

* Add support for paginated scroll request, first page.

Implement PaginatedPlanCache.convertToPlan for second page to work.

Signed-off-by: MaxKsyunz <[email protected]>

* Progress on paginated scroll request, subsequent page.

Signed-off-by: MaxKsyunz <[email protected]>

* Move `ExpressionSerializer` from `opensearch` to `core`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rename `Cursor` `asString` to `toString`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Disable scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add full cursor serialization and deserialization.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Misc fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Further work on pagination.

* Added push down page size from `LogicalPaginate` to `LogicalRelation`.
* Improved cursor encoding and decoding.
* Added cursor compression.
* Fixed issuing `SearchScrollRequest`.
* Fixed returning last empty page.
* Minor code grooming/commenting.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Pagination fix for empty indices.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix error reporting on wrong cursor.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor comments and error reporting improvement.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add an end-to-end integration test.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add `explain` request handlers.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add IT for explain.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address issues flagged by checkstyle build step (#229)

Signed-off-by: MaxKsyunz <[email protected]>

* Pagination, phase 1: Add unit tests for `:core` module with coverage. (#230)

* Add unit tests for `:core` module with coverage. Uncovered: `toCursor`, because it will be changed soon.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Pagination, phase 1: Add unit tests for SQL module with coverage. (#239)

* Add unit tests for SQL module with coverage.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update sql/src/main/java/org/opensearch/sql/sql/domain/SQLQueryRequest.java

Signed-off-by: Yury-Fridlyand <[email protected]>

Co-authored-by: GabeFernandez310 <[email protected]>

---------

Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>

* Pagination, phase 1: Add unit tests for `:opensearch` module with coverage. (#233)

* Add UT for `:opensearch` module with full coverage, except `toCursor`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix checkstyle.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix the merges.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix explain.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Store `TotalHits` and use it to report `total` in response.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add missing UT for `:protocol` module.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix PPL UTs damaged in f4ea4ad.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor checkstyle fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fallback to v1 engine for pagination (#245)

* Pagination fallback integration tests.

Signed-off-by: MaxKsyunz <[email protected]>

* Add UT with coverage for `toCursor` serialization.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix broken tests in `legacy`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix getting `total` from non-paged requests and from queries without `FROM` clause.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix cursor request processing.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update ITs.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix (again) TotalHits feature.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix typo in prometheus config.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Recover commented logging.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Move `test_pagination_blackbox` to a separate class and add logging.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address some PR feedback: rename some classes and revert unnecessary whitespace changes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor commenting.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address PR comments.

* Add javadocs
* Renames
* Cleaning up some comments
* Remove unused code
* Speed up IT

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor missing changes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Integration tests for fetch_size, max_result_window, and query.size_limit (#248)

Signed-off-by: MaxKsyunz <[email protected]>

* Remove `PaginatedQueryService`, extend `QueryService` to hold two planners and use them.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Move push down functions from request builders to a new interface.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Some file moves.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor clean-up according to PR review.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Max Ksyunz <[email protected]>

* Make scroll timeout configurable.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix IT to set cursor keep alive parameter.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove `QueryId.None`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rename according to PR feedback.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove default implementations of `PushDownRequestBuilder`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Merge paginated plan optimizer into the regular optimizer. (#1516)

Merge paginated plan optimizer into the regular optimizer.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>

* Complete rework on serialization and deserialization. (#1498)

Signed-off-by: Yury-Fridlyand <[email protected]>

* Resolve merge conflicts and fix tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor cleanup.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor cleanup - missing changes for the previous commit.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove paginate operator (#1528)

* Remove PaginateOperator class since it is no longer used.


---------

Signed-off-by: MaxKsyunz <[email protected]>

* Remove `PaginatedPlan` - move logic to `QueryPlan`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove default implementations from `SerializablePlan`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add a doc.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update design graphs.

Signed-off-by: Yury-Fridlyand <[email protected]>

* More fixes for merge from upstream/main.

Signed-off-by: MaxKsyunz <[email protected]>

---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Max Ksyunz <[email protected]>
acarbonetto pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Apr 28, 2023
@Yury-Fridlyand Yury-Fridlyand added the pagination Pagination feature, ref #656 label May 1, 2023
4 participants