fix: Minor update to DictionaryVector toString #12025
base: main
This pull request was exported from Phabricator. Differential Revision: D67870420
✅ Deploy Preview for meta-velox canceled.
velox/expression/tests/ExprTest.cpp
Outdated
// log vector
for (const auto& child : rowVector->children()) {
  LOG(INFO) << child->toString(/*recursive=*/true) << std::endl;
}
for (vector_size_t i = 0; i < rowVector->size(); ++i) {
  LOG(INFO) << rowVector->toString(i) << std::endl;
}
I copied this code from an existing utility method, so it calls toString of both child and rowVector. In your unit test, you probably only need one of these loops to trigger the error.
192916c to 5cc20dd
Summary: Fixes the error raised in GitHub issue 10594 (facebookincubator#10594). The issue describes a blind spot in the toString function for lazy vectors: they can be logged, but they still show as uninitialized because setInternalState was never executed. I propose calling loadedVector at the start of the method: it returns early for initialized vectors, adding virtually no overhead, while unloaded vectors get properly initialized. Differential Revision: D67870420
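To make the proposal concrete, here is a minimal, self-contained sketch of a "load on first toString" pattern. It is not Velox code: `LazyColumn`, `loaded()`, `loadCount()`, and the loader callback are hypothetical names invented for illustration, and the real `loadedVector()` may behave differently.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily loaded column; NOT a Velox class.
class LazyColumn {
 public:
  explicit LazyColumn(std::function<std::vector<int>()> loader)
      : loader_(std::move(loader)) {}

  bool isLoaded() const { return loaded_; }
  int loadCount() const { return loadCount_; }

  // Loads on first use and returns early on every later call.
  const std::vector<int>& loaded() const {
    if (!loaded_) {
      values_ = loader_();
      loaded_ = true;
      ++loadCount_;
    }
    return values_;
  }

  // Variant sketched from the summary: force the load before formatting.
  // at() throws std::out_of_range for an invalid index.
  std::string toString(std::size_t index) const {
    return std::to_string(loaded().at(index));
  }

 private:
  std::function<std::vector<int>()> loader_;
  mutable std::vector<int> values_;
  mutable bool loaded_ = false;
  mutable int loadCount_ = 0;
};
```

In this sketch, the first `toString` call on an unloaded column runs the loader and flips `isLoaded()` to true, and later calls skip the load. That makes the state change visible: a call that looks read-only mutates the object.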
velox/expression/tests/ExprTest.cpp
Outdated
  rowVector->toString(i);
}

// verify outer dictionary layers are properly initialized
nit: verify -> Verify. Also, please add a period at the end of the comment.
velox/vector/DictionaryVector.h
Outdated
@@ -186,6 +186,7 @@ class DictionaryVector : public SimpleVector<T> {
  }

  std::string toString(vector_size_t index) const override {
    loadedVector(); // load lazy vector if needed
nit: Please move the comment to a separate line above the code, capitalize the first character, and add a period at the end (because comments are supposed to be sentences).
@bikramSingh91, could you take a look too? Thanks!
@peterenescu while this would work, it has the side effect of loading the vector, which can be costly in production code and can change the conditions of a unit test, rendering it ineffective. We would then have to make sure that any test relying on those conditions retains them if it also uses toString() in any capacity (like in ExpressionVerifier).
I discussed this offline with Wei, and we agreed that it would be better to throw an exception if toString() is called on an unloaded vector. Since toString() is primarily used for debugging, this makes it obvious to the user what is happening instead of silently loading the vector.
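The alternative agreed on above can be sketched as follows. This is a standalone illustration, not the Velox implementation: `StrictLazyColumn`, `load()`, and the use of `std::logic_error` are assumptions made for the example, and the actual change may use Velox's own check macros and error types.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily loaded column; NOT a Velox class.
class StrictLazyColumn {
 public:
  explicit StrictLazyColumn(std::function<std::vector<int>()> loader)
      : loader_(std::move(loader)) {}

  bool isLoaded() const { return loaded_; }

  // Loading is an explicit, non-const operation the caller must request.
  void load() {
    if (!loaded_) {
      values_ = loader_();
      loaded_ = true;
    }
  }

  // Fails loudly instead of loading as a hidden side effect.
  std::string toString(std::size_t index) const {
    if (!loaded_) {
      throw std::logic_error(
          "toString() called on an unloaded lazy vector; load it first");
    }
    return std::to_string(values_.at(index));
  }

 private:
  std::function<std::vector<int>()> loader_;
  std::vector<int> values_;
  bool loaded_ = false;
};
```

With this shape, calling `toString` before `load()` throws and leaves `isLoaded()` false, so logging cannot change the conditions a test depends on. The trade-off is that debugging code has to load the column explicitly before printing it.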
velox/expression/tests/ExprTest.cpp
Outdated
@@ -4942,6 +4942,30 @@ TEST_F(ExprTest, disableMemoization) {
  ASSERT_EQ(stats["plus"].numProcessedRows, 3 * flatSize);
}

TEST_F(
    ExprTest,
Since the change is limited to DictionaryVector and its effects can easily be tested via its API, testing it via expression eval would be overkill. I would recommend writing a test in VectorTest instead.
5cc20dd to 70a2569
…ng uninitialized lazy loaded vectors (facebookincubator#12025) Summary: Problem: There is a blind spot in DictionaryVector's toString function: lazy-loaded or otherwise uninitialized vectors are logged without error when toString is called, but may fail elsewhere downstream, such as during expression evaluation, because although the logging suggests the vector was loaded, it was never properly initialized. Solution: After discussing with Wei and Bikram, we add logic to confirm the vector is loaded and fail with a detailed error message before any string is created. This tells the user that the vector is not loaded and must be loaded before any logging is done. Added a test to the VectorTest class to cover this case. (fixes facebookincubator#10594) Differential Revision: D67870420
70a2569 to bb5952e
bb5952e to 4c4d106
Summary:
Fixes the error raised in GitHub issue 10594 (#10594).
The issue describes a blind spot in the toString function for lazy vectors: they can be logged, but they still show as uninitialized because setInternalState was never executed.
I propose calling loadedVector at the start of the method: it returns early for initialized vectors, adding virtually no overhead, while unloaded vectors get properly initialized.
Differential Revision: D67870420