Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[plan] support "GROUP BY 1" #17167

Merged
merged 4 commits into from
Jun 27, 2024
Merged

[plan] support "GROUP BY 1" #17167

merged 4 commits into from
Jun 27, 2024

Conversation

aunjgr
Copy link
Contributor

@aunjgr aunjgr commented Jun 26, 2024

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #17150

What this PR does / why we need it:

Not supported:

  • GROUP BY '1'
  • GROUP BY 1 + 2
  • HAVING 1

PR Type

Enhancement, Tests


Description

  • Added support for GROUP BY with integer constants in SQL queries.
  • Modified NewGroupBinder to accept selectList and updated BindExpr to handle integer constants.
  • Updated buildSelect function to use the new NewGroupBinder signature.
  • Extended GroupBinder struct with a new selectList field.
  • Updated test cases to use GROUP BY 1 syntax to validate the new functionality.

Changes walkthrough 📝

Relevant files
Enhancement
group_binder.go
Add support for `GROUP BY` with integer constants               

pkg/sql/plan/group_binder.go

  • Added support for GROUP BY with integer constants.
  • Modified NewGroupBinder to accept selectList.
  • Enhanced BindExpr to handle integer constants in GROUP BY.
  • +21/-1   
    query_builder.go
    Update `buildSelect` to support new `GroupBinder` signature

    pkg/sql/plan/query_builder.go

    • Updated buildSelect to use the new NewGroupBinder signature.
    +2/-3     
    types.go
    Extend `GroupBinder` struct with `selectList` field           

    pkg/sql/plan/types.go

    • Added selectList field to GroupBinder struct.
    +1/-0     
    Tests
    func_aggr_count.result
    Update test cases to use `GROUP BY 1` syntax                         

    test/distributed/cases/function/func_aggr_count.result

    • Modified test cases to use GROUP BY 1 syntax.
    +3/-3     
    func_aggr_count.test
    Update test cases to use `GROUP BY 1` syntax                         

    test/distributed/cases/function/func_aggr_count.test

    • Modified test cases to use GROUP BY 1 syntax.
    +3/-5     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review [1-5] 3
    🧪 Relevant tests Yes
    🔒 Security concerns No
    ⚡ Key issues to review Possible Bug:
    The implementation assumes that the GROUP BY integer constant will always correctly map to a column in the select list. This might not always be the case, especially with more complex SQL queries involving functions or subqueries in the select list. It would be beneficial to add more robust error handling and validation to ensure the integer constant refers to a valid column position.
    Error Handling:
    The error message "non-integer constant in GROUP BY" could be more descriptive. It might help to specify what type of constants are allowed and why the current constant is invalid.

    @matrix-meow matrix-meow added the size/S Denotes a PR that changes [10,99] lines label Jun 26, 2024
    @mergify mergify bot added the kind/feature label Jun 26, 2024
    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Add bounds checking to prevent index out of range errors

    Add error handling for index out of range when accessing b.selectList[colPos-1]. This can
    occur if colPos is exactly equal to the length of b.selectList, as slice indices are
    0-based in Go.

    pkg/sql/plan/group_binder.go [47]

    -astExpr = b.selectList[colPos-1].Expr
    +if colPos-1 < len(b.selectList) {
    +    astExpr = b.selectList[colPos-1].Expr
    +} else {
    +    return nil, moerr.NewSyntaxError(b.GetContext(), "GROUP BY position %v is out of range", colPos)
    +}
     
    • Apply this suggestion
    Suggestion importance[1-10]: 10

    Why: This suggestion addresses a critical bug by adding bounds checking to prevent index out of range errors, which can lead to crashes or undefined behavior.

    10
    Ensure the integer conversion is checked for errors to prevent runtime panics or incorrect behavior

    Consider handling the case where constant.Int64Val(numVal.Value) fails to parse the
    integer value correctly. This can happen if the value is out of range for an int64. You
    should check the second return value from constant.Int64Val to ensure it parsed correctly.

    pkg/sql/plan/group_binder.go [42]

    -colPos, _ := constant.Int64Val(numVal.Value)
    +colPos, ok := constant.Int64Val(numVal.Value)
    +if !ok {
    +    return nil, moerr.NewSyntaxError(b.GetContext(), "invalid integer value in GROUP BY")
    +}
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: This suggestion addresses a potential bug where the integer conversion might fail, leading to incorrect behavior or runtime panics. Ensuring the conversion is checked for errors is crucial for robustness.

    9
    Maintainability
    Refactor nested conditionals for clarity and maintainability

    Consider refactoring the nested conditionals for better readability and maintainability.
    Using early returns can make the logic clearer and reduce nesting.

    pkg/sql/plan/group_binder.go [38-53]

    -if isRoot {
    -    if numVal, ok := astExpr.(*tree.NumVal); ok {
    -        switch numVal.Value.Kind() {
    -        case constant.Int:
    -            colPos, _ := constant.Int64Val(numVal.Value)
    -            if colPos < 1 || int(colPos) > len(b.selectList) {
    -                return nil, moerr.NewSyntaxError(b.GetContext(), "GROUP BY position %v is not in select list", colPos)
    -            }
    -            astExpr = b.selectList[colPos-1].Expr
    -        default:
    -            return nil, moerr.NewSyntaxError(b.GetContext(), "non-integer constant in GROUP BY")
    -        }
    +if !isRoot {
    +    return b.baseBindExpr(astExpr, depth, isRoot)
    +}
    +numVal, ok := astExpr.(*tree.NumVal)
    +if !ok {
    +    return b.baseBindExpr(astExpr, depth, isRoot)
    +}
    +switch numVal.Value.Kind() {
    +case constant.Int:
    +    colPos, ok := constant.Int64Val(numVal.Value)
    +    if !ok || colPos < 1 || int(colPos) > len(b.selectList) {
    +        return nil, moerr.NewSyntaxError(b.GetContext(), "invalid GROUP BY position %v", colPos)
         }
    +    astExpr = b.selectList[colPos-1].Expr
    +default:
    +    return nil, moerr.NewSyntaxError(b.GetContext(), "non-integer constant in GROUP BY: %v", numVal.Value.String())
     }
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Refactoring nested conditionals for clarity and maintainability improves code readability and reduces complexity, making it easier to understand and maintain.

    8
    Enhancement
    Improve the error message by including the problematic value

    The error message "non-integer constant in GROUP BY" could be enhanced by including the
    actual value that caused the error, providing more context to the user.

    pkg/sql/plan/group_binder.go [50]

    -return nil, moerr.NewSyntaxError(b.GetContext(), "non-integer constant in GROUP BY")
    +return nil, moerr.NewSyntaxError(b.GetContext(), "non-integer constant in GROUP BY: %v", numVal.Value.String())
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Improving error messages by including the problematic value provides better context for debugging and user feedback, enhancing the overall usability of the error handling.

    7

    Copy link
    Contributor

    @fengttt fengttt left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    test.
    group by 1, 2
    group by 1, 1
    group by 0
    group by -1
    group by 1+1
    group by 1.1
    Some of them are invalid, some i dont know ...

    also can 1, 2, 3 appear in having and order by?

    @aunjgr
    Copy link
    Contributor Author

    aunjgr commented Jun 27, 2024

    @fengttt MySQL only support standalone literal positive integers.

    These are supported:
    group by 1
    group by 1,2
    group by 1,1

    These are not supported:
    group by 1.1
    group by '1'
    group by 0
    group by -1
    group by 1+1

    I've verified it on MySQL both 5.7 and 8.0.

    ORDER BY is the same, and we've already supported it. HAVING doesn't support this feature, because HAVING is almost always followed by an expression (not a standalone literal).

    @aunjgr aunjgr force-pushed the groupby1 branch 2 times, most recently from b6151af to 3ab0520 Compare June 27, 2024 04:43
    @mergify mergify bot merged commit fd4574c into matrixorigin:main Jun 27, 2024
    17 of 18 checks passed
    @aunjgr aunjgr deleted the groupby1 branch June 27, 2024 11:58
    aunjgr added a commit to aunjgr/matrixone that referenced this pull request Jul 9, 2024
    Not supported:
    * GROUP BY '1'
    * GROUP BY 1 + 2
    * HAVING 1
    
    Approved by: @ouyuanning, @heni02
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    5 participants