Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid leading and trailing zeros in test_timestamp_seconds_rounding_necessary #9970

Merged
merged 3 commits into from
Dec 7, 2023

Conversation

thirtiseven
Copy link
Collaborator

@thirtiseven thirtiseven commented Dec 6, 2023

Fixes #9923

timestamp_seconds will check for rounding necessary before overflow, the Decimal(20,7) case satisfies both rounding necessary and overflow, so it should complain about rounding necessary. The first element in the df with the failed seed is 1793879511158.1649100, so it bypasses the rounding necessary check and passes the overflow check next.

This PR adds a new parameter full_precision in integration test DecimalGen to avoid it generating data with leading and tailing zeros, and sets it to True in test_timestamp_seconds_rounding_necessary.

@res-life
Copy link
Collaborator

res-life commented Dec 6, 2023

LGTM

res-life
res-life previously approved these changes Dec 6, 2023
@thirtiseven
Copy link
Collaborator Author

build

@@ -244,7 +244,7 @@ def start(self, rand):

class DecimalGen(DataGen):
"""Generate Decimals, with some built in corner cases."""
def __init__(self, precision=None, scale=None, nullable=True, special_cases=None, avoid_positive_values=False):
def __init__(self, precision=None, scale=None, nullable=True, special_cases=None, avoid_positive_values=False, full_precision=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment/doc string about what full_precision means? It is not clear from just the name of it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Collaborator

@winningsix winningsix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first element in the df with the failed seed is 1793879511158.1649100, so it bypasses the rounding necessary

Can you help elaborate this more and put this test comments?

To my understanding, this is a problem when we generating data with tailing/heading zeros, it mislead the overflow exceptions.

Is this an issue of test (data) design or our code?


@pytest.mark.parametrize('data_gen', [DecimalGen(7, 7), DecimalGen(20, 7)], ids=idfn)

# Make sure every decimal value in test data is rounding necessary by set full_precision=True to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: necessarily

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to keep it as is because it's the error message, so added a quote around it.

@pytest.mark.parametrize('data_gen', [DecimalGen(7, 7), DecimalGen(20, 7)], ids=idfn)

# Make sure every decimal value in test data is rounding necessary by set full_precision=True to
# avoid leading and trailing zeros
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: tailing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems trailing zeros is correct, I made a typo in PR title.

@thirtiseven thirtiseven changed the title Avoid leading and tailing zeros in test_timestamp_seconds_rounding_necessary Avoid leading and trailing zeros in test_timestamp_seconds_rounding_necessary Dec 7, 2023
@thirtiseven
Copy link
Collaborator Author

build

Signed-off-by: Haoyang Li <[email protected]>
@thirtiseven
Copy link
Collaborator Author

Can you help elaborate this more and put this test comments?

To my understanding, this is a problem when we generating data with tailing/heading zeros, it mislead the overflow exceptions.

Is this an issue of test (data) design or our code?

I added a test comments. It's an issue of test data.

@thirtiseven
Copy link
Collaborator Author

build

@@ -244,7 +244,8 @@ def start(self, rand):

class DecimalGen(DataGen):
"""Generate Decimals, with some built in corner cases."""
def __init__(self, precision=None, scale=None, nullable=True, special_cases=None, avoid_positive_values=False):
def __init__(self, precision=None, scale=None, nullable=True, special_cases=None, avoid_positive_values=False, full_precision=False):
"""full_precision: If True, generate decimals with full precision without leading and trailing zeros."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, will trim_zeros be a better name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m ok with this name, but I personally think trim_zeros means removing leading and trailing zeros from a number, instead of not generating it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Merged

@winningsix winningsix merged commit 3950fa0 into NVIDIA:branch-23.12 Dec 7, 2023
36 checks passed
@thirtiseven thirtiseven deleted the roundingNecessary branch December 19, 2023 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Only impacts tests
Projects
None yet
4 participants