Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: metric selection for faster calculation #1

Merged
merged 2 commits into from
Nov 15, 2024
Merged

Conversation

tomsail
Copy link
Contributor

@tomsail tomsail commented Nov 8, 2024

Added new features:

  • keyword metrics which can be str or list. Default None selects all metrics.
  • added more thorough checks in tests
  • added duration print in pytest which gives the time spent for each function:

=============================================================================================== test session starts ================================================================================================
platform linux -- Python 3.12.4, pytest-8.2.2, pluggy-1.5.0 -- /home/tomsail/work/python/seareport_org/seastats/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/tomsail/work/python/seareport_org/seastats
configfile: pyproject.toml
plugins: anyio-4.4.0
collected 5 items                                                                                                                                                                                                  

tests/compute_stats_test.py::test_get_all_stats_and_extremes PASSED                                                                                                                                          [ 20%]
tests/compute_stats_test.py::test_get_all_stats PASSED                                                                                                                                                       [ 40%]
tests/compute_stats_test.py::test_get_rmse PASSED                                                                                                                                                            [ 60%]
tests/compute_stats_test.py::test_get_slope_pp PASSED                                                                                                                                                        [ 80%]
tests/compute_stats_test.py::test_get_storm_metrics PASSED                                                                                                                                                   [100%]

================================================================================================ slowest durations =================================================================================================
0.26s call     tests/compute_stats_test.py::test_get_all_stats_and_extremes
0.24s call     tests/compute_stats_test.py::test_get_all_stats
0.07s call     tests/compute_stats_test.py::test_get_slope_pp
0.01s call     tests/compute_stats_test.py::test_get_storm_metrics
0.01s call     tests/compute_stats_test.py::test_get_rmse
0.00s setup    tests/compute_stats_test.py::test_get_all_stats_and_extremes
0.00s teardown tests/compute_stats_test.py::test_get_all_stats_and_extremes
0.00s teardown tests/compute_stats_test.py::test_get_all_stats
0.00s teardown tests/compute_stats_test.py::test_get_storm_metrics
0.00s teardown tests/compute_stats_test.py::test_get_rmse
0.00s setup    tests/compute_stats_test.py::test_get_all_stats
0.00s teardown tests/compute_stats_test.py::test_get_slope_pp
0.00s setup    tests/compute_stats_test.py::test_get_storm_metrics
0.00s setup    tests/compute_stats_test.py::test_get_rmse
0.00s setup    tests/compute_stats_test.py::test_get_slope_pp

NB: Calculating percentiles is the most intensive calculation, even more than POT selection.

@tomsail
Copy link
Contributor Author

tomsail commented Nov 8, 2024

in get_stats(), I'm not sure that the implementation of:

    def should_calculate(metric_name):
        return metrics is None or metric_name in metrics_list

and then its repetition is the most elegant way of doing things, but it works.

@tomsail tomsail requested a review from pmav99 November 8, 2024 12:27
seastats/stats.py Outdated Show resolved Hide resolved
seastats/stats.py Outdated Show resolved Hide resolved
tests/compute_stats_test.py Outdated Show resolved Hide resolved
tests/compute_stats_test.py Outdated Show resolved Hide resolved
seastats/stats.py Outdated Show resolved Hide resolved
seastats/stats.py Outdated Show resolved Hide resolved
seastats/stats.py Outdated Show resolved Hide resolved
@@ -28,7 +28,7 @@ jobs:
fail-fast: false
matrix:
os: ["ubuntu-latest"]
python: ["3.9", "3.10", "3.11", "3.12"]
python: ["3.10", "3.11", "3.12"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to run 3.13, too. 3.13.0 has already been released. Some of our dependencies might be broken, but let's give it a try.

seastats/stats.py Outdated Show resolved Hide resolved
@tomsail
Copy link
Contributor Author

tomsail commented Nov 9, 2024

I have addressed the various things we've discussed yesterday.

  • made visible the quantile and round parameters. Both have default values (0 and -1) and deactivated.
  • cleaned the tests and used pytest.approx
  • using now by default GENERAL_METRICS and STORM_METRIC parameters lists for get_stats() and storm_metrics().
  • vectorised the percentile calculation so it's now faster and made the slope/intercept API simpler

seastats/__init__.py Outdated Show resolved Hide resolved
seastats/__init__.py Outdated Show resolved Hide resolved
seastats/__init__.py Outdated Show resolved Hide resolved
seastats/__init__.py Show resolved Hide resolved
@tomsail tomsail merged commit bda9c00 into main Nov 15, 2024
5 checks passed
@tomsail tomsail deleted the select_metric branch November 15, 2024 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants