reduce len() calls #412

bvandercar-vt · 2024-10-10T17:05:35Z

Reduce duplicate len() calls by assigning the length to a variable.

Probably only has a microscopic effect on performance, but it is something.

maxbachmann

I am not really convinced this will make a measurable difference in performance. However, I would be fine with the changes in the core library as long as the bug in the Cython implementation is fixed.
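As a rough illustration of the pattern discussed here, the sketch below stores a length in a local variable instead of calling len() on every iteration, and times both variants with timeit. The function names are hypothetical and are not taken from rapidfuzz; the timings depend on the machine and are only meant to show how the difference could be measured.

```python
import timeit


def count_pairs_repeated(items):
    """Calls len() again on every loop iteration."""
    total = 0
    for i in range(len(items)):
        total += len(items) - i - 1
    return total


def count_pairs_cached(items):
    """Calls len() once and reuses the stored value."""
    n = len(items)
    total = 0
    for i in range(n):
        total += n - i - 1
    return total


if __name__ == "__main__":
    data = list(range(1000))
    # Both variants compute the same value; only the number of len() calls differs.
    for fn in (count_pairs_repeated, count_pairs_cached):
        seconds = timeit.timeit(lambda: fn(data), number=1000)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

Since len() on a list is a constant-time lookup, any gain from caching it is expected to be small, which matches the doubts raised in this thread.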

maxbachmann · 2024-12-10T16:47:07Z

bench/benchmark_scorer.py

-    print("Words :", len(words))
-    print("Sample:", len(sample))
+    print("Words :", len_words)
+    print("Sample:", len_sample)


The changes in the bench directory aren't relevant, since they are outside the code being benchmarked. So readability is more important here. This part is already very slow anyway because of:

    words = ["".join(random.choice(string.ascii_letters + string.digits) for _ in range(10)) for _ in range(10000)]

> So readability is more important here.

That makes sense. Reverted changes to the bench folder.
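To make the point about setup cost concrete, here is a minimal sketch that times the word generation and a single len() call separately. The helpers `make_words` and `time_call` are hypothetical names introduced for this illustration; they are not part of the bench scripts.

```python
import random
import string
import time


def make_words(count, length=10, seed=0):
    """Build `count` random alphanumeric words, similar to the benchmark setup."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choice(alphabet) for _ in range(length)) for _ in range(count)]


def time_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    words, setup_seconds = time_call(make_words, 10000)
    _, len_seconds = time_call(len, words)
    print(f"setup: {setup_seconds:.6f}s, len(words): {len_seconds:.6f}s")
```

On a typical run the setup is expected to dominate, so an extra len() call in the printing code should not matter for the reported benchmark numbers.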

src/rapidfuzz/fuzz_py.py

maxbachmann · 2024-12-10T16:50:16Z

src/rapidfuzz/process_cpp_impl.pyx

@@ -1208,6 +1208,7 @@ def extract(query, choices, *, scorer=WRatio, processor=None, limit=5, score_cut
    cdef RF_Scorer* scorer_context = NULL
    cdef RF_ScorerFlags scorer_flags
    cdef int64_t c_limit
+    cdef int64_t choices_len = <int64_t>len(choices)


This will crash when choices is a generator, since generators don't support len(). So this would need to be handled inside the try/except.

That makes sense, nice catch, changed: 74b248d
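The failure mode can be reproduced in plain Python: calling len() on a generator raises TypeError. The sketch below uses a hypothetical helper, `safe_len`, to show the idea of guarding the call with try/except. It is an illustration only and is not the code from commit 74b248d.

```python
def safe_len(choices):
    """Return len(choices), or None when the object has no length.

    Hypothetical helper for illustration; not part of rapidfuzz.
    """
    try:
        return len(choices)
    except TypeError:
        # Generators and other plain iterators do not implement __len__.
        return None


if __name__ == "__main__":
    print(safe_len(["a", "b", "c"]))    # list: length is known
    print(safe_len(c for c in "abc"))   # generator: no length available
```

A caller that receives None would then need a path that does not rely on knowing the number of choices up front.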

src/rapidfuzz/process_py.py

maxbachmann · 2025-01-08T11:06:37Z

src/rapidfuzz/process_cpp_impl.pyx

+    if limit is None or limit > choices_len:
+        limit = choices_len

    c_limit = limit


It's unnecessary to convert choices_len back to a Python Object here. This could probably be something like:

Suggested change

-    if limit is None or limit > choices_len:
-        limit = choices_len
-    c_limit = limit
+    c_limit = choices_len
+    if limit is not None:
+        c_limit = min(c_limit, <int64_t>limit)

To be honest I'm not really sure what's happening here. Sounds good, made that change.
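For readers unsure what the suggestion does, a pure-Python sketch of the same clamping logic follows. The function `clamp_limit` is a hypothetical name, and `int(limit)` stands in for the Cython cast `<int64_t>limit`, which is an assumption made for illustration.

```python
def clamp_limit(limit, choices_len):
    """Return the effective limit: never larger than the number of choices.

    Hypothetical pure-Python stand-in for the suggested Cython code.
    """
    # Start from the number of choices, which is already an integer.
    c_limit = choices_len
    if limit is not None:
        # Convert the user-supplied limit once, then compare integers.
        c_limit = min(c_limit, int(limit))
    return c_limit


if __name__ == "__main__":
    print(clamp_limit(None, 10))  # no limit given: use all choices
    print(clamp_limit(5, 10))     # limit below the number of choices
    print(clamp_limit(50, 10))    # limit above the number of choices
```

In the Cython version the idea is that `choices_len` stays a C integer throughout, so it never has to be converted back into a Python object just to be assigned to `limit`.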

bvandercar-vt and others added 2 commits October 10, 2024 11:04

reduce len() calls

ba243a8

Merge branch 'rapidfuzz:main' into bvandercar/reduce-len-calls

0323bc8

maxbachmann requested changes Dec 10, 2024

bvandercar-vt added 4 commits January 6, 2025 11:32

Merge branch 'main' into bvandercar/reduce-len-calls

04edb16

style: revert bench folder changes

a011807

move len to try block

74b248d

refactor: move common code

c8cfbca

maxbachmann reviewed Jan 8, 2025

implement change

4f41086

bvandercar-vt commented Oct 10, 2024

maxbachmann left a comment

maxbachmann Dec 10, 2024

bvandercar-vt Jan 6, 2025

maxbachmann Dec 10, 2024

bvandercar-vt Jan 6, 2025 • edited

maxbachmann Jan 8, 2025

bvandercar-vt Jan 8, 2025 • edited
