Increase expected size of output sample buffer, to correctly handle input termination. #22

Open

arlofaria-cartesia wants to merge 2 commits into master

Conversation

arlofaria-cartesia

Problem: when using the full API (the `.process()` method in this wrapper), the number of output samples is incorrect for a chunked input at the end of a stream. The calculation of `new_size` assumes that the number of output samples generated will be at most the number of input samples times the resampling ratio. That holds for chunks processed mid-stream (and at the start of a stream), but when the input is terminated the output can exceed this estimate, because the resampler's internal buffer is flushed of any remaining data.
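
For illustration, the size estimate at issue behaves like this (a simplified sketch of the idea, not the wrapper's actual source; the function name is hypothetical):

```python
def estimated_output_frames(num_input_frames: int, ratio: float) -> int:
    # Upper bound on output frames that holds for mid-stream chunks,
    # but NOT for the final call: end_of_input=True also flushes the
    # converter's internal buffer, which can produce extra frames.
    return int(num_input_frames * ratio)
```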

Solution: an ideal solution would be to determine the actual number of samples that will be flushed on an input-terminating call. That is complicated to do, for reasons related to issues such as libsndfile/libsamplerate#6.

As a hacky workaround, we add a "fudge factor" of 10,000 samples, which ought to be sufficient in practice. In case that's not enough, a RuntimeError is raised.
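
In sketch form, the workaround amounts to something like this (names here are illustrative, not the actual diff):

```python
FUDGE_FRAMES = 10_000  # headroom for frames flushed at end_of_input=True

def padded_output_frames(num_input_frames: int, ratio: float) -> int:
    # Mid-stream upper bound plus headroom for the end-of-input flush.
    return int(num_input_frames * ratio) + FUDGE_FRAMES

def check_output_fits(frames_generated: int, buffer_frames: int) -> None:
    # If even the padded buffer turns out to be too small, fail loudly
    # rather than silently truncating the flushed output.
    if frames_generated > buffer_frames:
        raise RuntimeError("output buffer too small; increase the fudge factor")
```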

Testing: here's a modified version of the example code from the README. Perhaps this chunked usage should be included in the documentation?

```python
#!/usr/bin/env python3

import numpy as np
import samplerate

# Synthesize data
fs = 1000.
t = np.arange(fs * 2) / fs
input_data = np.sin(2 * np.pi * 5 * t)

# Simple API
ratio = 1.5
converter = 'sinc_best'  # or 'sinc_fastest', ...
output_data_simple = samplerate.resample(input_data, ratio, converter)

# Full API
resampler = samplerate.Resampler(converter, channels=1)
output_data_full = resampler.process(input_data, ratio, end_of_input=True)

# The result is the same for both APIs.
assert np.allclose(output_data_simple, output_data_full)

# Full API, chunked
resampler.reset()
output_data_chunked = np.array([], dtype=np.float32)
chunk_size = 100
for i in range(0, len(input_data), chunk_size):
    chunk = input_data[i : i+chunk_size]
    resampled = resampler.process(chunk, ratio, end_of_input=False)
    print(f"{len(chunk)} input samples --> {len(resampled)} output samples")
    output_data_chunked = np.concatenate((output_data_chunked, resampled))
print(f"{len(output_data_chunked)} output samples before input is terminated")

# Terminate with an empty final chunk.
resampled = resampler.process([], ratio, end_of_input=True)
print(f"input terminated --> {len(resampled)} output samples")
output_data_chunked = np.concatenate((output_data_chunked, resampled))
print(f"{len(output_data_chunked)} output samples after input is terminated")

# The result is the same for all APIs.
assert np.allclose(output_data_chunked, output_data_simple)

# See `samplerate.CallbackResampler` for the Callback API, or
# `examples/play_modulation.py` for an example.
```
