Increase expected size of output sample buffer, to correctly handle input termination. #22

Open

arlofaria-cartesia wants to merge 2 commits into master

Conversation

arlofaria-cartesia

Problem: when using the full API (the `.process()` method in this wrapper), the number of output samples is incorrect for a chunked input at the end of a stream. The calculation of `new_size` assumes that the number of output samples generated will be at most the number of input samples times the resampling ratio. That holds for chunks processed mid-stream (and at the start of a stream), but when the input is terminated the output can exceed this estimate, because the resampler's internal buffer is flushed of any remaining data.
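
For illustration, the size estimate at issue behaves like this (a simplified sketch of the idea, not the wrapper's actual source; the function name is hypothetical):

```python
def estimated_output_frames(num_input_frames: int, ratio: float) -> int:
    # Upper bound on output frames that holds for mid-stream chunks,
    # but NOT for the final call: end_of_input=True also flushes the
    # converter's internal buffer, which can produce extra frames.
    return int(num_input_frames * ratio)
```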

Solution: an ideal solution would be to determine the actual number of samples that will be flushed on an input-terminating call. That is complicated to do, for reasons related to issues such as libsndfile/libsamplerate#6.

As a hacky workaround, we add a "fudge factor" of 10,000 samples, which ought to be sufficient in practice. In case that's not enough, a RuntimeError is raised.
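
In sketch form, the workaround amounts to something like this (names here are illustrative, not the actual diff):

```python
FUDGE_FRAMES = 10_000  # headroom for frames flushed at end_of_input=True

def padded_output_frames(num_input_frames: int, ratio: float) -> int:
    # Mid-stream upper bound plus headroom for the end-of-input flush.
    return int(num_input_frames * ratio) + FUDGE_FRAMES

def check_output_fits(frames_generated: int, buffer_frames: int) -> None:
    # If even the padded buffer turns out to be too small, fail loudly
    # rather than silently truncating the flushed output.
    if frames_generated > buffer_frames:
        raise RuntimeError("output buffer too small; increase the fudge factor")
```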

Testing: here's a modified version of the example code from the README. Perhaps this chunked usage should be included in the documentation?

```python
#!/usr/bin/env python3

import numpy as np
import samplerate

# Synthesize data
fs = 1000.
t = np.arange(fs * 2) / fs
input_data = np.sin(2 * np.pi * 5 * t)

# Simple API
ratio = 1.5
converter = 'sinc_best'  # or 'sinc_fastest', ...
output_data_simple = samplerate.resample(input_data, ratio, converter)

# Full API
resampler = samplerate.Resampler(converter, channels=1)
output_data_full = resampler.process(input_data, ratio, end_of_input=True)

# The result is the same for both APIs.
assert np.allclose(output_data_simple, output_data_full)

# Full API, chunked
resampler.reset()
output_data_chunked = np.array([], dtype=np.float32)
chunk_size = 100
for i in range(0, len(input_data), chunk_size):
    chunk = input_data[i : i+chunk_size]
    resampled = resampler.process(chunk, ratio, end_of_input=False)
    print(f"{len(chunk)} input samples --> {len(resampled)} output samples")
    output_data_chunked = np.concatenate((output_data_chunked, resampled))
print(f"{len(output_data_chunked)} output samples before input is terminated")

# Terminate with an empty final chunk.
resampled = resampler.process([], ratio, end_of_input=True)
print(f"input terminated --> {len(resampled)} output samples")
output_data_chunked = np.concatenate((output_data_chunked, resampled))
print(f"{len(output_data_chunked)} output samples after input is terminated")

# The result is the same for all APIs.
assert np.allclose(output_data_chunked, output_data_simple)

# See `samplerate.CallbackResampler` for the Callback API, or
# `examples/play_modulation.py` for an example.
```
