Spectra calculation with multiprocessing #2

Open
carueda opened this issue Jun 17, 2023 · 0 comments

This is in principle just a nice-to-have feature, as the program is typically launched multiple times simultaneously, one instance per day to be processed (be it on a machine with multiple cores, across multiple machines in the cloud, etc.), so the parallelization aspect is pretty much already covered.

However, there is still parallelization that can be implemented within a single day, in particular to compute the spectra for the minute segments in the day, which is a pleasingly parallel workload. Covering this single-day use case in a performant manner (taking advantage of the multiple cores in a modern computer) would facilitate testing, verification, and tuning of parameters or metadata attributes before launching the processing of multiple days.

A possible strategy that requires little change to the current implementation:

  • Use multiprocessing's RawArray (or similar) to allocate shared memory for the audio segments to be processed
  • Use SoundFile's buffer_read_into to load the audio segments into the shared memory
  • Use a typical Pool (or similar) strategy to dispatch the processing of multiple audio segments in parallel, then gather the resulting spectra.

One possible alternative/complementary approach is to use Dask.
