Spectra calculation with multiprocessing #2

carueda · 2023-06-17T19:48:40Z

This is in principle just a nice-to-have feature, since the program is typically launched multiple times simultaneously, one run per day of data (on a machine with multiple cores, across multiple machines in the cloud, etc.), so parallelization across days is pretty much already covered.

However, there is still parallelization that can be implemented within a single day, in particular when computing the spectra for the minute segments in the day, which is a pleasingly parallel workload. Covering this single-day use case in a performant manner (taking advantage of the multiple cores in a modern computer) would facilitate testing, verification, and tuning of parameters or metadata attributes before launching the processing of multiple days.

A possible strategy that requires little change to the current implementation:

- Use multiprocessing's `RawArray` (or similar) to allocate shared memory for the audio segments to be processed.
- Use SoundFile's `buffer_read_into` to load the audio segments into the shared memory.
- Use a typical `Pool` (or similar) strategy to dispatch the parallel processing of multiple audio segments, and then gather the resulting spectra.
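The three steps above could be sketched roughly as follows. This is only an illustration under several assumptions, not the program's actual code: the function names (`init_worker`, `segment_spectrum`, `segment_bounds`, `load_audio`, `compute_spectra`) are hypothetical, the audio is assumed to be mono and read as `float32`, and a plain FFT magnitude stands in for whatever spectral computation the program really performs.

```python
import multiprocessing as mp

import numpy as np

# Per-process handle to the shared audio buffer (set by the pool initializer).
_shared = {}


def init_worker(raw, num_samples):
    """Pool initializer: wrap the shared buffer in a NumPy view (no copy)."""
    _shared["audio"] = np.frombuffer(raw, dtype=np.float32, count=num_samples)


def segment_spectrum(bounds):
    """Placeholder spectrum: FFT magnitude of samples [start, stop)."""
    start, stop = bounds
    return np.abs(np.fft.rfft(_shared["audio"][start:stop]))


def segment_bounds(num_samples, segment_size):
    """(start, stop) index pairs for each complete segment."""
    last_start = num_samples - segment_size
    return [(s, s + segment_size) for s in range(0, last_start + 1, segment_size)]


def load_audio(path, raw, num_samples):
    """Read mono audio from `path` directly into the shared buffer."""
    import soundfile as sf  # third-party; imported here so the rest runs without it

    view = np.frombuffer(raw, dtype=np.float32, count=num_samples)
    with sf.SoundFile(path) as f:
        return f.buffer_read_into(view, dtype="float32")  # frames actually read


def compute_spectra(raw, num_samples, segment_size, processes=None):
    """Dispatch one task per segment and gather the spectra in order."""
    bounds = segment_bounds(num_samples, segment_size)
    with mp.Pool(processes, initializer=init_worker,
                 initargs=(raw, num_samples)) as pool:
        return pool.map(segment_spectrum, bounds)
```

A caller would allocate the buffer with `mp.RawArray("f", num_samples)`, fill it with `load_audio`, and then call `compute_spectra` from under an `if __name__ == "__main__":` guard. Passing the `RawArray` through `initargs` means the workers share the audio memory instead of receiving a pickled copy of each segment; only the small index pairs and the resulting spectra cross process boundaries.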

One possible alternative/complementary approach is to use Dask.


carueda added the performance label Aug 31, 2024
