-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion to default to use_binary_file=True
for KS4 sorting
#3611
Comments
This was also found to be true for Mountainsort5 which started caching the recording (ie writing to disk) before sorting which leads to great speed-ups to the sorting itself. So we see this in multiple sorters. One hypothesis that may contribute in the case of KS4 is that although they use spikeinterface objects natively they do not make use of any of our multiprocessing. Maybe their binary support has speed-ups that they have not applied to our native objects. They directly say this in the comments of their code (the comment about not using spikeinterface multiprocessing for their wrapper of a recording). |
@samuelgarcia do you want to argue for keeping the current default? I think @alejoe91 may have also slightly leaned toward the current default too? |
Given that this came up several times, I think we should have |
That's a really chill PR @guptadivyansh any interest in trying it yourself or do you want us to just deal with it? |
Sure, I'll give it a go! (excited for first contribution) But first, do others also see such a big difference in performance between the two? |
I've found that the difference in performance depends on the system I'm using. On my Mac, it didn't make a big difference, but on the HPC I use, it did! No idea why... |
Continuing the discussion from the merged PR here..
The suggestion is to make change the default value for the parameter.
The flag
use_binary_file
makes a big difference to KS4 sorting performance. Writing the binary does take a while but overall I managed to sort a ~2 hour recording in ~4 hours withuse_binary_file = True
compared to ~22 hours with the defaultuse_binary_file = False
for the same recording. So, thank you for adding this param!Workstation: Intel i7 8700K, Nvidia 3060 12GB, 64GB RAM, SSD, Windows
It would be good to hear if others are also seeing such a big performance difference.
Although as @sameulgarcia mentioned, it is weird that KS4 is so slow when passed the SI recording internally. If someone can get to the bottom of that, then perhaps we could have the best of both worlds; fast sorting without explicitly writing to disk.
In the meantime,
use_binary_file = True
might be a better defaultOriginally posted by @guptadivyansh in #3339 (comment)
The text was updated successfully, but these errors were encountered: