In first tests, https://github.com/google/zoekt is 2-10x faster than codesearch and degrades much more gracefully on pathological queries (queries with many potential matches).
For 1.4G of source code, zoekt writes a 1.7G index, which is a 1.21x blow-up. Our nodes currently have 22-24G used and 52-54G available, so disk-wise, we could actually switch to zoekt.
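The arithmetic behind those numbers can be checked with a small sketch. The function names are hypothetical, and the per-version estimate assumes that every additional Debian version has a corpus of similar size and the same index ratio, which has not been measured.

```go
package main

import "fmt"

// blowUp returns the index size relative to the source size.
func blowUp(sourceGB, indexGB float64) float64 {
	return indexGB / sourceGB
}

// fitsOnDisk reports whether an index of indexGB fits into availGB.
func fitsOnDisk(indexGB, availGB float64) bool {
	return indexGB <= availGB
}

// estimateIndexGB projects the index size for several Debian versions.
// Assumption: each version has roughly sourceGB of source and indexes
// at the same ratio. Real numbers may differ.
func estimateIndexGB(sourceGB, ratio float64, versions int) float64 {
	return sourceGB * ratio * float64(versions)
}

func main() {
	ratio := blowUp(1.4, 1.7) // figures from the test above
	fmt.Printf("blow-up: %.2fx\n", ratio)
	fmt.Printf("fits in 52G: %v\n", fitsOnDisk(1.7, 52))
	fmt.Printf("estimate for 3 versions: %.1fG\n", estimateIndexGB(1.4, ratio, 3))
}
```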
TODO list:
How can we keep our incremental indexing, i.e. could we store one zoekt shard per package, and/or could we merge the per-package shards into a single big shard?
zoekt by default indexes into 1 file per repository, so if we treat one Debian package as one repository, we already get cheap updates.
Which features (query keywords) would we need to drop, which could we keep with a compatibility layer?
Do we need to fork zoekt to get all the features our search result page has (context lines etc.)?
zoekt does not sort the results within a file, at least not within its own UI.
There are no context lines around matches in zoekt.
How do we get our own ranking into zoekt?
Could we use the repo/branch feature of zoekt for multiple Debian versions (e.g. sid, testing, …)?
How much extra disk space would adding other Debian versions need?
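For the incremental indexing item above, here is a minimal sketch of how one shard per package could keep updates cheap: compare the package versions that are already indexed with the current archive state, then rebuild or delete only the shards that changed. This does not call any zoekt API. `planUpdate`, the `plan` type and the package names and versions are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// plan lists the per-package shards to rebuild and the ones to delete.
type plan struct {
	Reindex []string
	Delete  []string
}

// planUpdate compares indexed (package -> version of the existing shard)
// with current (package -> version in the archive).
func planUpdate(indexed, current map[string]string) plan {
	var p plan
	for pkg, ver := range current {
		// New package or changed version: its shard must be rebuilt.
		if old, ok := indexed[pkg]; !ok || old != ver {
			p.Reindex = append(p.Reindex, pkg)
		}
	}
	for pkg := range indexed {
		// Package left the archive: its shard can be removed.
		if _, ok := current[pkg]; !ok {
			p.Delete = append(p.Delete, pkg)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(p.Reindex)
	sort.Strings(p.Delete)
	return p
}

func main() {
	indexed := map[string]string{"i3-wm": "4.12-1", "zsh": "5.2-3", "oldpkg": "1.0-1"}
	current := map[string]string{"i3-wm": "4.12-2", "zsh": "5.2-3", "newpkg": "0.1-1"}
	p := planUpdate(indexed, current)
	fmt.Println("reindex:", p.Reindex) // reindex: [i3-wm newpkg]
	fmt.Println("delete:", p.Delete)   // delete: [oldpkg]
}
```

Unchanged packages (zsh in this example) keep their existing shard, which is what makes the update cheap.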