-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove()
to complement add()
?
#30
Comments
So the main complication is that a lot of data is matched by position index in array. So a temporary removal could bethis.source[idx] = null Then a lot of null check and invalid check. A more complete removal would require to map all old position to new position. The add() method I can see useful where a client is being drip fed documents. In any case, I'm not sure the benefit of having this in library vs a soft delete by the user of library. I'm also working on the assumption that the index can be rebuilt on demand almost instantaneously, I just think it's a waste to recompute it on each search (but many library do pay that cost). |
As I said right now I gravitate invalidate & rebuild cache. |
Hello, @jeancroy I LOVE your library! It's fantastic. So much better than the many other popular fuzzy search libraries.
A lot of your writing on the theory and algorithms implementation goes over my head and I don't have time to get into that, but I love that it's there. The mere presence of extensive intellectual stuff gives me extra confidence in your library, plus I can tell you really value the things you create, and aim to produce stuff that works well. So I give your library many stars! ⭐ ⭐ ⭐ ⭐ ⭐ ⭐
I also LOVE how you have an add method, it fits my use case perfectly. My use case is adding documents (and queries, but they are shorter, and are easier to handle), to an index inside of a desktop app that archives your browsing history so you can read it when you are offline. I recently added full text search to this (using flexsearch and ndx), but while these are good (and serializedable, to save time on restarting the app), they do not support typos. So I went searching for a fuzzy library, and after a few false starts, ended up here (actually via the GitHub search, not by npm search nor by Google search). I'm really thankful I did end up here.
Even if you can't implement any of these things (such as serialization, or remove), I am still immensely grateful for your library. Thank you for crafting such a thing of beauty and releasing into the world for all to use!
Anyway, I'd like there to be a remove, but I know it's not your job to make it, and I am sure you are in any case super busy. But let me just describe it here for the sake of the issue.
If I added a document to the index using
add()
, and then laterremoved()
that document from the index, the document would not show up in search results, nor in serializations (such as they may be) of the index.My particular motivation for this feature is that very often, a webpage (with the same URL) will add additional content (such as comments on a story), and my archiver would like to index those so that the person using the archiver can get (and search for) the most up-to-date version of the content they have seen.
Right now my solution is simply to discard all fz-search results that are older duplicates, but I am worried about the scalability of this, so have delayed deploying it to production. This is because people may re-index a page on the change of its content many times, but if I never remove those pages from the fuzzy search index, my index will end up dominated by pages that I no longer need and are out of date. So you can see I'm very scared of this possibility! It would be terrible!
But it's OK. If that's the case that you cannot implement a remove (or even serialization--I commented on another issue #21 ), I will still use your wonderful fz-search library for providing typeahead and suggestion for search queries (just like Google search), rather than on results. It will be a little sad to not be able to search with typos, but I'm sure I can figure out another way to do it somehow.
Please delete this issue if you already have this functionality (how embarrassing! 😱 )
The text was updated successfully, but these errors were encountered: