Prefer full-text query match on ordering #34
Comments
Hi @jmargatan, you are absolutely right. There is an open issue on the fuzzy search module used here: mattyork/fuzzy#3. I am sure they would accept a pull request implementing this. |
@janraasch: I glanced through mattyork/fuzzy's code. The current aggressive/greedy matching strategy would need major refactoring to gain full-text capability, and this may not align with the direction of the project, "1k standalone fuzzy search / fuzzy filter". After digging into this, I think the best way to achieve a full-text search will require a dynamic programming approach, commonly referred to as the LCS problem. It won't be as fast as greedy matching ("fast" is a relative term), but we can tweak it a little by caching intermediate results. As I started writing my own library I bumped into jeancroy/fuzzaldrin-plus. I read through the description and it looks promising; I haven't looked at the code yet, but it seems to align with what we want to achieve. Would you be open to checking whether this would be a good alternative? Thanks in advance. |
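For readers following along, here is a minimal sketch of the dynamic programming idea mentioned in the comment above. The comment does not say which LCS variant is meant; this sketch assumes the longest common *substring* (a contiguous run), since that is the variant that rewards a full-text match. The function name is hypothetical and is not part of mattyork/fuzzy or any other library discussed here.

```javascript
// Length of the longest contiguous run of characters shared by
// `query` and `text`, compared case-insensitively.
// Classic O(|query| * |text|) dynamic programming, keeping two rows.
function longestCommonSubstringLength(query, text) {
  const q = query.toLowerCase();
  const t = text.toLowerCase();
  let best = 0;
  // prev[j] = length of the common run ending at q[i - 2] and t[j - 1]
  let prev = new Array(t.length + 1).fill(0);
  for (let i = 1; i <= q.length; i++) {
    const curr = new Array(t.length + 1).fill(0);
    for (let j = 1; j <= t.length; j++) {
      if (q[i - 1] === t[j - 1]) {
        // Extend the run that ended one character earlier in both strings.
        curr[j] = prev[j - 1] + 1;
        if (curr[j] > best) best = curr[j];
      }
    }
    prev = curr;
  }
  return best;
}
```

A candidate whose run length equals the query length contains the query verbatim, so such a score could be used to rank full-text matches above scattered fuzzy matches.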
Hi @jmargatan. Thank you for your comment. I appreciate that you have not forgotten about this. I would like to reply with two things.
|
Hi @jmargatan, I finally remember the other js-library I really wanted to use for this, but could not at the time. Now, once krisk/Fuse#6 lands in a non-beta release of https://github.com/krisk/Fuse I think this will fix our issue. |
Alright, good news. This seems to work. Would you be willing to test this? I released a beta for you to check it out: https://github.com/janraasch/tab-ahead/releases/tag/1.3.0-beta1. You can download the I would love to get your feedback on this, @jmargatan. |
Woohoo. I'll give it a try once I get home tonight. Thanks @janraasch! |
@janraasch I wasn't able to test it. I disabled the original one and uploaded the beta version as you suggested, but Alt+T does not work and the search box itself does not produce any matches. Thoughts? |
hi @jmargatan, I'm sorry. You're right. There is some problem when packaging the new dependencies. Don't worry, I'll figure it out and release another beta. Thank you for being willing to test this. |
Alright, can you please try again with https://github.com/janraasch/tab-ahead/releases/tag/v1.3.0-beta2. Also, you do not have to uninstall the original version from the app store; the two can run alongside each other, so you can really compare the results. Finally, the keyboard shortcut can be configured on the Let me know when you get around to having a look at this, @jmargatan, |
@janraasch, I think this new one is much better. Thanks for addressing this issue, I really appreciate it. Also thanks for pointing out the keyboard shortcut. However, after uploading the beta, the Alt-T shortcut stopped working. I have yet to figure that out. |
Very well then, I'll cut a release once the version of |
Hi, author of fuzzaldrin-plus here, sorry for being late to the party.
Actually, both work mostly the same; the main difference is that there's a second match context in the file name.
Yes, it was written as a substitute, in order to address user reports in the Atom text editor.
fuzzaldrin = require("fuzzaldrin-plus")
// Filter a list of candidate strings
candidates = ...
query = ...
results = fuzzaldrin.filter(candidates, query)
// Filter a list of candidate objects
results = fuzzaldrin.filter(candidates, query, {key: 'name'})
I don't really like this syntax, but it was what was required for Atom. I may provide a function that wraps matches inside specified start/end tags.
Fuse is based on edit distance, and is thus much better at comparing words with each other (e.g. spellcheck) than at expanding a shortcut into a large result. I see that v2 of Fuse tokenizes the candidate into multiple words, then does the comparison over each, summing the results. That approach may work well, up to the point where the user types a bit of this word and a bit of that word. You'll also see in the screenshot that every email address part is compared to |
@janraasch: The current solution is fantastic, but some aspects of the aggressive matching still hurt the performance. Please refer to the example below. What are your thoughts on it? :) Thanks for the insight on jeancroy/fuzzaldrin-plus, @jeancroy; it would be interesting to compare the results of these two. |
@jeancroy, thank you very much for your detailed explanation. I am sorry for answering so late, but I am quite busy, as I am sure we all are (open source... :)). @jmargatan Could you elaborate a little bit on what you mean by
I am sorry, but I am not sure what you mean just by looking at the screenshot. And, lastly:
Definitely, I agree. I will make sure to do this. (Now that I know the API ;)) |
@janraasch Certainly. :) In the screenshot, I was trying to illustrate that there are 2 candidates: the first one does not contain the full text, "yahoo mail", but the second one does, at the end. In this case, I would assume the second candidate should have been ranked higher. At least, from an LCS perspective. What are your thoughts on it? |
Just to clarify is that screenshot taken using the beta version or the one currently available on the web store? |
This is from the Chrome Web Store version. I am not aware of a way to update a Chrome Extension. I assume I am on the latest version. Although the Web Store does say "Updated: November 16, 2015". |
Ok. |
Made this demo to toy with the project; hopefully it clarifies the API/usage a bit too. |
@jeancroy: Whoa this is cool! Let's see what @janraasch thinks. :) |
IMHO the |
@jeancroy: For your examples you seem to be only searching in the
var tabs = [{title: 'my-title', url: 'my-url'}]
var options = {
keys: ['url', 'title']
}
var f = new Fuse(tabs, options);
var result = f.search('my'); |
Alright, so here https://github.com/janraasch/tab-ahead/releases/tag/v1.3.0-beta3 is the current version built from You can download the Could you take this for a spin, @jmargatan? Maybe compare the results with your screenshot from this comment #34 (comment). |
@janraasch Thanks for packing up this build. I tested with both scenarios: From the second screenshot, the ranking does improve although the highlighted characters are rather confusing, note that the |
Ok, one little caveat as this discussion is getting quite long: We have to be a little bit careful not to overestimate the meaning of a few handpicked samples, because there are currently about Which is why I would say even though
this is still a better solution, because
so I would like to
What do you say? |
There's the API to search for a specific key:
fuzzaldrin.filter(candidates, query, {key:"title"})
There's not yet an API to search several of them. I'm thinking of averaging the score over the search space, maybe letting the user specify a weight for every property. Another option is to select the best field and still weight by which field it came from. I may add it; I'm a bit busy now and, most importantly, I'm now being super careful with updating that project because Atom has so many users. It is still possible to achieve this goal using fuzzaldrin.score(candidate, query) inside a double loop: foreach items / foreach indexedFields
I guess it depends on the complexity of the search text as well as the expectations of the user.
That sounds like a plan. I can ping back if the multiple-field feature is implemented. Or, more cleanly, you can open/subscribe to an issue that asks for it. |
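The double loop described above could look roughly like the following sketch. It takes the "select the best field" option and leaves out per-field weights. To stay self-contained, the scoring function is passed in as a parameter: `scoreFn` stands in for `fuzzaldrin.score(candidate, query)`, and the sketch assumes that a higher score is better and that 0 means no match. `rankByFields` and `toyScore` are hypothetical names used only for illustration.

```javascript
// Rank items by the best score among several string fields.
// scoreFn(candidate, query) is a stand-in for fuzzaldrin.score;
// assumption: higher is better and 0 means "no match".
function rankByFields(items, query, fields, scoreFn) {
  const scored = [];
  for (const item of items) {          // foreach items
    let best = 0;
    for (const field of fields) {      // foreach indexed fields
      const value = item[field];
      if (typeof value !== 'string') continue;
      best = Math.max(best, scoreFn(value, query));
    }
    if (best > 0) scored.push({ item: item, score: best });
  }
  // Highest score first.
  scored.sort((a, b) => b.score - a.score);
  return scored.map((entry) => entry.item);
}

// Toy scorer for demonstration only (NOT the fuzzaldrin algorithm):
// a substring hit scores higher the more of the candidate it covers.
function toyScore(candidate, query) {
  return candidate.toLowerCase().includes(query.toLowerCase())
    ? query.length / candidate.length
    : 0;
}
```

With the real library, one would presumably pass `fuzzaldrin.score` in place of `toyScore`, for example `rankByFields(tabs, query, ['title', 'url'], fuzzaldrin.score)`.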
@jeancroy thank you very much for the detailed insight into With your comment
I am now even more confident to release the |
I agree with you both. Thank you both for looking into this. |
Just released |
Consider the following use case.
Tab Opens:
When the query is gmail, the results are ordered as such (bold indicates highlighted letters):
The proposal is to return the tab with the full-text match on top of the fuzzy results; here is how it looks:
@janraasch: What do you think? :)
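One way to picture the proposal is as a post-processing step over an already fuzzy-ranked list: results that contain the query verbatim move to the front, and everything else keeps its fuzzy order. This is only a sketch of the idea, not how tab-ahead or any of the libraries discussed implement it; the function name is hypothetical.

```javascript
// Given titles already ordered by a fuzzy matcher, move those that
// contain the query as a contiguous, case-insensitive substring to
// the front. Relative order within each group is preserved.
function preferFullTextMatches(rankedTitles, query) {
  const needle = query.toLowerCase();
  const fullText = [];
  const fuzzyOnly = [];
  for (const title of rankedTitles) {
    if (title.toLowerCase().includes(needle)) {
      fullText.push(title);
    } else {
      fuzzyOnly.push(title);
    }
  }
  return fullText.concat(fuzzyOnly);
}
```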