Can ANARCI support hmm search with a specific chain? #74
Comments
If you want only heavy chain you could do
It seems like this is a bug. The reason is that chain_type in number_sequences_from_alignment is not enforced. When I set allow to set(["L"]), it returns both "K" and "L" based on the code:
In my case the "K" hit happens to be the one with the higher bit_score, so it returns the "K" result instead of the "L" result I specified. Could this be fixed? Thanks.
The K and L stand for the kappa and lambda chains, which are both light chains (see https://en.wikipedia.org/wiki/Immunoglobulin_light_chain ). I am not a developer, by the way.
I checked further and I think I found the root cause. My sequence matches both K and L; K is the best hit and L is the second hit. _parse_hmmer_query returns "top_descriptions". However, the chain_type is set to the chain type of the top hit. I.e., in my case, although

Later in number_sequences_from_alignment, we have

The L-chain hit came second to the K-chain hit, and its chain type was not used. So the L-chain hit did not survive the "allow" filter. To me, it seems one solution is the one in my first post, i.e., to apply the "allow" logic within _parse_hmmer_query, so that the "K" top hit is filtered out. Then the "L" hit can survive.
Or the documentation should say that "K" and "L" should not be used alone; they must be used together as set(["K","L"]). I am doing that as a workaround. Thanks.
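For anyone following along, here is a minimal sketch of the ordering problem described in the root-cause comment. It is not ANARCI's actual code: the hit records, the "species_chain" id format, and both function names are made up for illustration. It only models "pick the top hit, then apply allow" versus "apply allow, then pick the top hit".

```python
# Hypothetical hit records; ANARCI's real internal structures differ.
hits = [
    {"id": "human_K", "bitscore": 120.0},  # best hit
    {"id": "human_L", "bitscore": 110.0},  # second-best hit
]


def chain_of(hit):
    # Assumption: HMM ids look like "<species>_<chain>".
    return hit["id"].rsplit("_", 1)[1]


def top_hit_then_filter(hits, allow):
    """Choose the best-scoring hit first, then check it against `allow`."""
    if not hits:
        return None
    best = max(hits, key=lambda h: h["bitscore"])
    return best if chain_of(best) in allow else None


def filter_then_top_hit(hits, allow):
    """Drop disallowed chain types first, then choose the best remaining hit."""
    allowed = [h for h in hits if chain_of(h) in allow]
    return max(allowed, key=lambda h: h["bitscore"]) if allowed else None


# The L hit is lost when the K top hit is chosen before filtering.
print(top_hit_then_filter(hits, {"L"}))
# The L hit survives when the filter is applied first.
print(filter_then_top_hit(hits, {"L"}))
# The workaround: allowing both light-chain types keeps the K top hit.
print(top_hit_then_filter(hits, {"K", "L"}))
```

In this toy model, allowing only "L" with the first ordering gives no result at all, while the second ordering returns the L hit, which matches the fix suggested above.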
We often need to explicitly search for the CDRs of a heavy chain or of a light chain. E.g., when we have a nanobody sequence (with heavy and light chain fused), we don't want to get heavy chain CDRs or light chain CDRs at random. We would prefer to use ANARCI to first scan for heavy chain CDRs, and then scan for light chain CDRs.
This means we should modify the following function to consider an optional chain type argument.
_parse_hmmer_query(query, bit_score_threshold=80, hmmer_species=None, chain_type=None):
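To make the request concrete, here is a rough sketch of how an optional chain_type argument could behave. This is a stand-in, not a patch: select_hits is a hypothetical helper, the (hmm_id, bit_score) tuples and the "species_chain" id format are assumptions, and whether the real threshold comparison is inclusive is also an assumption.

```python
def select_hits(hits, bit_score_threshold=80, chain_type=None):
    """Return hits at or above the threshold, best first.

    hits: list of (hmm_id, bit_score) tuples (hypothetical format).
    chain_type: optional set of chain letters; None keeps every chain type.
    """
    kept = []
    for hmm_id, score in hits:
        if score < bit_score_threshold:
            continue
        # Assumption: the chain letter is the part after the last underscore.
        chain = hmm_id.rsplit("_", 1)[-1]
        if chain_type is not None and chain not in chain_type:
            continue
        kept.append((hmm_id, score))
    return sorted(kept, key=lambda h: h[1], reverse=True)


# Made-up scores for a sequence containing both a heavy and a light domain.
hits = [("human_H", 150.0), ("human_K", 120.0), ("human_L", 110.0), ("mouse_A", 40.0)]

heavy = select_hits(hits, chain_type={"H"})        # first pass: heavy chain only
light = select_hits(hits, chain_type={"K", "L"})   # second pass: light chain only
print(heavy)
print(light)
```

Leaving chain_type=None as the default would keep the current behaviour for existing callers, while the two calls at the end show the heavy-then-light scan we have in mind.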
Thanks for the consideration.