Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mapping with no element change fails #61

Open
pbuslaev opened this issue Dec 16, 2024 · 5 comments
Open

Mapping with no element change fails #61

pbuslaev opened this issue Dec 16, 2024 · 5 comments

Comments

@pbuslaev
Copy link

I was playing a bit with Lomap and noticed a strange behaviour when MCS is computed with element_change=False. When called with c1ccccc1 and c1ccncc1 (So one carbon in a ring changed to a nitrogen), Lomap can't find a mapping (see the code below for reproduction).

from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdFMCS

m1 = Chem.MolFromSmiles("c1ccccc1")
m2 = Chem.MolFromSmiles("c1ccncc1")
m1 = Chem.AddHs(m1)
m2 = Chem.AddHs(m2)
AllChem.EmbedMolecule(m1)
AllChem.EmbedMolecule(m2)

mcs = rdFMCS.FindMCS(
    [Chem.RemoveHs(m1), Chem.RemoveHs(m2)],
).smartsString
substr = Chem.MolFromSmarts(mcs)

m1_idx_list = Chem.RemoveHs(m1).GetSubstructMatch(substr)
m2_idx_list = Chem.RemoveHs(m2).GetSubstructMatch(substr)
heavy_atom_mapping = [(i, j) for i, j in zip(m1_idx_list, m2_idx_list)]
rdMolAlign.AlignMol(m1, m2, atomMap=heavy_atom_mapping)

print(rdMolAlign.CalcRMS(m1, m2, map=[heavy_atom_mapping]))

from lomap import MCS

mcs = MCS(m1, m2, threed=True, element_change=False)
print(list(mcs))

As far as I can tell, this is due to the logics in these lines. So, when element_change is expexted, the RdKit MCS search is called with completeRingsOnly=True and atomCompare=rdFMCS.AtomCompare.CompareElements flags. Since N != C, there are no rings, and no atoms matched. I am not sure that this is the expected behaviour. Maybe, a slightly different logics is worth considering:

If element_change == False, we run a standard MCS, but then we remove atoms with different mass from the mapping. Also some hydrogen mapping should be removed. If you think this is a potential way to go, I can work on the implementation.

@IAlibay
Copy link
Member

IAlibay commented Dec 17, 2024

Thanks for raising this issue @pbuslaev - we'll have a deeper look at this most likely after the winter break.

A very rough initial question is that it seems like you are expecting a mapping that allows for ring breaking. Is this intentional / needed? Generally ring breaking leads to undefined behaviour in the types of hybrid topology schemes that Lomap tends to be used in - so it's not a use case that gets a lot of attention.

@pbuslaev
Copy link
Author

I am quite sure that benzene to pyridine is not a ring opening change. I would say that in general, mutation of aromatic carbon to aromatic nitrogen is a typical change. I should say here, that maybe the behaviour is correct, and if you can not map all atoms of the ring, you should not give the mapping. It is just not what I expected from parameter naming and documentation. I expected 5 carbons to be matched, when element_change=False. For some practical reasons, this kind of mapping can be expected by the user. Also, this flag is sort of recommended by OpenFE tutorial. So, I was just testing how it behaves and encountered something that I was not expecting.

@IAlibay
Copy link
Member

IAlibay commented Dec 18, 2024

I am quite sure that benzene to pyridine is not a ring opening change.

I'm not sure I understand what you mean here @pbuslaev - if you disallow element changes, you explicitly say that you cannot map a C->N transformation. This means that in the case of a benzene to pyridine change you would be attempting to map a ring break, because there would be at least one atom unmapped in the ring.

I expected 5 carbons to be matched, when element_change=False

This is what I mean by the above, if you only map 5 carbons, you effectively return a ring break mapping. This would not work with various hybrid topology schemes, including the one we provide in openfe.

Also, this flag is sort of recommended by OpenFE tutorial

This is likely an oversight. Nowadays the openfe protocol does handle element changes and it is our default (or at least it should be if it's not everywhere). I'll open up an issue upstream to fix that.

@pbuslaev
Copy link
Author

This is what I mean by the above, if you only map 5 carbons, you effectively return a ring break mapping. This would not work with various hybrid topology schemes, including the one we provide in openfe.

I understand your point, but at least from the documentation I got a different understanding of the element_change flag. So the actual and I guess the desired behaviour is when the flag is on, benzene can't be mapped to pyridine. While my understanding was that one of the carbons won't be mapped to a nitrogen.

What I was trying to do, is to compute how many element change mutations I have in the mapping. I can easily do it downstream (by checking if atom masses of mapped atoms are the same or not), but I hoped for a nicer solution by mapping with element_change flag on and off. I got some strange results, and reported it. So, as I mentioned before, maybe the desired behaviour can be simply documented in more detail and this is it?

At the same time I should say, that there are schemes out there, where 5-atom mapping for the situation I described can be useful, but again, it is easy to get eat downstream

@IAlibay
Copy link
Member

IAlibay commented Dec 18, 2024

Ah I think I understand what you mean here @pbuslaev, thanks for the further explanation!

I agree that better documentation would be nice (or actually any.. Lomap is very poorly documented currently).
I do wonder if your use case might be better handled with another tool, maybe something like Kartograf, but it would require a bit more thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants