Inconsistent aromaticity of N-heterocycles #8

Open
bbucior opened this issue Feb 7, 2019 · 0 comments

bbucior commented Feb 7, 2019

Certain heterocyclic compounds can yield unexpected SMILES in the upstream Open Babel library. For example, the code does not correctly assign bond orders to L_13 in the ToBaCCo MOFs. This effect occurs inconsistently, sometimes leading to multiple copies of the same linker with slightly different SMILES due to different bond orders or the introduction of radical notation.
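As a rough way to spot the duplicated linkers described above, the sketch below groups SMILES strings by heavy-atom composition and reports groups that contain more than one distinct string. It is only a screening heuristic, not part of the MOFid code: the function names `heavy_atom_formula` and `find_inconsistent_linkers` are made up for illustration, the parsing is a simplified regular expression rather than a full SMILES parser, and true isomers share a composition too, so every hit still needs a manual look.

```python
import re
from collections import Counter, defaultdict

# Simplified atom matcher (an assumption, not a complete SMILES grammar):
# group 1 = element inside a bracket atom such as [nH], [O-] or [Zn]
# group 2 = organic-subset atom written without brackets
_ATOM = re.compile(
    r"\[(?:\d+)?([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]"
    r"|(Cl|Br|[BCNOPSFI]|[bcnops])"
)

def heavy_atom_formula(smiles):
    """Return a sorted tuple of (element, count) pairs, ignoring hydrogens."""
    counts = Counter()
    for bracket, organic in _ATOM.findall(smiles):
        # Aromatic atoms are lowercase in SMILES; fold them onto the element symbol
        symbol = (bracket or organic).capitalize()
        if symbol != "H":
            counts[symbol] += 1
    return tuple(sorted(counts.items()))

def find_inconsistent_linkers(smiles_list):
    """Map each heavy-atom formula to its distinct SMILES when there are several."""
    groups = defaultdict(set)
    for smi in smiles_list:
        groups[heavy_atom_formula(smi)].add(smi)
    return {formula: sorted(s) for formula, s in groups.items() if len(s) > 1}

# Aromatic and Kekule forms of imidazole land in the same group; benzene does not
candidates = find_inconsistent_linkers(["c1cnc[nH]1", "C1=CN=CN1", "c1ccccc1"])
for formula, variants in candidates.items():
    print(formula, variants)
```

Composition is used as the grouping key because it does not depend on atom ordering, ring-closure numbering, bond symbols, or bracket notation, which are exactly the parts of the string that change when bond orders or radicals are assigned differently.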

Even if the bug is fixed upstream, I suspect the presence of charged organic molecules may exacerbate the issue. Currently, framework.cpp assigns formal charges to the carboxylate and certain rings after the bond orders are detected. Aromaticity detection may be improved if the overall partial charge is assigned before running OBMol::PerceiveBondOrders, for example by using the number/location of coordinated metals.

These are some potentially relevant issues on the upstream project for reference:

bbucior added a commit that referenced this issue Feb 17, 2019
Interpreting the results of check_mof_linkers.py to establish the percent accuracy (and common sources of mismatch) between a MOF's GA/ToBaCCo "recipe" and the MOFid of the resulting structure. Most cases are exact matches, though there are some instances of either unexpected structures or incompatibility with MOFid, particularly due to the upstream bug in #8. This commit adapts some visualizations from earlier work, e.g. from Notebooks/20181211-github-refactoring/StartValidationTest/CheckToBaCCo/NewResults20190207/plot_tobacco_viz.R

I also identified an error class missing from the Python validation test: the case when sbu.cpp cannot find a MOF structure in the CIF. This case has a signature of `no_mof` in the catenation field.
bbucior added a commit that referenced this issue Feb 18, 2019
Exporting a format derived from InChIKey to provide a compact, interoperable hash in addition to the SMILES-based MOFid.  Since the organic portion is based on InChIKey, the user can search for the building blocks in specialized and even general search engines.

Note: invoking the InChI code may add substantially more warning/error messages to the log file, particularly for metal-containing linkers and cases when the valence is buggy (e.g. ZIFs and cases related to issue #8). I still need to write more diagnostic output and discuss a few decision points, such as the `DEFAULT_MOFKEY_TOPOLOGY`.