add facilities for crystal structures and polymers #33
Comments
Probably makes sense to default to MEGNet for ease of use. @sp8rks mentioned that the Liverpool group has crystal similarity measures based on an attention network that we could use. Ideally, that crystal similarity measure would be packaged on PyPI (i.e., pip-installable) and expose a function that takes two pymatgen Structure objects and returns a similarity or distance score.
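A minimal sketch of the kind of interface that would be convenient, using pymatgen's StructureMatcher as a stand-in for the actual attention-network similarity (the function name `structure_distance` and the matcher settings are illustrative assumptions, not an existing package):

```python
from pymatgen.core import Structure
from pymatgen.analysis.structure_matcher import StructureMatcher

def structure_distance(s1: Structure, s2: Structure) -> float:
    """Return a scalar dissimilarity between two pymatgen Structures.

    Placeholder implementation: RMS displacement from pymatgen's
    StructureMatcher; the Liverpool similarity model would slot in here.
    """
    matcher = StructureMatcher(primitive_cell=True, attempt_supercell=True)
    rms = matcher.get_rms_dist(s1, s2)  # (rms, max_dist) or None if no match
    return rms[0] if rms is not None else float("inf")
```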
Some places that need to change:
Probably best to start by modifying and testing the bare-bones example. This is something that a collaborator can modify without deep knowledge of the rest of the codebase.

EDIT: For evaluation metrics, we could keep the element-wise metrics and, instead of checking for a new chemical formula, check whether a new space group is represented. It could also be new space group + new number of sites (see the sketch below).
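A hedged sketch of what that space-group novelty check could look like, assuming training and candidate structures are available as pymatgen Structure objects (the function and variable names are illustrative, not existing mat_discover API):

```python
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

def novelty_labels(train_structures, candidate_structures, symprec=0.1):
    """Flag candidates whose (space group, n_sites) pair is absent from training."""
    seen = {
        (SpacegroupAnalyzer(s, symprec=symprec).get_space_group_number(), len(s))
        for s in train_structures
    }
    return [
        (SpacegroupAnalyzer(s, symprec=symprec).get_space_group_number(), len(s)) not in seen
        for s in candidate_structures
    ]
```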
Based on email discussion: Taylor brought up some great points, and I think this is an exciting project. There has been encouragement, both internal and external, to incorporate structure into the search for high-performing, novel materials, and I think this will be a timely extension of DiSCoVeR.

Weighting
For the weighting, perhaps we could use Chimera as the scalarizing function. Alternatively, I think it would be interesting (and best practice) to treat these two as separate objectives in a multi-objective optimization via, e.g., expected hypervolume improvement - in other words, a mathematically robust way of collapsing multiple objectives, in the context of the observed data, into a single number. Expected hypervolume improvement is handled implicitly by most sophisticated multi-objective optimization platforms. Another option would be an expected-improvement-style acquisition function in which the novelty proxy takes the place of the uncertainty predictions (a sketch follows below).

How do we validate performance?

Interesting idea about recognizing new motifs. The structural prototypes from AFLOW seem relevant, since they're going for a set of canonical prototypes IIUC. Some other issues related to validating performance:
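A minimal NumPy sketch of that last option, assuming we already have per-candidate performance predictions and a novelty proxy (the UCB-style form and the `novelty_weight` knob are assumptions, not anything prescribed in this thread):

```python
import numpy as np

def novelty_weighted_acquisition(pred_performance, novelty_proxy, novelty_weight=1.0):
    """Rank candidates by predicted performance plus a novelty bonus.

    Analogous to an upper-confidence-bound acquisition, but with the novelty
    proxy standing in for the model's uncertainty estimate.
    """
    pred = np.asarray(pred_performance, dtype=float)
    nov = np.asarray(novelty_proxy, dtype=float)
    # Standardize so the two terms are on comparable scales before weighting.
    pred = (pred - pred.mean()) / (pred.std() + 1e-12)
    nov = (nov - nov.mean()) / (nov.std() + 1e-12)
    return pred + novelty_weight * nov

# Example: pick the top-5 candidates by the combined score.
# scores = novelty_weighted_acquisition(y_pred, novelty)
# top5 = np.argsort(scores)[::-1][:5]
```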
Comments on the plumbing to modify for structure:

The easiest place to start testing things out is via the mat_discover bare-bones script. Today, I adapted this to use a matbench elasticity dataset with pymatgen Structures, M3GNet instead of CrabNet, and a Euclidean fingerprint-based structural distance instead of ElMD; everything else is the same. See the notebook below. When your structural distance metric of choice is ready, it can be swapped in for the fingerprint-based structural distance. After that comes the most difficult part - validation (hence Taylor's comments). Validation can proceed in a similar fashion to the original and/or include some extensions/modifications to how validation is performed.
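For reference, a fingerprint-based structural distance along these lines can be computed with matminer's site fingerprints; a sketch assuming matminer is installed (the featurizer preset and summary stats here are choices for illustration, not necessarily what the notebook uses):

```python
import numpy as np
from matminer.featurizers.site import CrystalNNFingerprint
from matminer.featurizers.structure import SiteStatsFingerprint

# Structure-level fingerprint: site fingerprints aggregated by summary statistics.
ssf = SiteStatsFingerprint(
    CrystalNNFingerprint.from_preset("ops"),
    stats=("mean", "std_dev", "minimum", "maximum"),
)

def fingerprint_distance(s1, s2):
    """Euclidean distance between structure fingerprints of two pymatgen Structures."""
    v1 = np.asarray(ssf.featurize(s1))
    v2 = np.asarray(ssf.featurize(s2))
    return float(np.linalg.norm(v1 - v2))
```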
Feature request
Requires valid distance metrics for crystal structures and polymers that encode chemo-structural novelty and polymeric novelty, respectively, as well as structure-based regression models. After that, just some basic plumbing.
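To make "structure-based regression models" concrete for the plumbing, what's needed is essentially a fit/predict interface over pymatgen Structures; a hypothetical adapter sketch (not existing mat_discover API), which a MEGNet or M3GNet wrapper could implement:

```python
from typing import Sequence
from pymatgen.core import Structure

class StructureRegressor:
    """Hypothetical adapter for a structure-based regression model so it can
    be swapped in where a composition-based model (e.g., CrabNet) is used."""

    def fit(self, structures: Sequence[Structure], targets: Sequence[float]) -> "StructureRegressor":
        raise NotImplementedError

    def predict(self, structures: Sequence[Structure]) -> Sequence[float]:
        raise NotImplementedError
```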