Skip to content

Latest commit

 

History

History
48 lines (32 loc) · 3.81 KB

README.md

File metadata and controls

48 lines (32 loc) · 3.81 KB

Pauling File Distinct Phases

DOI

Background and definition

As known, the standard unit of the Pauling File (MPDS) data is an entry. All the entries are subdivided into three kinds: crystalline structures, physical properties, and phase diagrams. They are called S-, P- and C-entries, correspondingly. All these entries have persistent identifiers (similar to DOIs), e.g. S377634, P600028, or C100027. These three kinds of entries are grouped together into the distinct phases they belong. Consider the following example of entries vs. distinct phases. For the titanium dioxide, there exist the following distinct phases: rutile with the space group 136 (say we, phase_id 1), anatase with the space group 141 (phase_id 2), and brookite with the space group 61 (phase_id 3). Then the crystalline structures (S-entries) and the physical properties (P-entries) for the titanium dioxide will refer to the distinct phases either 1, or 2, or 3, and the phase diagrams (C-entries) for the Ti-O system will ideally contain (i.e. refer to) all the distinct phases 1, 2, and 3 simultaneously.

The term distinct phase is often used in the alloys description, however here we apply it for all the compounds known from the scientific literature. A tremendous work was done by the Pauling File team in the past 30 years to manually distinguish about 200 000 inorganic materials phases, appeared at least once in the literature. Each phase has a unique combination of (a) chemical formula, (b) space group, and (c) Pearson symbol. Each phase has the permanent integer identifier called phase_id. Using the phase id, one can unambiguously link any distinct phase at the MPDS with the URL such as https://mpds.io/phase_id/XXXX, e.g. https://mpds.io/phase_id/27712.

This repository contains the yearly releases of the Pauling File distinct phases for all the known unary and binary compounds. Please contact us if you are interested in the other compounds.

Data structure

The dumps in the release folders are in JSON format and have the following structure:

[
	{
		"id": "https://mpds.io/phase_id/5019",
		"formula": {"short": "Ge", "full": "Ge cub"},
		"spg": 227,
		"pearson": "cF8",
		"entries": 1571,
		"articles": 748
	},
...
]

The field id is the permanent URL of the particular distinct phase at the MPDS platform. Its last integer part is the phase_id. The short and full formula stand for the terse plain-text and detailed HTML description of the chemical composition, respectively. The spg is the space group number. The pearson is the Pearson symbol (note its numeric part which is a number of atoms in the standard crystalline unit cell). The entries is the number of the entries at the MPDS platform at the year of the release. Note these are only the peer-reviewed class of entries, not the machine learning or ab initio calculations. In the MPDS API the peer-reviewed class is referenced by dtype parameter equal to 1 (or MPDSDataTypes.PEER_REVIEWED). In the MPDS GUI this is given by the search keyword peer-reviewed. The articles is the number of peer-reviewed literature sources containing the particular distinct phase processed by the Pauling File team to the year of the release.

Copyright and license

Copyright 2022 Materials Phases Data System (Switzerland), NIMS Government Agency (Japan), and Materials Platform for Data Science (Estonia).

All rights reserved.

Academic usage is allowed.

Please contact us if you would like to use these data in the for-profit purposes or if you are interested in the full dataset.