Updated to flattened qcio 0.3.0 models
coltonbh committed Jul 17, 2023
1 parent 7705bd3 commit 4fb11ef
Showing 17 changed files with 161 additions and 187 deletions.
2 changes: 2 additions & 0 deletions .vscode/settings.json
@@ -2,6 +2,8 @@
"python.formatting.provider": "black",
"cSpell.words": [
"calcinfo",
"calctype",
"calctypes",
"CUDA",
"Hartree",
"htmlcov",
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),

## [unreleased]

### Changed

- Updated to use `qcio>=0.3.0` flattened models and the `SinglePointResults` object.

## [0.3.1]

### Fixed
51 changes: 15 additions & 36 deletions README.md
@@ -11,7 +11,7 @@ A library for parsing Quantum Chemistry output files into structured data object

## ☝️ NOTE

This package was originally designed to run as a standalone parser to generate `SinglePointSuccessfulOutput` and `SinglePointFailedOutput` objects parsing all input and provenance data in addition to computed output data; however, once [qcop](https://github.com/coltonbh/qcop) was built to power quantum chemistry programs the only parsing needed was for the simpler `SinglePointComputedProperties` values. There are still remnants of the original `parse` function in the repo and I've left them for now in case I find a use for the general purpose parsing.
This package was originally designed to run as a standalone parser to generate `SinglePointOutput` and `ProgramFailure` objects parsing all input and provenance data in addition to computed output data; however, once [qcop](https://github.com/coltonbh/qcop) was built to power quantum chemistry programs the only parsing needed was for the simpler `SinglePointResults` values. There are still remnants of the original `parse` function in the repo and I've left them for now in case I find a use for the general purpose parsing.

## ✨ Basic Usage

@@ -21,72 +21,51 @@ This package was originally designed to run as a standalone parser to generate `
python -m pip install qcparse
```

- Parse a file into a `SinglePointComputedProperties` object with a single line of code.
- Parse a file into a `SinglePointResults` object with a single line of code.

```python
from qcparse import parse_computed_props
from qcparse import parse_results
computed = parse_computed_props("/path/to/tc.out", "terachem")
results = parse_results("/path/to/tc.out", "terachem")
```

- The `computed` object will be a `SinglePointComputedProperties` object. Run `dir(computed)` inside a Python interpreter to see the various values you can access. A few prominent values are shown here as an example:
- The `results` object will be a `SinglePointResults` object. Run `dir(results)` inside a Python interpreter to see the various values you can access. A few prominent values are shown here as an example:

```python
from qcparse import parse_computed_props
from qcparse import parse_results
computed = parse_computed_props("/path/to/tc.out", "terachem")
results = parse_results("/path/to/tc.out", "terachem")
computed.energy
computed.gradient # If a gradient calc
computed.hessian # If a hessian calc
results.energy
results.gradient # If a gradient calc
results.hessian # If a hessian calc
computed.calcinfo_nmo # Number of molecular orbitals
results.calcinfo_nmo # Number of molecular orbitals
```

- Parsed values can be written to disk like this:

```py
with open("computed.json", "w") as f:
with open("results.json", "w") as f:
    f.write(results.json())
```

- And read from disk like this:

```py
from qcio import SinglePointComputedProperties as SPProps
from qcio import SinglePointResults
computed = SPProps.open("myresult.json")
results = SinglePointResults.open("results.json")
```

- You can also run `qcparse` from the command line like this:

```sh
qcparse -h # Get help message for cli
qcparse terachem ./path/to/tc.out > computed.json # Parse TeraChem stdout to json
qcparse terachem ./path/to/tc.out > results.json # Parse TeraChem stdout to json
```

## 🤩 Next Steps

This package is integrated into [qcop](https://github.com/coltonbh/qcop). This means you can use `qcop` to power your QC programs using standard input data structures in pure Python and get back standardized Python output objects.

```python
from qcop import compute
from qcio import Molecule, SinglePointInput
molecule = Molecule.open("mymolecule.xyz")
sp_input = SinglePointInput(
molecule=molecule,
program_args={
"calc_type": "gradient", # "energy" | "gradient" | "hessian"
"model": {"method": "b3lyp", "basis": "6-31gs"},
"keywords": {"restricted": True, "purify": "no"} # Keywords are optional
})
# result will be SinglePointSuccessfulOutput or SinglePointFailedOutput
result = compute(sp_input, "terachem")
```

## 💻 Contributing

If there's data you'd like parsed from output files, please open an issue in this repo explaining the data items you'd like parsed and include an example output file containing the data, like [this](https://github.com/coltonbh/qcparse/issues/2).
4 changes: 2 additions & 2 deletions docs/dev-decisions.md
@@ -6,7 +6,7 @@

## UPDATE DESIGN DECISION:

- I don't see a strong reason for making this package a standalone package that parses everything required for a `SinglePointSuccessfulResult` object including input data, provenance data, xyz files, etc... While the original idea was to have a cli tool to run on TeraChem files, now that I've built my own data structures and driver program, there's no reason to parse anything but `SinglePointComputedProperties` values because we should just be driving the programs with `qcop/qcpilot`. So why waste time parsing a bunch of extra data? I've left the original `parse` function and some basic `cli` functionality in case I change my mind, but perhaps I just strip this down to the bare bones and K.I.S.S? The only downside would be walking into someone else's old data and wanting to slurp it all in, but perhaps there's no reason to build for that use case now... Just go with SIMPLE and keep the code maintainable. All the logic for parsing inputs and handling failed computations was making the package quite complex (cases where .xyz file not available, or determining if output was a success/failure). This should be the SIMPLEST package of the `qc` suite, yet it was becoming the most complex and difficult to reason about.
- I don't see a strong reason for making this package a standalone package that parses everything required for a `SinglePointSuccessfulResult` object including input data, provenance data, xyz files, etc... While the original idea was to have a cli tool to run on TeraChem files, now that I've built my own data structures and driver program, there's no reason to parse anything but `SinglePointComputedProps` values because we should just be driving the programs with `qcop/qcpilot`. So why waste time parsing a bunch of extra data? I've left the original `parse` function and some basic `cli` functionality in case I change my mind, but perhaps I just strip this down to the bare bones and K.I.S.S? The only downside would be walking into someone else's old data and wanting to slurp it all in, but perhaps there's no reason to build for that use case now... Just go with SIMPLE and keep the code maintainable. All the logic for parsing inputs and handling failed computations was making the package quite complex (cases where .xyz file not available, or determining if output was a success/failure). This should be the SIMPLEST package of the `qc` suite, yet it was becoming the most complex and difficult to reason about.

## Basic Architectural Overview and Program Flow

@@ -21,7 +21,7 @@

1. Create a file in `parsers` named after the quantum chemistry program, e.g., `qchem.py`.
2. Create `class FileType(str, Enum)` in the file registering the file types the parsers support.
3. If `stdout` is a file type then create a `def get_calc_type(string: str) -> CalcType` function that returns the `CalcType` for the file. One of `CalcType.energy`, `CalcType.gradient`, or `CalcType.hessian`.
3. If `stdout` is a file type then create a `def get_calctype(string: str) -> CalcType` function that returns the `CalcType` for the file. One of `CalcType.energy`, `CalcType.gradient`, or `CalcType.hessian`.
4. Create simple parser functions that accept file data and an output object. The parser should parse a single piece of data from the file and set it on the output object at its corresponding location found on the `qcio.SinglePointOutput` object. Register this parser by decorating it with the `@parser` decorator. The decorator must declare `filetype` and can optionally declare `required` (`True` by default), `input_data` (`False` by default), and `only` (`None` by default). See the `qcparse.decorators` for details on what these mean.
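
As a rough illustration of steps 2 through 4 (not part of this commit), a new program module might look something like the sketch below. Everything here is a stand-in: the `parser` decorator and `REGISTRY` list are hypothetical local substitutes for whatever `qcparse.decorators` actually provides, `CalcType` is redefined locally rather than imported, `SimpleNamespace` stands in for the real output object, and the `FINAL ENERGY:` pattern and the keyword checks in `get_calctype` are assumed output formats chosen only for the example.

```python
import re
from enum import Enum
from types import SimpleNamespace


class CalcType(str, Enum):
    """Local stand-in for the CalcType enum named in step 3."""

    energy = "energy"
    gradient = "gradient"
    hessian = "hessian"


class FileType(str, Enum):
    """Step 2: file types this hypothetical program module supports."""

    stdout = "stdout"


# Hypothetical registry; the real package may track parsers differently.
REGISTRY = []


def parser(filetype, required=True, input_data=False, only=None):
    """Stand-in decorator that mirrors the options listed in step 4."""

    def decorator(func):
        REGISTRY.append(
            {
                "func": func,
                "filetype": filetype,
                "required": required,
                "input_data": input_data,
                "only": only,
            }
        )
        return func

    return decorator


def get_calctype(string: str) -> CalcType:
    """Step 3: guess the calculation type from stdout (assumed keywords)."""
    text = string.lower()
    if "hessian" in text:
        return CalcType.hessian
    if "gradient" in text:
        return CalcType.gradient
    return CalcType.energy


@parser(filetype=FileType.stdout)
def parse_energy(string: str, output) -> None:
    """Step 4: parse one value and set it on the output object."""
    match = re.search(r"FINAL ENERGY:\s*(-?\d+\.\d+)", string)
    if match is None:
        raise ValueError("Could not find an energy in the file contents")
    output.energy = float(match.group(1))


# Run every registered stdout parser against a sample line of output.
sample = "FINAL ENERGY: -76.3861099088 a.u.\n"
output = SimpleNamespace()
for entry in REGISTRY:
    if entry["filetype"] == FileType.stdout:
        entry["func"](sample, output)
```

Keeping each parser responsible for a single value means a missing or malformed field fails in one small function, which is in line with the K.I.S.S. goal stated in the design decision above.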

```py