reacnetgenerator package#

ReacNetGenerator is an automatic reaction network generator for reactive molecular dynamics simulation.[Rb972b5c58b34-1]_.

References#

1

Jinzhe Zeng, Liqun Cao, Chih-Hao Chin, Haisheng Ren, John Z. H. Zhang, Tong Zhu, ReacNetGenerator: an automatic reaction network generator for reactive molecular dynamic simulations, Phys. Chem. Chem. Phys., 2020, 22 (2): 683-691, doi: 10.1039/C9CP05091D.

class reacnetgenerator.ReacNetGenerator(**kwargs: Any)[source]#

Bases: object

Use ReacNetGenerator for trajectory analysis.

Parameters
inputfiletype: str
The type of the input file. The following type is allowed:
inputfilename: str or list of strs

The filename(s) of the input file, which can be either relative path or absolute path. If it is a list, the files will be read in order.

atomname: tuple of strs

The list of the atomic names in the input file, such as (‘C’, ‘H’, ‘O’). It should match the order of that in the input file.

runHMM: bool, optional, default: True

Process trajectory with Hidden Markov Model (HMM) or not. If the user find too many species are filtered, they can turn off this option.

miso: int, optional, default: 0

Merge the isomers and the highest frequency is used as the representative. 0, off two available levels: 1, merge the isomers with same atoms and same bond-network but different bond levels; 2, merge the isomers with same atoms with different bond-network.

pbc: bool, optional, default: True

Use periodic boundary conditions (PBC) or not.

cell: (3,3) array_like or (3,) array_like or (9,) array_like, optional, default: None

The cell (box size) of the system. If None (default), the cell will be read from the input file. If the input file doesn’t have cell information, this parameter will be necessary.

nproc: int, optional, default: None

The number of processors used for analysis. If None (default), the program will try to use all processors.

selectatoms: str, optional, default: None

Select an element from the atomic names, such as C, and only show species with this element in the reaction network. If None (default), the network will show all elements.

split: int, optional, default: None

Split number for the time axis. For example, if set to 10, the whole trajectroy will be divided into 10 parts and reactions of each part will be shown.

a: (2,2) array_like, optional, default: [[0.999, 0.001], [0.001, 0.009]]

Transition matrix A of HMM parameters. It is recommended for users to choose their own parameters. See the paper for details.

b: (2,2) array_like, optional, default: [[0.6, 0.4], [0.4, 0.6]]

Emission matrix B of HMM parameters. It is recommended for users to choose their own parameters. See the paper for details.

Examples

>>> from reacnetgenerator import ReacNetGenerator
 >>> rng=ReacNetGenerator(inputfiletype="dump", inputfilename="dump.ch4", atomname=['C', 'H', 'O'])
 >>> rng.runanddraw()
-
class Status(value)[source]#

Bases: Enum

ReacNetGenerator status.

The ReacNetGenerator consists of several modules and algorithms to process the information from the given trajectory, including:

  • DOWNLOAD: Download trajectory from urls

  • DETECT: Read bond information and detect molecules

  • HMM: HMM filter

  • MISO: Merge isomers

  • PATH: Indentify isomers and collect reaction paths

  • MATRIX: Reaction matrix generation

  • NETWORK: Draw reaction network

  • REPORT: Generate analysis report

DETECT = 'Read bond information and detect molecules'#
DOWNLOAD = 'Download trajectory'#
HMM = 'HMM filter'#
INIT = 'Init'#
MATRIX = 'Reaction matrix generation'#
MISO = 'Merge isomers'#
NETWORK = 'Draw reaction network'#
PATH = 'Indentify isomers and collect reaction paths'#
REPORT = 'Generate analysis report'#
draw() None[source]#

Draw the reaction network, i.e. NETWORK step.

hmmfilename: str#
moleculetemp2filename: str#
moleculetempfilename: str#
originfilename: str#
report() None[source]#

Generate the analysis report, i.e. REPORT step.

resultfilename: str#
run() None[source]#

Process MD trajectory, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.

runanddraw(run: bool = True, draw: bool = True, report: bool = True) None[source]#

Analyze the trajectory from MD simulation.

Parameters
runbool, optional, default: True

Process the trajectory or not, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.

drawbool, optional, default: True

Draw the reaction network or not, i.e. NETWORK step.

reportbool, optional, default: True

Generate the analysis report, i.e. NETWORK step.

urls: dict#

Submodules#

reacnetgenerator.commandline module#

Command line interface for reacnetgenerator.

reacnetgenerator.commandline.main_parser() ArgumentParser[source]#

Return main parser.

Returns
argparse.ArgumentParser

reacnetgenerator cli parser

reacnetgenerator.commandline.parm2cmd(pp: dict) List[str][source]#

Convert a parameter dictionary to command line arguments.

Parameters
ppdict

Parameter dictionary

Returns
List[str]

Command line arguments

reacnetgenerator.dps module#

Connect molecule with Depth-First Search.

reacnetgenerator.dps.dps(bonds, levels)#

Connect molecule with Depth-First Search.

Parameters
bondslist of list

The bonds of molecule.

levelslist of int

The levels of atoms.

Returns
list of list

The connected atoms in each molecule.

list of list

The connected bonds in each molecule.

reacnetgenerator.dps.dps_reaction(reactdict)#

Find A+B->C+D reactions.

Parameters
reactdictlist of dict of list

Two dictionaries of reactions.

Returns
list of list of list

List of reactions. The secons axis matches the position of species (left or right).

reacnetgenerator.gui module#

GUI version of ReacNetGenerator.

class reacnetgenerator.gui.GUI[source]#

Bases: object

GUI class.

gui()[source]#

Start the GUI.

property root#

Return the root of the GUI.

reacnetgenerator.gui.gui()[source]#

Open GUI version of ReacNetGenerator.

reacnetgenerator.reacnetgen module#

The main module of ReacNetGenerator.

reacnetgenerator.tools module#

Useful methods to futhur process ReacNetGenerator results.

reacnetgenerator.tools.calculate_rate(specfile: Union[str, Path], reacfile: Union[str, Path], cell: ndarray, timestep: float) Dict[str, float][source]#

Calculate the rate constant of each reaction.

The rate constants are calculated by the method developed in [1]. The time interval of the trajectory is assumed to be uniform.

Parameters
specfilestr

The species file.

reacfilestr

The reactions file.

cellnp.ndarray

The cell with the shape (3, 3). Unit: Angstrom.

timestepfloat

The time step. Unit: femtosecond.

Returns
ratesDict[str, float]

The rate of each reaction. The dict key is the reaction SMILES. The value is in unit of [(cm^3/mol)^(n-1)s^(-1)], where n is the reaction order.

References

1

Yanze Wu, Huai Sun, Liang Wu, Joshua D. Deetz, Extracting the mechanisms and kinetic models of complex reactions from atomistic simulation data, J. Comput. Chem. 40, 16, 1586-1592.

Examples

>>> cell = np.eye(3) * 3.7601e1 # in unit Angstrom
+
class Status(value)[source]#

Bases: Enum

ReacNetGenerator status.

The ReacNetGenerator consists of several modules and algorithms to process the information from the given trajectory, including:

  • DOWNLOAD: Download trajectory from urls

  • DETECT: Read bond information and detect molecules

  • HMM: HMM filter

  • MISO: Merge isomers

  • PATH: Indentify isomers and collect reaction paths

  • MATRIX: Reaction matrix generation

  • NETWORK: Draw reaction network

  • REPORT: Generate analysis report

DETECT = 'Read bond information and detect molecules'#
DOWNLOAD = 'Download trajectory'#
HMM = 'HMM filter'#
INIT = 'Init'#
MATRIX = 'Reaction matrix generation'#
MISO = 'Merge isomers'#
NETWORK = 'Draw reaction network'#
PATH = 'Indentify isomers and collect reaction paths'#
REPORT = 'Generate analysis report'#
draw() None[source]#

Draw the reaction network, i.e. NETWORK step.

hmmfilename: str#
moleculetemp2filename: str#
moleculetempfilename: str#
originfilename: str#
report() None[source]#

Generate the analysis report, i.e. REPORT step.

resultfilename: str#
run() None[source]#

Process MD trajectory, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.

runanddraw(run: bool = True, draw: bool = True, report: bool = True) None[source]#

Analyze the trajectory from MD simulation.

Parameters
runbool, optional, default: True

Process the trajectory or not, including DOWNLOAD, DETECT, HMM, PATH, and MATRIX steps.

drawbool, optional, default: True

Draw the reaction network or not, i.e. NETWORK step.

reportbool, optional, default: True

Generate the analysis report, i.e. NETWORK step.

urls: dict#

Submodules#

reacnetgenerator.commandline module#

Command line interface for reacnetgenerator.

reacnetgenerator.commandline.main_parser() ArgumentParser[source]#

Return main parser.

Returns
argparse.ArgumentParser

reacnetgenerator cli parser

reacnetgenerator.commandline.parm2cmd(pp: dict) List[str][source]#

Convert a parameter dictionary to command line arguments.

Parameters
ppdict

Parameter dictionary

Returns
List[str]

Command line arguments

reacnetgenerator.dps module#

Connect molecule with Depth-First Search.

reacnetgenerator.dps.dps(bonds, levels)#

Connect molecule with Depth-First Search.

Parameters
bondslist of list

The bonds of molecule.

levelslist of int

The levels of atoms.

Returns
list of list

The connected atoms in each molecule.

list of list

The connected bonds in each molecule.

reacnetgenerator.dps.dps_reaction(reactdict)#

Find A+B->C+D reactions.

Parameters
reactdictlist of dict of list

Two dictionaries of reactions.

Returns
list of list of list

List of reactions. The secons axis matches the position of species (left or right).

reacnetgenerator.gui module#

GUI version of ReacNetGenerator.

class reacnetgenerator.gui.GUI[source]#

Bases: object

GUI class.

gui()[source]#

Start the GUI.

property root#

Return the root of the GUI.

reacnetgenerator.gui.gui()[source]#

Open GUI version of ReacNetGenerator.

reacnetgenerator.reacnetgen module#

The main module of ReacNetGenerator.

reacnetgenerator.tools module#

Useful methods to futhur process ReacNetGenerator results.

reacnetgenerator.tools.calculate_rate(specfile: Union[str, Path], reacfile: Union[str, Path], cell: ndarray, timestep: float) Dict[str, float][source]#

Calculate the rate constant of each reaction.

The rate constants are calculated by the method developed in [1]. The time interval of the trajectory is assumed to be uniform.

Parameters
specfilestr

The species file.

reacfilestr

The reactions file.

cellnp.ndarray

The cell with the shape (3, 3). Unit: Angstrom.

timestepfloat

The time step. Unit: femtosecond.

Returns
ratesDict[str, float]

The rate of each reaction. The dict key is the reaction SMILES. The value is in unit of [(cm^3/mol)^(n-1)s^(-1)], where n is the reaction order.

References

1

Yanze Wu, Huai Sun, Liang Wu, Joshua D. Deetz, Extracting the mechanisms and kinetic models of complex reactions from atomistic simulation data, J. Comput. Chem. 40, 16, 1586-1592.

Examples

>>> cell = np.eye(3) * 3.7601e1 # in unit Angstrom
 >>> timestep = 0.1 # in unit fs
 >>> rates = calculate_rate('methane.species', 'methane.reactionabcd', cell, timestep)
-
reacnetgenerator.tools.read_reactions(reacfile: Union[str, Path]) List[Tuple[int, Counter, str]][source]#

Read reactions from the reactions file (ends with .reaction or .reactionsabcd).

For accuracy, HMM filter should be disabled.

Parameters
reacfilestr or Path

The reactions file.

Returns
occsList[Tuple[int, Counter, str]]

The number of occurences of each reaction. The tuple is (occurence, counter_reactants, reaction).

reacnetgenerator.tools.read_species(specfile: Union[str, Path]) Tuple[ndarray, Dict[str, ndarray]][source]#

Read species from the species file (ends with .species).

For accuracy, HMM filter should be disabled.

Parameters
specfilestr or Path

The species file.

Returns
step_idxnp.ndarray

The index of the step.

n_speciesDict[str, np.ndarray]

The number of species in each step. The dict key is the species SMILES.

Examples

Plot the number of methane in each step.

>>> from reacnetgenerator.tools import read_species
+
reacnetgenerator.tools.read_reactions(reacfile: Union[str, Path]) List[Tuple[int, Counter, str]][source]#

Read reactions from the reactions file (ends with .reaction or .reactionsabcd).

For accuracy, HMM filter should be disabled.

Parameters
reacfilestr or Path

The reactions file.

Returns
occsList[Tuple[int, Counter, str]]

The number of occurences of each reaction. The tuple is (occurence, counter_reactants, reaction).

reacnetgenerator.tools.read_species(specfile: Union[str, Path]) Tuple[ndarray, Dict[str, ndarray]][source]#

Read species from the species file (ends with .species).

For accuracy, HMM filter should be disabled.

Parameters
specfilestr or Path

The species file.

Returns
step_idxnp.ndarray

The index of the step.

n_speciesDict[str, np.ndarray]

The number of species in each step. The dict key is the species SMILES.

Examples

Plot the number of methane in each step.

>>> from reacnetgenerator.tools import read_species
 >>> import matplotlib.pyplot as plt
 >>> step_idx, n_species = read_species('methane.species')
 >>> plt.plot(step_idx, n_species['[H]C([H])([H])[H]'])
 >>> plt.savefig("methane.svg")
-

reacnetgenerator.utils module#

Provide utils for ReacNetGenerator.

class reacnetgenerator.utils.SCOUROPTIONS[source]#

Bases: object

Scour (SVG optimization) options.

enable_viewboxing = True#
newlines = False#
remove_descriptions = True#
remove_descriptive_elements = True#
remove_metadata = True#
remove_titles = True#
shorten_ids = True#
strip_comments = True#
strip_ids = True#
strip_xml_prolog = True#
strip_xml_space_attribute = True#
class reacnetgenerator.utils.SharedRNGData(rng: reacnetgenerator.ReacNetGenerator, usedRNGKeys: List[str], returnedRNGKeys: List[str], extraNoneKeys: Optional[List[str]] = None)[source]#

Bases: object

Share ReacNetGenerator data with a class of the submodule.

Parameters
rng: reacnetgenerator.ReacNetGenerator

The centered ReacNetGenerator class.

usedRNGKeys: list of strs

Keys that needs to pass from ReacNetGenerator class to the submodule.

returnedRNGKeys: list of strs

Keys that needs to pass from the submodule to ReacNetGenerator class.

extraNoneKeys: list of strs, optional, default: None

Set keys to None, which will be used in the submodule.

returnkeys() None[source]#

Return back keys to ReacNetGenerator class.

class reacnetgenerator.utils.WriteBuffer(f: IO, linenumber: int = 1200, sep: Optional[AnyStr] = None)[source]#

Bases: object

Store a buffer for writing files.

It is expensive to write to a file, so we need to make a buffer.

Parameters
f: fileObject

The file object to write.

linenumber: int, default: 1200

The number of contents to store in the buffer. The buffer will be flushed if it exceeds the set number.

sep: str or bytes, default: None

The separator for contents. If None (default), there will be no separator.

append(text: AnyStr) None[source]#

Append a text.

Parameters
textstr or bytes

The text to be appended.

check() None[source]#

Check if the number of stored contents exceeds.

If so, the buffer will be flushed.

extend(text: Iterable) None[source]#

Extend texts.

Parameters
textlist of strs or bytes

Texts to be extended.

flush() None[source]#

Flush the buffer.

reacnetgenerator.utils.appendIfNotNone(f: Union[WriteBuffer, ExitStack], wbytes: Optional[AnyStr]) None[source]#

Append a line to a file if the line is not None.

Parameters
fWriteBuffer

The file to write.

wbytesstr or bytes

The line to write.

reacnetgenerator.utils.bytestolist(x: bytes) Any[source]#

Convert a compressed line to an object.

Parameters
xbytes

The compressed line.

Returns
object

The decompressed object.

reacnetgenerator.utils.checksha256(filename: str, sha256_check: Union[str, List[str]])[source]#

Check sha256 of a file is correct.

Parameters
filenamestr

The filename.

sha256_checkstr or list of strs

The sha256 to be checked.

Returns
bool

Indicate whether sha256 is correct.

reacnetgenerator.utils.compress(x: Union[str, bytes]) bytes[source]#

Compress the line.

This function reduces IO overhead to speed up the program. The functions will use lz4 to compress, since lz4 has better performance that any others.

The compressed format is size + data + size + data + …, where size is a 64-bit little-endian integer.

Parameters
xstr or bytes

The line to compress.

Returns
bytes

The compressed line, with a linebreak in the end.

reacnetgenerator.utils.decompress(x: bytes, isbytes: bool = False) Union[str, bytes][source]#

Decompress the line.

Parameters
xbytes

The line to decompress.

isbytesbool, optional, default: False

If the decompressed content is bytes. If not, the line will be decoded.

Returns
str or bytes

The decompressed line.

async reacnetgenerator.utils.download_file(urls: Union[str, List[str]], pathfilename: str, sha256: str) str[source]#

Download files from remote urls if not exists.

Parameters
urls: str or list of strs

The url(s) that is available to download.

pathfilename: str

The downloading path of the file.

sha256: str

Sha256 of the file. If not None and match the file, the download will be skiped.

Returns
pathfilename: str

The downloading path of the file.

reacnetgenerator.utils.download_multifiles(urls: List[dict]) None[source]#

Download multiple files from dicts.

Parameters
urlslist of dicts
The information of download files. Each dict should contain the following key:
  • url: str or list of strs

    The url(s) that is available to download.

  • pathfilename: str

    The downloading path of the file.

  • sha256: str, optional, default: None

    Sha256 of the file. If not None and match the file, the download will be skiped.

async reacnetgenerator.utils.gather_download_files(urls: List[dict]) None[source]#

Asynchronously download files from remote urls if not exists.

See download_multifiles function for details.

reacnetgenerator.utils.listtobytes(x: Any) bytes[source]#

Convert an object to a compressed line.

Parameters
xobject

The object to convert, such as numpy.ndarray.

Returns
bytes

The compressed line.

reacnetgenerator.utils.listtostirng(l: Union[str, list, tuple, ndarray], sep: Union[List[str], Tuple[str, ...]]) str[source]#

Convert a list to string, that is easier to store.

Parameters
lstr or array-like

The list to convert, which can contain any number of dimensions.

seplist of strs

The seperators for each dimension.

Returns
str

The converted string.

reacnetgenerator.utils.multiopen(pool: multiprocessing.pool.Pool, func: Callable, l: IO, semaphore: Optional[multiprocessing.synchronize.Semaphore] = None, nlines: Optional[int] = None, unordered: bool = True, return_num: bool = False, start: int = 0, extra: Optional[Any] = None, interval: Optional[int] = None, bar: bool = True, desc: Optional[str] = None, unit: str = 'it', total: Optional[int] = None) Iterable[source]#

Return an interated object for process a file with multiple processors.

Parameters
poolmultiprocessing.Pool

The pool for multiprocessing.

funcfunction

The function to process lines.

lFile object

The file object.

semaphoremultiprocessing.Semaphore, optional, default: None

The semaphore to acquire. If None (default), the object will be passed without control.

nlinesint, optional, default: None

The number of lines to pass to the function each time. If None (default), only one line will be passed to the function.

unorderedbool, optional, default: True

Whether the process can be unordered.

return_numbool, optional, default: False

If True, adds a counter to an iterable.

startint, optional, default: 0

The start number of the counter.

extraobject, optional, default: None

The extra object passed to the item.

intervalint, optional, default: None

The interval of items that will be passed to the function. For example, if set to 10, a item will be passed once every 10 items and others will be dropped.

barbool, optional, default: True

If True, show a tqdm bar for the iteration.

descstr, optional, default: None

The description of the iteration shown in the bar.

unitstr, optional, default: it

The unit of the iteration shown in the bar.

totalint, optional, default: None

The total number of the iteration shown in the bar.

Returns
object

An object that can be iterated.

reacnetgenerator.utils.must_be_list(obj: Union[Any, List[Any]]) List[Any][source]#

Convert a object to a list if the object is not a list.

Parameters
objObject

The object to convert.

Returns
obj: list

If the input object is not a list, returns a list that only contains that object. Otherwise, returns that object.

reacnetgenerator.utils.produce(semaphore: multiprocessing.synchronize.Semaphore, plist: Iterable[Any], parameter: Any) Generator[Tuple[Any, Any], None, None][source]#

Item producer with a semaphore.

Prevent large memory usage due to slow IO.

Parameters
semaphoremultiprocessing.Semaphore

The semaphore to acquire.

plistlist of objects

The list of items to be passed.

parameterobject

The parameter yielded with each item.

Yields
item: object

The item to be yielded.

parameter: object

The parameter yielded with each item.

reacnetgenerator.utils.read_compressed_block(f: BinaryIO) Generator[bytes, None, None][source]#

Read compressed binary file, assuming the format is size + data + size + data + …

Parameters
ffileObject

The file object to read.

Yields
data: bytes

The compressed block.

reacnetgenerator.utils.run_mp(nproc: int, **kwargs: Any) Iterable[Any][source]#

Process a file with multiple processors.

Parameters
nprocint

The number of processors to be used.

**kwargsdict, optional

Other parameters can be found in the multiopen method.

Yields
object

The yielded object from the multiopen method.

See also

multiopen

reacnetgenerator.utils_np module#

reacnetgenerator.utils_np.check_zero_signal(signal)#

Check if the given signal contains only zeros.

Parameters
signal1D array

The signal to check.

Returns
bool

True if the signal contains only zeros, False otherwise.

Notes

This method uses Cython to speed up the computation.

Benchmark for 1,000,000 loops (1000/2000):
  • Cython: 1.45 s

  • Python: 3.67 s

reacnetgenerator.utils_np.idx_to_signal(idx, step)#

Converts an index array to a signal array.

Parameters
idxarray_like

Index array.

stepint

Step size.

Returns
signalndarray

Signal array.

Notes

This method uses Cython to speed up the computation.

Benchmark for 1,000,000 loops (step=250000):
  • Cython: 18.61 s

  • Python: 31.30 s