olga24912/Nerpa

NRP Matcher

NRP Matcher is a tool that links gene clusters to known natural products.

Dependencies

NRP Matcher requires a 64-bit Linux system or macOS, with Python 3, g++ (version 5.2 or higher), and cmake (version 3.5 or higher) pre-installed.

Dereplicator (https://github.com/ablab/dereplicator/) must also be added to PATH. The folders "Fragmentation_rule" and "configs" must be located in one of the following paths:

  • <DEREPLICATOR INSTALL DIR>/
  • <DEREPLICATOR INSTALL DIR>/../
  • <DEREPLICATOR INSTALL DIR>/../../
  • <DEREPLICATOR INSTALL DIR>/../share/
  • <DEREPLICATOR INSTALL DIR>/../share/npdtools/
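
The folder lookup described above can be checked before running the tool. The following is a minimal sketch, not part of NRP Matcher itself: the function name find_dereplicator_data is hypothetical, and it assumes that both folders must sit together in the same candidate directory and that candidates are tried in the order listed.

```python
from pathlib import Path
from typing import Optional

# Relative locations from the list above, in the order they are listed.
CANDIDATE_SUBPATHS = ["", "..", "../..", "../share", "../share/npdtools"]
REQUIRED_FOLDERS = ["Fragmentation_rule", "configs"]


def find_dereplicator_data(install_dir: str) -> Optional[Path]:
    """Return the first candidate directory holding both folders, else None.

    Assumption: both folders must be present in the same directory.
    """
    base = Path(install_dir)
    for sub in CANDIDATE_SUBPATHS:
        candidate = (base / sub).resolve()
        if all((candidate / name).is_dir() for name in REQUIRED_FOLDERS):
            return candidate
    return None
```

Calling find_dereplicator_data with the Dereplicator install directory returns None when neither folder layout is found, which is a hint to check the installation before starting a run.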

Installation

To compile NRP Matcher, first download the source code:

git clone https://github.com/olga24912/NRPsMatcher.git
cd NRPsMatcher

and build it with the following script:

./install.sh

NRP Matcher will be built in the directory ./bin. To install NRP Matcher into another directory, specify the full path of the destination folder by running the following command in bash or sh:

PREFIX=<destination_dir> ./install.sh

for example:

PREFIX=/usr/local ./install.sh

which will install NRP Matcher into /usr/local/bin.

Note: use an absolute path for <destination_dir>.

After installation, the NRPsMatcher and run_nrp_matcher.py files will be in the ./bin directory (or in <destination_dir>/bin if you specified PREFIX).

We also suggest adding the NRP Matcher installation directory to the PATH variable.

Running

Input

NRP Matcher takes two input files: a file listing the paths to gene cluster prediction files, and a file listing the paths to files with NRP structures in MOL format.

Predictions

TODO: a script and an antismash binary are needed to obtain the required prediction. Or... describe at length how to obtain the required file. Or pass a list of genomes and run antismash ourselves...

NRP structures

When using the command line interface, you need to specify an info file in which each line describes one NRP in the following format:

<path to file with NRP structure in MOL format> <any extra information about NRP>

for example:

streptomedb/streptomedb.1.mol geranylphenazinediol 348.184 1
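
To illustrate the info file format, here is a small sketch of how such a line could be split. It is not taken from the NRP Matcher sources: the function names are hypothetical, and it assumes that the path contains no whitespace and that everything after the first run of whitespace is extra information.

```python
from typing import List, Tuple


def parse_info_line(line: str) -> Tuple[str, str]:
    """Split one info-file line into (mol_path, extra_info).

    Assumption: the MOL path has no spaces; the rest of the line is extra info.
    """
    parts = line.strip().split(None, 1)
    if not parts:
        raise ValueError("empty info line")
    mol_path = parts[0]
    extra = parts[1] if len(parts) > 1 else ""
    return mol_path, extra


def parse_info_text(text: str) -> List[Tuple[str, str]]:
    """Parse a whole info file, skipping blank lines."""
    return [parse_info_line(line) for line in text.splitlines() if line.strip()]
```

For the example line above, parse_info_line yields the path streptomedb/streptomedb.1.mol and the extra information geranylphenazinediol 348.184 1.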

You can read about MOL format here.

If your NRP structure is in another format, we recommend using molconvert. To convert a SMILES string to the required MOL file, you can run:

molconvert mol:V3+H --smiles <smile string> -o <nrp file>

Examples of an info file and MOL files can be found in:

<installing_dir>/share/library.info.streptomedb
<installing_dir>/share/streptomedb/

Command line options

To run NRP Matcher from the command line, type:

python3 run_nrp_matcher.py [options]

Options

-h (or --help) Print help

-p (or --predictions) <file_name> File with paths to prediction files. Required option.

--lib_info <file_name> File with paths to NRP structure description files in MOL format. Required option.

-o (or --local_output_dir) <output_dir> Specify the output directory.
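
When NRP Matcher is driven from another Python script, the options above can be assembled into an argument list. This is only a sketch: build_command is a hypothetical helper, and it assumes run_nrp_matcher.py is reachable under the given script path.

```python
from typing import List, Optional


def build_command(predictions: str,
                  lib_info: str,
                  output_dir: Optional[str] = None,
                  script: str = "run_nrp_matcher.py") -> List[str]:
    """Build the argument list for run_nrp_matcher.py from the documented options."""
    # -p and --lib_info are the two required options.
    cmd = ["python3", script, "-p", predictions, "--lib_info", lib_info]
    # -o is optional; without it, output goes to the current directory.
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


# The resulting list can be passed to subprocess.run(cmd, check=True).
```

Passing a list rather than a single shell string avoids quoting problems when paths contain spaces.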

Output

NRP Matcher stores all output files in the current directory, or in <output_dir> if it is set by the user.

Output files:

  • <output_dir>/reports.csv: description of matched pairs (NRP structure - gene cluster prediction). Each line describes one matched pair.
  • <output_dir>/details_mol: folder containing files with a detailed description of the matched predictions for each NRP structure file.
  • <output_dir>/details_prediction: folder containing files with a detailed description of the matched NRP structures for each prediction file.
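
The report can be loaded for further processing with the standard csv module. The sketch below is an illustration only: read_report_rows is a hypothetical helper, the column layout of reports.csv is not described here, and the comma delimiter and the possible presence of a header row are assumptions to verify against a real report.

```python
import csv
from pathlib import Path
from typing import List


def read_report_rows(output_dir: str) -> List[List[str]]:
    """Return the non-empty rows of <output_dir>/reports.csv as lists of strings.

    Assumption: the file is comma-separated. No column meaning is assumed,
    and a header row, if present, is returned like any other row.
    """
    report = Path(output_dir) / "reports.csv"
    with report.open(newline="") as handle:
        return [row for row in csv.reader(handle) if row]
```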

Example

To test NRP Matcher, run the following command from the NRP Matcher bin directory:

python3 run_nrp_matcher.py -p ../share/prediction.info --lib_info ../share/library.info.streptomedb -o test