
5 Scripts help page

Matteopaluh edited this page Nov 15, 2023 · 2 revisions

set_kemet_working-directory.py help

usage: set_kemet_working-directory.py [-h] [-k] [-u] [-G]

    Base command for setting KEMET package working directory.
    Create folders and instruction files; helper function to manage KEGG MODULE .kk database.
    

optional arguments:
  -h, --help           show this help message and exit
  -k, --set_kk_DB      Choose this option to generate the KEGG Module DB (.kk files),
                       in order to perform KEGG Modules Completeness evaluation.
                       Default: already generated
  -u, --update_kk_DB   Choose this option to update an already existing KEGG Module DB (.kk files).
  -G, --gapfill_usage  Choose this option to create the folders required for GSMM Gapfilling,
                       a follow-up of the HMM search procedures.
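
The three options above are simple on/off switches. As a rough illustration only (this is not the KEMET source), an `argparse` parser exposing the same interface could be declared as below. The function name `build_parser` and the shortened help strings are assumptions made for this sketch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the help text above (not KEMET's own code)."""
    parser = argparse.ArgumentParser(
        prog="set_kemet_working-directory.py",
        description="Base command for setting KEMET package working directory.",
    )
    # Each option is a boolean switch: absent -> False, present -> True.
    parser.add_argument("-k", "--set_kk_DB", action="store_true",
                        help="Generate the KEGG Module DB (.kk files).")
    parser.add_argument("-u", "--update_kk_DB", action="store_true",
                        help="Update an existing KEGG Module DB (.kk files).")
    parser.add_argument("-G", "--gapfill_usage", action="store_true",
                        help="Create the folders required for GSMM gapfilling.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-k", "-G"])
print(args.set_kk_DB, args.update_kk_DB, args.gapfill_usage)
```

Because every flag uses `store_true`, running the script with no options leaves all three switches off, which is consistent with the "Default: already generated" note for `-k`.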

add_taxonomy_from_gtdb-tk.py help

usage: add_taxonomy_from_gtdb-tk.py [-h] -i ADD_GENOMES_INSTRUCTION_FILE -t
                                    ADD_GTDB_TO_NCBI_OUTPUT -f
                                    {.fa,.fna,.fasta} [-v]

    Add the necessary taxonomy information for the MAGs/Genomes of interest, for KEMET HMM and GSMM analyses.
    Use this after the GTDB-tk "gtdb_to_ncbi_majority_vote.py" script (that converts from GTDB taxonomy to NCBI),
    to further convert to KEGG BRITE taxonomy.
    This script will include info on the MAGs/Genomes indicated in the output file from the aforementioned script.
    IMPORTANT:

    The automatic taxonomy conversion has notable exceptions, such as "Candidate" phyla,
    as well as other phyla lacking sufficient (3+) KEGG Organism representatives.
    

optional arguments:
  -h, --help            show this help message and exit
  -i ADD_GENOMES_INSTRUCTION_FILE, --add_genomes_instruction_file ADD_GENOMES_INSTRUCTION_FILE
                        Include the relative path to KEMET "genomes.instruction" file.
  -t ADD_GTDB_TO_NCBI_OUTPUT, --add_gtdb_to_ncbi_output ADD_GTDB_TO_NCBI_OUTPUT
                        Include the relative path to "gtdb_to_ncbi_majority_vote.py" output file.
  -f {.fa,.fna,.fasta}, --fasta_extension {.fa,.fna,.fasta}
                        Complete "genomes.instruction" file names with the indicated extension.
  -v, --verbose         Print more information - for debugging or logging.
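
Since `-i`, `-t` and `-f` are all required, it can help to assemble the call programmatically and check the extension before launching anything. The following is a minimal sketch under stated assumptions: the helper name `build_add_taxonomy_command`, the `python` interpreter name, and the example file name `gtdb_to_ncbi.tsv` are hypothetical and only serve the illustration.

```python
import shlex
from typing import List

# Extensions accepted by -f/--fasta_extension according to the help text.
ALLOWED_EXTENSIONS = (".fa", ".fna", ".fasta")


def build_add_taxonomy_command(instruction_file: str,
                               gtdb_to_ncbi_output: str,
                               fasta_extension: str,
                               verbose: bool = False) -> List[str]:
    """Build the argument list for add_taxonomy_from_gtdb-tk.py (hypothetical helper)."""
    if fasta_extension not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"fasta_extension must be one of {ALLOWED_EXTENSIONS}, got {fasta_extension!r}"
        )
    cmd = [
        "python", "add_taxonomy_from_gtdb-tk.py",  # interpreter name is an assumption
        "-i", instruction_file,      # relative path to "genomes.instruction"
        "-t", gtdb_to_ncbi_output,   # relative path to gtdb_to_ncbi_majority_vote.py output
        "-f", fasta_extension,
    ]
    if verbose:
        cmd.append("-v")
    return cmd


# Placeholder paths; replace them with the real relative paths of your project.
cmd = build_add_taxonomy_command("genomes.instruction", "gtdb_to_ncbi.tsv", ".fna", verbose=True)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.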

kemet.py help

usage: kemet.py [-h] -a {eggnog,kaas,kofamkoala} [--update_taxonomy_codes]
                [-I PATH_INPUT] [-k] [-n] [--skip_hmm]
                [--hmm_mode {onebm,modules,kos}]
                [--threshold_value THRESHOLD_VALUE] [--skip_nt_download]
                [--skip_msa_and_hmmbuild] [--retry_nhmmer] [--skip_gsmm]
                [--gsmm_mode {existing,denovo}] [-O PATH_OUTPUT] [-v] [-q]
                [--log]
                FASTA_file

KEMET - KEGG Module Evaluation Tool:
1) Evaluate KEGG Modules Completeness for given genomes.
2) HMM-based check for ortholog genes (KO) of interest after KEGG Module Completeness evaluation.
3) Genome-scale model gapfill with nucleotide HMM-derived evidence, for KOs of interest.

positional arguments:
  FASTA_file            Genome/MAG FASTA file as indicated in the "genomes.instruction" -
                        points to files (in "KEGG_annotations") comprising KO annotations, associated with each gene.

optional arguments:
  -h, --help            show this help message and exit
  -a {eggnog,kaas,kofamkoala}, --annotation_format {eggnog,kaas,kofamkoala}
                        
                        Format of KO_list.
                        
                        eggnog: 1 gene | many possible annotations;
                        kaas: 1 gene | 1 annotation at most;
                        kofamkoala: 1 gene | many possible annotations
  --update_taxonomy_codes
                        Update taxonomy filter codes - WHEN TO USE: after downloading a new BRITE taxonomy with "set_kemet_working-directory.py".
  -I PATH_INPUT, --path_input PATH_INPUT
                        Absolute path to input file(s) FOLDER.
  -k, --as_kegg         Return KEGG-Mapper output for the Module Completeness evaluation.
  -n, --no_genome       Avoid checking for MAG/genome FASTA file and only use annotations for Modules Completeness evaluation.
  --skip_hmm            Skip HMM-driven search for KOs & stop after KEGG Modules Completeness evaluation.
  --hmm_mode {onebm,modules,kos}
                        
                        Choose the subset of KOs of interest for HMM-based check.
                        By default, the KOs already present in the functional annotation are not checked further.
                        
                        onebm: search for KOs from KEGG Modules missing 1 block;
                        modules: search for KOs from the KEGG Modules indicated in the "module_file.instruction" file, 1 per line
                            (e.g. Mxxxxx);
                        kos: search for KOs indicated in the "ko_file.instruction" file, 1 per line
                            (e.g. Kxxxxx)
  --threshold_value THRESHOLD_VALUE
                        Define a threshold for the corrected score resulting from HMM-hits, which is indicative of good quality.
  --skip_nt_download    Skip downloading KEGG KOs nt sequences.
  --skip_msa_and_hmmbuild
                        Skip MAFFT and HMMER hmmbuild commands.
  --retry_nhmmer        Move HMM-files and re-run nHMMER command.
  --skip_gsmm           Skip GSMM operations, gapfill or de-novo model creation, & stop after HMM-driven search for KOs.
  --gsmm_mode {existing,denovo}
                        
                        Choose the method of GSMM operation.
                        (This method won't be performed if "--hmm_mode kos" was chosen)
                        existing: use pre-existing CarveMe GSMM to add reactions content connected to HMM-derived KOs;
                        denovo: generate a new CarveMe GSMM, performing gene prediction and adding HMM-derived hits from the chosen HMM-mode.
  -O PATH_OUTPUT, --path_output PATH_OUTPUT
                        Absolute path to output file(s) FOLDER.
  -v, --verbose         Print more information - for debugging and progress tracking.
  -q, --quiet           Silence soft-errors (for MAFFT and HMMER commands).
  --log                 Store KEMET commands and progress during the execution in a log file.
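
The `--hmm_mode modules` and `--hmm_mode kos` options read identifiers from "module_file.instruction" and "ko_file.instruction", one per line. A small sketch of how such file content could be prepared and checked before a run is given below. It assumes the standard KEGG identifier shapes (`M` or `K` followed by five digits); the helper names, the `python` interpreter name and the example FASTA name are hypothetical. The help text does not say where KEMET expects these files, so the sketch only returns the content as a string.

```python
import re
from typing import Iterable, List, Optional

# KEGG Module and KO identifiers: one letter followed by five digits.
MODULE_ID = re.compile(r"M\d{5}")
KO_ID = re.compile(r"K\d{5}")

ANNOTATION_FORMATS = ("eggnog", "kaas", "kofamkoala")
HMM_MODES = ("onebm", "modules", "kos")


def instruction_lines(ids: Iterable[str], pattern: re.Pattern) -> str:
    """Return instruction-file content, one identifier per line (hypothetical helper)."""
    cleaned = [i.strip() for i in ids if i.strip()]
    bad = [i for i in cleaned if not pattern.fullmatch(i)]
    if bad:
        raise ValueError(f"malformed identifiers: {bad}")
    return "\n".join(cleaned) + "\n"


def build_kemet_command(fasta_file: str,
                        annotation_format: str,
                        hmm_mode: Optional[str] = None) -> List[str]:
    """Build a kemet.py argument list using only options listed in the help text."""
    if annotation_format not in ANNOTATION_FORMATS:
        raise ValueError(f"annotation_format must be one of {ANNOTATION_FORMATS}")
    cmd = ["python", "kemet.py", fasta_file, "-a", annotation_format]
    if hmm_mode is not None:
        if hmm_mode not in HMM_MODES:
            raise ValueError(f"hmm_mode must be one of {HMM_MODES}")
        cmd += ["--hmm_mode", hmm_mode]
    return cmd


# Content for "module_file.instruction"; the module IDs are arbitrary examples.
print(instruction_lines(["M00001", "M00002"], MODULE_ID), end="")
# "genome.fna" is a placeholder for a FASTA file listed in "genomes.instruction".
print(build_kemet_command("genome.fna", "eggnog", hmm_mode="modules"))
```

Validating identifiers up front is a design choice of this sketch: it surfaces a mistyped ID before any download or HMM step starts.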