docs: Add documentation for ParellelEvolCCM
Signed-off-by: jvfe <[email protected]>
jvfe committed May 13, 2024
1 parent 98b7975 commit 2716d98
Showing 2 changed files with 89 additions and 0 deletions.
88 changes: 88 additions & 0 deletions docs/evolccm.md
@@ -0,0 +1,88 @@
# ParallelEvolCCM usage

ParallelEvolCCM is a tool for the identification of coordinated gain and loss of features.
The method is described in detail in the following publication:

- [The Community Coevolution Model with Application to the Study of Evolutionary Relationships between Genes Based on Phylogenetic Profiles](https://doi.org/10.1093/sysbio/syac052)

If you use ParallelEvolCCM in your analysis, please cite the above publication.

## ParallelEvolCCM inputs

The ParallelEvolCCM tool requires two inputs:

- A phylogenetic tree in Newick format
- A presence/absence table in TSV format.

The presence/absence TSV must contain a `genome_id` column whose values match the genome names in the tree.
Every other column represents a feature that is absent (0) or present (1) in each genome. For example:

```
genome_id plasmid_AA155 plasmid_AA161
ED010 0 0
ED017 0 1
ED040 0 0
ED073 0 1
ED075 1 1
ED082 0 1
ED142 0 1
ED178 0 1
ED180 0 0
```
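
Before running the tool, it can help to confirm that both inputs meet the requirements above. The following is a minimal sketch, not part of ParallelEvolCCM: the function names are hypothetical, and it assumes an uncompressed, tab-separated table and a Newick tree with unquoted tip labels.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight checks; not part of ParallelEvolCCM itself.
# Assumes an uncompressed, tab-separated table and unquoted Newick tip labels.

# Fail if the first column is not genome_id, a row has the wrong number of
# columns, or any feature value is something other than 0 or 1.
check_feature_table() {
    awk -F'\t' '
        NR == 1 {
            if ($1 != "genome_id") { print "first column must be genome_id"; bad = 1; exit }
            ncol = NF
            next
        }
        NF != ncol { printf "line %d: expected %d columns, found %d\n", NR, ncol, NF; bad = 1 }
        {
            for (i = 2; i <= NF; i++)
                if ($i != "0" && $i != "1") {
                    printf "line %d, column %d: \"%s\" is not 0 or 1\n", NR, i, $i
                    bad = 1
                }
        }
        END { if (NR == 0) bad = 1; exit bad + 0 }
    ' "$1"
}

# Print tip labels: names that directly follow "(" or "," in the Newick string.
tree_tips() {
    tr -d ' \t\r\n' < "$1" | grep -oE '[(,][^(),:;]+' | sed 's/^[(,]//' | sort -u
}

# Print genome names from the first column, skipping the header row.
table_genomes() {
    tail -n +2 "$1" | cut -f1 | sort -u
}

# Print names found in only one of the two inputs (no output means they agree).
mismatched_genomes() {
    { tree_tips "$1"; table_genomes "$2"; } | sort | uniq -u
}
```

With the functions loaded, `check_feature_table feature_table.tsv` exits with a non-zero status and prints the offending cells if the table is malformed, and `mismatched_genomes tree.nwk feature_table.tsv` lists any genome that appears in only one of the two files.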

## Using ParallelEvolCCM by itself

The ParallelEvolCCM tool is a command line tool written in R.
It is available through the [bin/ParallelEvolCCM.R](https://github.com/beiko-lab/arete/blob/master/bin/ParallelEvolCCM.R) script.

To download the tool and make it executable, run:

```bash
wget https://raw.githubusercontent.com/beiko-lab/arete/master/bin/ParallelEvolCCM.R
chmod +x ParallelEvolCCM.R
```

Then, ensure all EvolCCM dependencies are installed.
You can install them by running the following command in your R console:

```r
install.packages(c('ape', 'dplyr', 'phytools', 'foreach', 'doParallel', 'gplots', 'remotes'))
remotes::install_github('beiko-lab/evolCCM')
```

You can then run the tool like this:

```bash
./ParallelEvolCCM.R --intree tree.nwk --intable feature_table.tsv.gz --cores -1
```

- `--intree` specifies the phylogenetic tree in Newick format.
- `--intable` specifies the feature table in compressed TSV format.
- `--cores` specifies the number of cores to use. Use `-1` to use all available cores.

To list the additional parameters, run `./ParallelEvolCCM.R` with no arguments.
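
The example command passes a gzip-compressed table (`feature_table.tsv.gz`). If your table is a plain TSV file, one way to produce a compressed copy is sketched below. The helper name `compress_table` is hypothetical and only wraps the standard `gzip` command; whether the tool also accepts uncompressed input is not covered here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a gzip-compressed copy next to the original file.
# The original TSV is left untouched.
compress_table() {
    gzip -c "$1" > "$1.gz"
}
```

For example, `compress_table feature_table.tsv` creates `feature_table.tsv.gz`, which can then be passed to `--intable`.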

## Using ParallelEvolCCM with ARETE

The ParallelEvolCCM tool is also available through the `evolccm` entry in ARETE,
which makes it possible to run the tool with Docker or Singularity.

To execute the ParallelEvolCCM tool with ARETE, run the following command:

```bash
nextflow run beiko-lab/ARETE \
-entry evolccm \
--core_gene_tree core_gene_alignment.tre \
--feature_profile feature_profile.tsv.gz \
-profile docker
```

The parameters are:

- `--core_gene_tree` - The reference tree, coming from a core genome alignment,
like the one generated by the `phylo` entry in ARETE.
- `--feature_profile` - A presence/absence TSV matrix of features
in genomes, like the one created in ARETE's `annotation` entry.
- `-profile` - The profile to use. In this case, `docker`.

For more information, check the [full ARETE documentation](https://beiko-lab.github.io/arete/).
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -14,6 +14,7 @@ nav:
- Dataset Size: resource_profiles.md
- Parameters: params.md
- Subsampling: subsampling.md
- ParallelEvolCCM: evolccm.md
repo_url: https://github.com/beiko-lab/arete
theme:
  name: "readthedocs"