Leaping through tree space: continuous phylogenetic inference for rooted and unrooted trees


This repo hosts a minimal implementation of GradME:

  • bme_jax/: Balanced Minimum Evolution and distance-based optimisation with Phylo2Vec in JAX
  • cfg/: Example configuration files
  • utils/: Utility functions for manipulation of sequence and tree data.
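For readers new to Balanced Minimum Evolution (BME), the quantity being minimised can be illustrated on a toy tree. Under Pauplin's formula, the balanced length of an unrooted binary tree is the sum over leaf pairs of d_ij * 2^(1 - e_ij), where e_ij is the number of edges on the path between leaves i and j. The sketch below is plain Python for illustration only; it is not the code in bme_jax/, and the edge-list tree representation and the function names are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def path_edge_counts(edges, leaves):
    """Count edges on the path between every pair of leaves (BFS from each leaf)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    counts = {}
    for src in leaves:
        depth = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in depth:
                    depth[nxt] = depth[node] + 1
                    queue.append(nxt)
        for dst in leaves:
            counts[(src, dst)] = depth[dst]
    return counts


def bme_length(edges, leaves, dist):
    """Balanced tree length: sum of d_ij * 2^(1 - e_ij) over leaf pairs."""
    counts = path_edge_counts(edges, leaves)
    return sum(
        dist[i][j] * 2.0 ** (1 - counts[(i, j)])
        for i, j in combinations(leaves, 2)
    )


# Toy quartet ((A,B),(C,D)) with internal nodes "u" and "v" (hypothetical data).
# Branch lengths: A-u = 1, B-u = 2, u-v = 3, C-v = 4, D-v = 5.
edges = [("A", "u"), ("B", "u"), ("u", "v"), ("C", "v"), ("D", "v")]
leaves = ["A", "B", "C", "D"]
dist = {
    "A": {"B": 3, "C": 8, "D": 9},
    "B": {"C": 9, "D": 10},
    "C": {"D": 9},
}
print(bme_length(edges, leaves, dist))
```

Because the toy distances are additive on this tree, the balanced length equals the sum of the five branch lengths (1 + 2 + 3 + 4 + 5).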

Environment setup

  1. Install R if needed. Version 4.2.2 was used here; the latest version of R should also work.
  2. Set up the gradme environment using conda/mamba and activate it:
       conda env create -f env.yml
       conda activate gradme
  3. Optional: if you have GPUs/TPUs, you might need to update your installation of JAX. Follow the instructions at https://github.com/google/jax
  4. Install phangorn in R (4.2.2 or above):
       install.packages("phangorn")
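After activating the environment, it can help to confirm which backend JAX picked up, especially if you followed the optional GPU/TPU step. The helper below is a small sketch (the function name is an assumption, not part of this repo) that uses only `jax.__version__`, `jax.default_backend()` and `jax.devices()`:

```python
def describe_jax_backend():
    """Return a one-line summary of the JAX install, or a hint if JAX is missing."""
    try:
        import jax
    except ImportError:
        return "jax is not installed in this environment"
    return (
        f"jax {jax.__version__}, "
        f"default backend: {jax.default_backend()}, "
        f"devices: {jax.devices()}"
    )


print(describe_jax_backend())
```

If the reported backend is `cpu` on a machine with an accelerator, the JAX installation likely needs updating as described in step 3.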

Accessing data

The following datasets were used:

| Dataset | Sites | Taxa | Type | Taxonomic rank | Access | TreeBASE ID |
|---|---|---|---|---|---|---|
| DS1 | 1,949 | 27 | rRNA (18S) | Tetrapods | [1] | M2017 |
| DS2 | 2,520 | 29 | rRNA (18S) | Acanthocephalans | [1] | M2131 |
| DS3 | 1,812 | 36 | mtDNA | Mammals; mainly Lemurs | [1] | M127 |
| DS4 | 1,137 | 41 | rDNA (18S) | Fungi; mainly Ascomycota | [1] | M487 |
| DS5 | 378 | 50 | DNA | Lepidoptera | [1] | M2907 |
| DS6 | 1,133 | 50 | rDNA (28S) | Fungi; mainly Diaporthales | [1] | M220 |
| DS7 | 1,824 | 59 | mtDNA | Mammals; mainly Lemurs | [1] | M2449 |
| DS8 | 1,008 | 64 | rDNA (28S) | Fungi; mainly Hypocreales | [1] | M2261 |
| DS9 | 955 | 67 | DNA | Poaceae (grasses) | [1] | M2389 |
| DS10 | 1,098 | 67 | DNA | Fungi; mainly Ascomycota | [1] | M2152 |
| DS11 | 1,082 | 71 | DNA | Lichen | [1] | M2274 |
| Eutherian | 1,338,678 | 37 | DNA | Eutherian Mammals | [2] | |
| Jawed | 1,460-18,406 | 99 | AA | Gnathostomata (jawed vertebrates) | [3] | |
| Primates | 232 | 14 | mtDNA | Mammals; mainly Primates | [4] | |

DS1-DS8 are also available at: https://github.com/zcrabbit/vbpi-gnn/tree/main/data/hohna_datasets_fasta

DS1-DS11 should also be available on TreeBASE using the TreeBASE IDs listed above, but the site was down on June 9, 2023.

Sources: see the manuscript for references [1]-[4].
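Once a dataset is downloaded, a quick sanity check is to compare its number of taxa and sites with the values listed for it. The following is a minimal, dependency-free sketch; the function names are assumptions and the repo's own utilities in utils/ may handle this differently.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict."""
    chunks = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()  # use the full header line as the taxon name
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def alignment_shape(records):
    """Return (n_taxa, n_sites); raise if the sequences are not aligned."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"expected one common sequence length, got {sorted(lengths)}")
    return len(records), lengths.pop()


# Toy input (hypothetical); sequences may be wrapped over several lines.
example = [">taxon_a", "ACGT", "ACGT", ">taxon_b", "AC-TACGA"]
print(alignment_shape(parse_fasta(example)))
```

To check a real file, pass an open file handle instead of the toy list, e.g. `parse_fasta(open(path_to_fasta))`, where `path_to_fasta` points at one of the files you placed under data/.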

Running GradME

  1. Download the datasets mentioned above in FASTA format and place them in a data/ folder (e.g., in the repo).
  2. Update the configuration file cfg/bme_config_v3.yml, especially fasta_path (the path to the FASTA file you want to analyse).
    • You can also create your own configuration file based on the given template.
  3. Run the main optimisation script, or use the demo.ipynb notebook:
       python -m bme_jax.main --config-path cfg/name_of_your_config_file.yml
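If you keep one configuration file per dataset, the run command above can be scripted. This is a hedged sketch rather than part of the repo: the helper names and the `dry_run` flag are assumptions, and the only interface it relies on is the `python -m bme_jax.main --config-path ...` invocation shown above.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path):
    """Build the documented GradME invocation for one config file."""
    return [sys.executable, "-m", "bme_jax.main", "--config-path", str(config_path)]


def run_all(config_dir="cfg", dry_run=True):
    """Print (and optionally run) the command for every .yml file in config_dir."""
    commands = [build_command(path) for path in sorted(Path(config_dir).glob("*.yml"))]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # Run from the repo root so that the bme_jax package is importable.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to check the config paths before launching any optimisation.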
