SafePathsRNAPC

This repository contains:

A C++ implementation that computes Safe Paths under different Path Cover models in a Directed Acyclic Graph (DAG).
Jupyter Notebooks with the experimental evaluation of the algorithm, which computes contigs for an RNA Transcript Assembly problem.

A comprehensive explanation of both parts can be found in https://doi.org/10.1109/TCBB.2021.3131203.

C++ algorithm

First clone the repo:

 git clone https://github.com/algbio/SafePathsRNAPC.git

This is a CMake project. To build it together with its executables, run:

cd SafePathsRNAPC
mkdir build
cd build
cmake ..
cmake .. # Known issue: a second cmake run is needed to compile the external library
make

This C++ project downloads the LEMON graph library, which is stored in a Mercurial repository. As such, the installation requires Mercurial.
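
Before building, it can help to confirm that the command-line tools used above are available. The following is a minimal sketch, not part of this repository: the tool list is an assumption derived from the commands in this README (git, cmake, make) plus Mercurial's `hg` executable, and it does not check for a C++ compiler.

```python
import shutil

# Assumed tool list: the commands used in the build steps, plus Mercurial (hg),
# which is needed to download the LEMON graph library.
REQUIRED_TOOLS = ["git", "cmake", "make", "hg"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed build prerequisites were found.")
```

Install any tool that is reported as missing before running cmake.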

Jupyter Notebooks

After compiling the C++ code you can replicate our experiments by running the Jupyter Notebooks in the folder data. These notebooks create intermediate files in the different subfolders of data. The notebooks are self-contained and must be run in the following order (indicated in the notebooks too):

data_manipulation/graph_creation.ipynb
experiments/run_experiments.ipynb
evaluation/compute_metrics.ipynb
evaluation/compute_tables.ipynb
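
If you prefer to run the notebooks from a script rather than interactively, the order above can be sketched as follows. This is a hypothetical helper, not part of the repository. It assumes that `jupyter nbconvert` is installed and that the notebooks can run unattended. The names `build_commands` and `run_all` are illustrative.

```python
import subprocess
from pathlib import Path

# Notebook order as listed in this README, relative to the data folder.
NOTEBOOK_ORDER = [
    "data_manipulation/graph_creation.ipynb",
    "experiments/run_experiments.ipynb",
    "evaluation/compute_metrics.ipynb",
    "evaluation/compute_tables.ipynb",
]


def build_commands(data_dir="data"):
    """Return one `jupyter nbconvert` command per notebook, in run order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
         str(Path(data_dir) / notebook)]
        for notebook in NOTEBOOK_ORDER
    ]


def run_all(data_dir="data"):
    """Execute the notebooks in order, stopping at the first failure."""
    for command in build_commands(data_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: only print the commands; call run_all() to execute them.
    for command in build_commands():
        print(" ".join(command))
```

Running the script from the repository root prints the four commands without executing anything. Calling `run_all()` executes them and overwrites each notebook in place with its outputs.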

These notebooks correspond to the experiments for Homo sapiens. The experiments for other species can be found (following the same structure) in the folders data/mouse (Mus musculus), data/triticum_aestivum, data/hordeum_vulgare, data/fruit_fly (Drosophila melanogaster) and data/magnaporthe_oryzae.

Once all these notebooks have been run, you can run the notebook compute_summary_tables.ipynb.

The package requirements for running our Jupyter Notebooks can be found at data/requirements.txt.

Each dataset consists of two BED files created from GFF files from the Ensembl project. The script used to transform the GFF files into BED files can be found at data/scripts/gtf2bed.py.
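
For readers unfamiliar with the two formats, the main pitfall in such a conversion is the coordinate system: GFF/GTF intervals are 1-based with an inclusive end, while BED intervals are 0-based with an exclusive end. The sketch below illustrates only that shift. It is not the repository's gtf2bed.py, and the choice of BED columns (feature type as the name, 0 as a placeholder score) is an assumption made for illustration.

```python
def gff_line_to_bed(line):
    """Convert one tab-separated GFF/GTF feature line into six BED fields."""
    seqid, _source, feature, start, end, score, strand, _phase, _attributes = (
        line.rstrip("\n").split("\t")
    )
    # GFF/GTF: 1-based start, inclusive end. BED: 0-based start, exclusive end.
    bed_start = int(start) - 1
    bed_end = int(end)
    # Illustrative choice: use 0 when the GFF score is undefined (".").
    bed_score = "0" if score == "." else score
    return [seqid, str(bed_start), str(bed_end), feature, bed_score, strand]


if __name__ == "__main__":
    example = 'chr1\tensembl\texon\t11869\t12227\t.\t+\t.\tgene_id "G1";'
    print("\t".join(gff_line_to_bed(example)))
```

Only the start coordinate changes; the end coordinate keeps the same value because an inclusive 1-based end and an exclusive 0-based end coincide.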

Contact

To report an error or suggest an improvement, write to me at elarielcl at Gmail.
