Merge pull request #36 from puja-trivedi/add_cli_geneannotation_20240724
Add CLI to genome_annotation_translator.py
puja-trivedi authored Oct 3, 2024
2 parents 7280c89 + fec83ea commit efc5ee9
Showing 11 changed files with 857 additions and 184 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.json filter=lfs diff=lfs merge=lfs -text
37 changes: 37 additions & 0 deletions .github/workflows/add_dunder_methods.yaml
@@ -0,0 +1,37 @@
name: add dunder methods to genome_annotation model

on:
  push:
    paths:
      - 'bkbit/models/genome_annotation.py'

permissions:
  contents: write

jobs:
  run-script:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout this repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.9

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
      - name: Run add_dunderMethods_genomeAnnotation
        run: python bkbit/model_editors/add_dunderMethods_genomeAnnotation.py

      - name: Commit changes
        run: |
          git config --global user.name 'github-actions'
          git config --global user.email '[email protected]'
          git add bkbit/models/genome_annotation.py
          git commit -m 'Update genome_annotation.py with dunder methods'
          git push
4 changes: 4 additions & 0 deletions bkbit/cli.py
@@ -3,6 +3,8 @@
from bkbit.data_translators.library_generation_translator import specimen2jsonld
from bkbit.model_converters.yaml2sheet_converter import yaml2cvs
from bkbit.data_translators.file_manifest_translator import filemanifest2jsonld
from bkbit.data_translators.genome_annotation_translator import gff2jsonld
from bkbit.utils.get_ncbi_taxonomy import download_ncbi_taxonomy

@click.group()
def cli():
@@ -14,6 +16,8 @@ def cli():
cli.add_command(specimen2jsonld)
cli.add_command(yaml2cvs)
cli.add_command(filemanifest2jsonld)
cli.add_command(gff2jsonld)
cli.add_command(download_ncbi_taxonomy)

if __name__ == '__main__':
    cli()
107 changes: 107 additions & 0 deletions bkbit/data_translators/README.md
@@ -95,4 +95,111 @@ ls .
DO-XIQQ6047.jsonld
DO-WFFF3774.jsonld
DO-RMRL6873.jsonld
# genome_annotation_translator.py

## Overview
genome_annotation_translator reads annotated genome data in GFF3 format and generates data objects representing genes, genome assemblies, and organisms. All data objects are defined in the [Genome Annotation Schema](https://brain-bican.github.io/models/index_genome_annotation/).<br>
Each JSON-LD file contains:
- GeneAnnotation objects
- 1 GenomeAnnotation object
- 1 GenomeAssembly object
- 1 OrganismTaxon object
- 1 Checksum object
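
One way to check an output file against the list above is to tally its objects by type. The sketch below is a minimal illustration under stated assumptions: it assumes the objects sit in a top-level `@graph` list and that each object names its type under a `category` key. Neither key is confirmed by this README, so both are parameters, and `count_objects` is a hypothetical helper rather than part of bkbit. The sample document is made up for the demonstration.

```python
import re
from collections import Counter
from typing import Any, Dict, List


def type_names(node: Dict[str, Any], type_key: str) -> List[str]:
    """Return the short type names of one object, e.g. 'bican:GeneAnnotation' -> 'GeneAnnotation'."""
    value = node.get(type_key, [])
    if isinstance(value, str):
        value = [value]
    # Keep only the part after the last ':', '/' or '#', so prefixed names and IRIs both work.
    return [re.split(r"[:/#]", str(item))[-1] for item in value]


def count_objects(
    document: Dict[str, Any],
    graph_key: str = "@graph",   # ASSUMPTION: objects are listed under this key
    type_key: str = "category",  # ASSUMPTION: each object declares its type under this key
) -> Counter:
    """Count objects per type name in a parsed JSON-LD document."""
    counts: Counter = Counter()
    for node in document.get(graph_key, []):
        for name in type_names(node, type_key):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample standing in for json.load(open("output.jsonld")).
    sample = {
        "@graph": [
            {"category": ["bican:GeneAnnotation"]},
            {"category": ["bican:GeneAnnotation"]},
            {"category": ["bican:GenomeAnnotation"]},
            {"category": ["bican:GenomeAssembly"]},
            {"category": ["bican:OrganismTaxon"]},
            {"category": ["bican:Checksum"]},
        ]
    }
    for name, total in sorted(count_objects(sample).items()):
        print(f"{name}: {total}")
```

If the real output uses different keys, pass them through `graph_key` and `type_key`.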



## Command Line
### gen-geneannotation
```bash
gen-geneannotation [OPTIONS] GFF3_URL
```

#### Options
<span style="color: red;">-a, --assembly_accession</span> <br>
&emsp;ID assigned to the genomic assembly used in the GFF3 file. <br>
&emsp;<b>*Note*</b>: Must be provided when using ENSEMBL GFF3 files.

<span style="color: red;">-s, --assembly_strain</span> <br>
&emsp;Specific strain of the organism associated with the GFF3 file.

<span style="color: red;">-l, --log_level</span> <br>
&emsp;Logging level. <br>
&emsp;DEFAULT:<br>
&emsp;&emsp;'WARNING'<br>
&emsp;OPTIONS:<br>
&emsp;&emsp;DEBUG | INFO | WARNING | ERROR | CRITICAL

<span style="color: red;">-f, --log_to_file</span> <br>
&emsp;Log to a file instead of the console. <br>
&emsp;DEFAULT:<br>
&emsp;&emsp;False <br>
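
The option semantics above (optional values, the allowed log levels, the boolean flag and its default) can be sketched with the standard library's `argparse`. This is an illustration only: the package's own `bkbit/cli.py` is built on click, and `build_parser` is a hypothetical name that does not exist in bkbit. The URL in the demo is a placeholder.

```python
import argparse

# Log levels as listed in the Options section.
LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper, not bkbit code)."""
    parser = argparse.ArgumentParser(prog="gen-geneannotation")
    parser.add_argument("gff3_url", metavar="GFF3_URL", help="URL of the GFF3 file.")
    parser.add_argument(
        "-a", "--assembly_accession", default=None,
        help="ID assigned to the genomic assembly used in the GFF3 file.",
    )
    parser.add_argument(
        "-s", "--assembly_strain", default=None,
        help="Specific strain of the organism associated with the GFF3 file.",
    )
    parser.add_argument(
        "-l", "--log_level", choices=LOG_LEVELS, default="WARNING",
        help="Logging level.",
    )
    parser.add_argument(
        "-f", "--log_to_file", action="store_true",
        help="Log to a file instead of the console.",
    )
    return parser


if __name__ == "__main__":
    # Placeholder URL; only the option handling is being demonstrated.
    args = build_parser().parse_args(
        ["-a", "GCF_003339765.1", "-l", "INFO", "https://example.org/sample.gff3.gz"]
    )
    print(vars(args))
```

Restricting `--log_level` with `choices` rejects misspelled levels at parse time instead of failing later when logging is configured.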

## Examples
#### Example 1: NCBI GFF3 File

```bash
pip install bkbit

gen-geneannotation 'https://ftp.ncbi.nlm.nih.gov/genomes/all/annotation_releases/9823/106/GCF_000003025.6_Sscrofa11.1/GCF_000003025.6_Sscrofa11.1_genomic.gff.gz' > output.jsonld
```

#### Example 2: ENSEMBL GFF3 File

```bash
pip install bkbit

gen-geneannotation -a 'GCF_003339765.1' 'https://ftp.ensembl.org/pub/release-104/gff3/macaca_mulatta/Macaca_mulatta.Mmul_10.104.gff3.gz' > output.jsonld
```
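
The same invocations can also be assembled from Python, which is convenient when translating several GFF3 files in a loop. The sketch below assumes bkbit is installed and `gen-geneannotation` is on `PATH`. `build_command` and `run_to_file` are hypothetical helper names; only the command name, the `-a` and `-s` options, and the redirection of standard output to a file come from the examples above.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(
    gff3_url: str,
    assembly_accession: Optional[str] = None,
    assembly_strain: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for gen-geneannotation (hypothetical helper)."""
    command = ["gen-geneannotation"]
    if assembly_accession:
        command += ["-a", assembly_accession]
    if assembly_strain:
        command += ["-s", assembly_strain]
    command.append(gff3_url)
    return command


def run_to_file(command: List[str], output_path: str) -> None:
    """Run the command and write its standard output to output_path, like `> output.jsonld`."""
    with open(output_path, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if the tool exits with a non-zero status.
        subprocess.run(command, stdout=out, check=True)


if __name__ == "__main__":
    command = build_command(
        "https://ftp.ensembl.org/pub/release-104/gff3/macaca_mulatta/Macaca_mulatta.Mmul_10.104.gff3.gz",
        assembly_accession="GCF_003339765.1",
    )
    # Print the shell-equivalent command; call run_to_file(command, "output.jsonld") to execute it.
    print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues with URLs.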