CIVIC, NCI Thesaurus, MitelmanDB and Depmap update
sigven committed Aug 7, 2024
1 parent a06cc33 commit b94f2e7
Showing 12 changed files with 64 additions and 23 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -2,6 +2,7 @@ Package: pharmOncoX
Type: Package
Title: Molecularly targeted cancer drugs and biomarkers
Version: 1.6.10
URL: https://sigven.github.io/pharmOncoX
Authors@R:
c(person(given = "Sigve",
family = "Nakken",
@@ -19,7 +20,6 @@ Description: This data package collects anticancer drug information from
also allows for the retrieval of curated biomarkers from multiple
freely available resources (CIViC, CGI, Mitelman database).
License: MIT + file LICENSE
URL: https://github.com/sigven/pharmOncoX
BugReports: https://github.com/sigven/pharmOncoX/issues
Depends:
R (>= 4.1.0)
19 changes: 13 additions & 6 deletions NEWS.md
@@ -1,3 +1,10 @@
# Version 1.7.0

* CIViC update (20240807)
* NCI Thesaurus update (24.07e)
* MitelmanDB update (20240715)
* New dataset: DepMap (cell line) RNA fusion events

# Version 1.6.10

* Fixed some erroneous drug classifications
@@ -6,24 +13,24 @@

# Version 1.6.8 (June 7th 2024)

* NCI update (24.05d)
* NCI Thesaurus update (24.05d)

# Version 1.6.7 (May 23rd 2024)

* NCI update (24.04e)
* Updated CIViC (20240523)
* NCI Thesaurus update (24.04e)
* CIViC update (20240523)

# Version 1.6.4 (April 30th 2024)

* Improved clinical (tumor site) annotations of fusions from MitelmanDB

# Version 1.6.3 (April 26th 2024)

* Updated CIViC (20240426)
* CIViC update (20240426)

# Version 1.6.2 (April 12th 2024)

* Updated NCI Thesaurus (24.03d)
* NCI Thesaurus update (24.03d)

# Version 1.6.1 (March 26th 2024)

@@ -39,7 +46,7 @@

# Version 1.5.8 (February 6th 2024)

* NCI Thesaurus 24.01e
* NCI Thesaurus update (24.01e)

# Version 1.5.7 (February 3rd 2024)

Binary file modified R/sysdata.rda
Binary file not shown.
2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@

# pharmOncoX <a href="https://sigven.github.io/pharmOncoX/"><img src="man/figures/logo.png" align="right" height="104" width="90"/></a>

**pharmOncoX** is an R package that provides access to targeted and non-targeted cancer drugs, and genomic cancer biomarkers. Cancer drugs include comprehensive annotations per target, drug mechanism-of-action, approval dates, clinical trial phases for various indications etc. Drugs are further classified according to the [Anatomical Therapeutic Chemical (ATC) Classification System](https://www.whocc.no/atc_ddd_index/), enabling a filtering of cancer drugs according to their main types of action.
**pharmOncoX** is an R package that provides access to targeted and non-targeted cancer drugs, and genomic cancer biomarkers. Cancer drugs include comprehensive annotations per target, drug mechanism-of-action, approval dates, clinical trial phases for various indications etc. Drugs are further classified according to the [Anatomical Therapeutic Chemical (ATC) Classification System](https://www.whocc.no/atc_ddd_index/), enabling a filtering of cancer drugs according to their main types of action. The package also provides access to data on actionable genomic aberrations (i.e. molecular biomarkers), including gene fusions, mutations, copy number alterations, and expression biomarkers.


## Getting started
24 changes: 21 additions & 3 deletions data-raw/biomarker_utilities.R
@@ -430,7 +430,7 @@ expand_hgvs_terms <- function(var, aa_dict, add_codon_markers = FALSE) {
}

load_civic_biomarkers <- function(
datestamp = '20240130',
datestamp = '20240709',
compound_synonyms = NULL,
hg38_fasta =
"/Users/sigven/research/DB/hg38/hg38.fa",
@@ -2630,8 +2630,26 @@ load_custom_fusion_db <- function() {
return(biomarker_items)
}

load_depmap_fusions <- function(db_datestamp = "24Q2"){

# Load DepMap fusions
depmap_data <- list()
depmap_data[['fusions']] <- as.data.frame(read.csv(
file = "data-raw/depmap/OmicsFusionFiltered.csv", header = T))

depmap_data[['models']] <- as.data.frame(read.csv(
file = "data-raw/depmap/Model.csv", header = T)) |>
dplyr::select(
ModelID, CellLineName, OncotreeLineage,
OncotreePrimaryDisease, OncotreeCode,
Age, Sex, PrimaryOrMetastasis, SampleCollectionSite,
SourceType
)
return(depmap_data)
}
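
A minimal sketch of how the two tables returned by `load_depmap_fusions()` might be combined downstream, assuming the fusion table carries a `ModelID` column that matches the one selected from `Model.csv`. The toy data frames, the `FusionName` column, and all values are illustrative assumptions, not contents of the DepMap files, and this commit does not show how the package itself joins them.

```r
# Toy stand-ins for the list returned by load_depmap_fusions();
# the FusionName column and all values are hypothetical.
depmap_data <- list(
  fusions = data.frame(
    ModelID = c("ACH-000001", "ACH-000002", "ACH-000002"),
    FusionName = c("GENE1--GENE2", "GENE3--GENE4", "GENE5--GENE6"),
    stringsAsFactors = FALSE
  ),
  models = data.frame(
    ModelID = c("ACH-000001", "ACH-000002"),
    CellLineName = c("LineA", "LineB"),
    OncotreeLineage = c("Lung", "Breast"),
    stringsAsFactors = FALSE
  )
)

# Attach cell line annotations to each fusion event
# (left join on ModelID, keeping every fusion row)
annotated_fusions <- merge(
  depmap_data[["fusions"]],
  depmap_data[["models"]],
  by = "ModelID",
  all.x = TRUE
)

# Count fusion events per tumor lineage
fusions_per_lineage <- table(annotated_fusions$OncotreeLineage)
print(fusions_per_lineage)
```

Base R `merge()` is used here only to keep the sketch free of dependencies; a `dplyr::left_join()` on `ModelID` would fit the style of the surrounding code equally well.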

load_mitelman_db <- function(cache_dir = NA,
db_datestamp = "20240415") {
db_datestamp = "20240715") {

# Load Mitelman database
# dos2unix -q -n MBCA.TXT.DATA MBCA.TXT
@@ -2672,7 +2690,7 @@ load_mitelman_db <- function(cache_dir = NA,

fusion_event_data <- as.data.frame(readr::read_tsv(
file = file.path(
cache_dir, "mitelmandb", "MBCA.TXT"),
cache_dir, "mitelmandb", "MBCA.TXT.DATA"),
show_col_types = F, guess_max = 100000)) |>
dplyr::filter(stringr::str_detect(GeneShort,"::")) |>
dplyr::rename(variant = GeneShort,
2 changes: 1 addition & 1 deletion data-raw/custom_drug_target_regex_nci.tsv
@@ -1,6 +1,6 @@
pattern symbol
OMX-0407 SIK3
RMC-9085 KRAS
RMC-9085|Olomorasib KRAS
3706674 KRAS
Rineterkib ERK1
Rineterkib ERK2
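
To illustrate what the alternation in `RMC-9085|Olomorasib` achieves, here is a rough sketch, assuming each `pattern` is applied as a regular expression against drug names. How `pharmOncoX` actually consumes this file is not shown in this diff, and the `match_targets()` helper is hypothetical.

```r
# Two rows from the mapping file, reproduced as a data frame
target_regex <- data.frame(
  pattern = c("OMX-0407", "RMC-9085|Olomorasib"),
  symbol = c("SIK3", "KRAS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the target symbols whose pattern
# matches the given drug name (case-insensitive regex match)
match_targets <- function(drug_name, regex_map) {
  hits <- vapply(
    regex_map$pattern,
    function(p) grepl(p, drug_name, ignore.case = TRUE),
    logical(1)
  )
  regex_map$symbol[hits]
}

# Both the code name and the generic name resolve to the same target
match_targets("Olomorasib", target_regex)
match_targets("RMC-9085", target_regex)
```

With a single alternation pattern, one row covers both names of the compound, so no second `KRAS` row is needed.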
6 changes: 4 additions & 2 deletions data-raw/data-raw.R
@@ -21,7 +21,7 @@ opentargets_version <-
metadata$compounds[metadata$compounds$source_abbreviation == "opentargets",
"source_version"]
package_datestamp <- stringr::str_replace_all(Sys.Date(),"-","")
chembl_pubchem_datestamp <- '20240429'
chembl_pubchem_datestamp <- '20240708'

## set logging layout
lgr::lgr$appenders$console$set_layout(
@@ -74,7 +74,7 @@ drug_sets <- list()
## Get all anticancer drugs, NCI thesaurus + DGIdb
drug_sets[['nci']] <- get_nci_drugs(
nci_db_release = nci_db_release,
overwrite = F,
overwrite = T,
path_data_raw = path_data_raw,
path_data_processed = path_data_tmp_processed)

@@ -188,6 +188,8 @@ raw_biomarkers[['mitelmandb']] <-
cache_dir = file.path(path_data_raw, "biomarkers"))
raw_biomarkers[['custom_fusions']] <-
load_custom_fusion_db()
raw_biomarkers[['depmap']] <-
load_depmap_fusions()

raw_biomarkers[['custom_fusions']]$variant <-
raw_biomarkers[['custom_fusions']]$variant |>
2 changes: 2 additions & 0 deletions data-raw/drug_name_black_list.txt
@@ -4,6 +4,8 @@ Butanilicaine Hydrochloride
Tesmilifene Hydrochloride
8H9 131I
ABC-294640
Neladenoson Bialanate
Sodium Caseinate
AXL-1717
AZD-7451
TAS-115
6 changes: 4 additions & 2 deletions data-raw/drug_utilities.R
@@ -1750,8 +1750,10 @@ map_curated_targets <- function(gene_info = NULL,
hit$drug_approved_noncancer <- FALSE

## set general indications for unknown cases
if(is.na(hit$disease_efo_id) & is.na(hit$disease_efo_label) &
is.na(hit$cui) & is.na(hit$cui_name)){
if(is.na(hit$disease_efo_id) &
is.na(hit$disease_efo_label) &
is.na(hit$cui) &
is.na(hit$cui_name)){
hit$disease_efo_id = "EFO:0000311"
hit$disease_efo_label = "cancer"
hit$cui = "C0006826"
Binary file modified data-raw/metadata_pharm_oncox.xlsx
Binary file not shown.
10 changes: 6 additions & 4 deletions pkgdown/index.md
@@ -2,20 +2,22 @@

# pharmOncoX <a href="https://sigven.github.io/pharmOncoX/"><img src="man/figures/logo.png" align="right" height="104" width="90"/></a>

**pharmOncoX** provides access to targeted and non-targeted cancer drugs, including comprehensive annotations per target, drug mechanism-of-action, approval dates, clinical trial phases for various indications etc.
**pharmOncoX** provides access to targeted and non-targeted cancer drugs, including comprehensive annotations per target, drug mechanism-of-action, approval dates, clinical trial phases for various indications etc. It also provides access to data on actionable genomic aberrations (i.e. molecular biomarkers), including gene fusions, mutations, copy number alterations, and expression biomarkers.

The data is largely based on drug-target-indication associations provided by the [Open Targets Platform](https://targetvalidation.org) ([Ochoa et al., Nucleic Acids Res., 2021](https://doi.org/10.1093/nar/gkaa1027)). Associations retrieved from Open Targets Platform are integrated with cancer-relevant indications/conditions (as provided in [sigven/phenOncoX](https://github.com/sigven/phenOncoX)), allowing the user to retrieve drugs indicated for main tumor types (e.g. `Lung`, `Colon/Rectum` etc.)

Drug-target associations from the Open Targets Platform have furthermore been integrated and appended with drug information from [NCI Thesaurus](https://ncithesaurus.nci.nih.gov/ncitbrowser/), showing also non-targeted cancer drugs (chemotherapeutic agents etc.), and various drug regimens.

_pharmOncoX_ provides anti-cancer drug classification through existing entries in the [Anatomical Therapeutic Chemical (ATC) Classification System](https://www.whocc.no/atc_ddd_index/), and these have been extended significantly with manual curation, also by establishing novel drug categories that are presently missing in the ATC classificiation tree (examples include _AURK inhibitors_, _MET inhibitors_, _BET inhibitors_, _AKT inhibitors_, _PLK inhibitors_, _IAP inhibitors_, _RAS inhibitors_, _BCL2 inhibitors_ etc.) enabling a filtering of drugs according to their main mechanisms of action.

Currently (as of early June 2024), `pharmOncoX` is built upon the following
Currently (as of early August 2024), `pharmOncoX` is built upon the following
releases of external databases:

- Open Targets Platform (2024.03)
- Open Targets Platform (2024.06)
- ChEMBL (v34)
- NCI Thesaurus (24.05d)
- NCI Thesaurus (24.07e)
- MitelmanDB (20240715)
- CIViC (20240807)

### Getting started

14 changes: 11 additions & 3 deletions vignettes/pharmOncoX.Rmd
@@ -28,7 +28,7 @@ knitr::opts_chunk$set(
```{r install, echo = T, eval = T}
if (!("remotes" %in% installed.packages())) {
install.packages("remotes")
install.packages("remotes")
}
remotes::install_github('sigven/pharmOncoX')
@@ -190,7 +190,11 @@ drugs <- drugs |>
dplyr::mutate(
disease_indication = stringr::str_replace_all(
disease_indication, "\\|",", ")
)
) |>
dplyr::select(
drug_id, drug_name, drug_type, molecule_chembl_id,
drug_action_type, target_symbol, dplyr::everything()
)
dt_drugtable_ras_inhibitors <- DT::datatable(
drugs,
@@ -472,7 +476,11 @@ drugs$records <- drugs$records |>
dplyr::mutate(
disease_indication = stringr::str_replace_all(
disease_indication, "\\|",", ")
)
) |>
dplyr::select(
drug_id, drug_name, drug_type, molecule_chembl_id,
drug_action_type, opentargets, dplyr::everything()
)
dt_drugtable_platins <- DT::datatable(
drugs$records,