Commit
Lots of changes. Added clinspacy_init() and changed how the package loads. Uses miniconda by default but the user can configure.
Singh authored and Singh committed Aug 21, 2020
1 parent a873190, commit f73fa35
Showing 13 changed files with 774 additions and 139 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,41 +1,50 @@ | ||
#' Cui2vec concept embeddings | ||
#' | ||
#' This dataset contains sample medical transcriptions for various medical specialties. | ||
#' This dataset contains Unified Medical Language System (UMLS) concept embeddings from | ||
#' Andrew Beam's \href{https://github.com/beamandrew/cui2vec}{cui2vec R package}. There are | ||
#' 500 embeddings included for each concept. | ||
#' | ||
#' Acknowledgements | ||
#' Citation | ||
#' | ||
#' This data was scraped from mtsamples.com by Tara Boyle and is made available | ||
#' under a CC0: Public Domain license. | ||
#' Beam, A.L., Kompa, B., Schmaltz, A., Fried, I., Griffin, W., Palmer, N.P., Shi, X., | ||
#' Cai, T., and Kohane, I.S., 2019. Clinical Concept Embeddings Learned from Massive | ||
#' Sources of Multimodal Medical Data. arXiv preprint arXiv:1804.01486. | ||
#' | ||
#' @format A data frame with 4999 rows and 6 variables: | ||
#' License | ||
#' | ||
#' This data is made available under a | ||
#' \href{https://creativecommons.org/licenses/by/4.0/}{CC BY 4.0 license}. The only change | ||
#' made to the original dataset is the renaming of columns. | ||
#' | ||
#' @format A data frame with 109053 rows and 501 variables: | ||
#' \describe{ | ||
#' \item{note_id}{A unique identifier for each note} | ||
#' \item{description}{A description or chief concern} | ||
#' \item{medical_specialty}{Medical specialty of the note} | ||
#' \item{sample_name}{mtsamples.com note name} | ||
#' \item{transcription}{Transcription of note text} | ||
#' \item{keywords}{Keywords} | ||
#' \item{cui}{A Unified Medical Language System (UMLS) Concept Unique Identifier (CUI)} | ||
#' \item{emb_001}{Concept embedding vector #1} | ||
#' \item{emb_002}{Concept embedding vector #2} | ||
#' \item{...}{...} | ||
#' \item{emb_500}{Concept embedding vector #500} | ||
#' } | ||
#' @source \url{https://www.kaggle.com/tboyle10/medicaltranscriptions/data} | ||
#' @source \url{https://figshare.com/s/00d69861786cd0156d81} | ||
'cui2vec_embeddings' | ||
|
||
#' Cui2vec concept definitions | ||
#' | ||
#' This dataset contains sample medical transcriptions for various medical specialties. | ||
#' This dataset contains definitions for the Unified Medical Language System (UMLS) | ||
#' Concept Unique Identifiers (CUIs). These come from Andrew Beam's | ||
#' \href{https://github.com/beamandrew/cui2vec}{cui2vec R package}. | ||
#' | ||
#' Acknowledgements | ||
#' License | ||
#' | ||
#' This data was scraped from mtsamples.com by Tara Boyle and is made available | ||
#' under a CC0: Public Domain license. | ||
#' This data is made available under a | ||
#' \href{https://github.com/beamandrew/cui2vec/blob/master/LICENSE.md}{MIT license}. The data | ||
#' is copyrighted in 2019 by Benjamin Kompa, Andrew Beam, and Allen Schmaltz. The only change | ||
#' made to the original dataset is the renaming of columns. | ||
#' | ||
#' @format A data frame with 4999 rows and 6 variables: | ||
#' @format A data frame with 3053795 rows and 3 variables: | ||
#' \describe{ | ||
#' \item{note_id}{A unique identifier for each note} | ||
#' \item{description}{A description or chief concern} | ||
#' \item{medical_specialty}{Medical specialty of the note} | ||
#' \item{sample_name}{mtsamples.com note name} | ||
#' \item{transcription}{Transcription of note text} | ||
#' \item{keywords}{Keywords} | ||
#' \item{cui}{A Unified Medical Language System (UMLS) Concept Unique Identifier (CUI)} | ||
#' \item{semantic_type}{Semantic type of the CUI} | ||
#' \item{definition}{Definition of the CUI} | ||
#' } | ||
#' @source \url{https://www.kaggle.com/tboyle10/medicaltranscriptions/data} | ||
#' @source \url{https://github.com/beamandrew/cui2vec} | ||
'cui2vec_definitions' |
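
The two documented data frames share a `cui` column, so embeddings can be compared and then labelled with their definitions. The following is a minimal sketch of that idea, not code from this commit: it uses small made-up data frames that only mimic the documented column layout (`cui`, `emb_001`, ... and `cui`, `semantic_type`, `definition`). The CUIs, values, definitions, and the helper `cosine_similarity()` are all hypothetical, and only three embedding columns are used instead of the documented 500.

```r
# Toy stand-in for cui2vec_embeddings: a `cui` column plus embedding columns.
# All CUIs and values below are made up for illustration.
toy_embeddings <- data.frame(
  cui     = c("C0000001", "C0000002", "C0000003"),
  emb_001 = c(1, 2, 0),
  emb_002 = c(0, 0, 1),
  emb_003 = c(1, 2, 0),
  stringsAsFactors = FALSE
)

# Toy stand-in for cui2vec_definitions (cui, semantic_type, definition).
toy_definitions <- data.frame(
  cui           = c("C0000001", "C0000002", "C0000003"),
  semantic_type = c("Type A", "Type A", "Type B"),
  definition    = c("Example concept 1", "Example concept 2", "Example concept 3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cosine similarity between two numeric vectors.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Select the embedding columns by their documented `emb_` prefix and
# index the resulting matrix by CUI.
emb_cols <- grep("^emb_", names(toy_embeddings), value = TRUE)
emb_matrix <- as.matrix(toy_embeddings[, emb_cols])
rownames(emb_matrix) <- toy_embeddings$cui

# Rows 1 and 2 point in the same direction; rows 1 and 3 are orthogonal.
sim_12 <- cosine_similarity(emb_matrix["C0000001", ], emb_matrix["C0000002", ])
sim_13 <- cosine_similarity(emb_matrix["C0000001", ], emb_matrix["C0000003", ])

# Attach definitions to the embeddings through the shared `cui` key.
labelled <- merge(toy_embeddings, toy_definitions, by = "cui")
```

With the packaged data, the same steps would presumably apply to `cui2vec_embeddings` and `cui2vec_definitions` directly, though the merge would be far larger given the documented row counts.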