programmingprincess/tumor-origin

Neural networks for detecting tumor origin

MicroRNA results available in analysis.ipynb, viewable here

DNAm results available in analysis-dnam.ipynb, viewable here

Workflow:

This project uses virtualenv to create isolated Python environments.

MicroRNA

  • Download isoforms from 17 different classes of cancer from TCGA
In R, on nano cluster
  • Put all samples of the same type into a matrix using rptashkin's TCGA_miRNASeq_Matrix (rows are features; columns are samples)
  • Merge matrices
  • Transpose
  • Randomize, split labels
In Python, on nano cluster
  • Select features based on low NA-values
  • Put all samples of the same type into a matrix using rptashkin's TCGA_miRNASeq_Matrix (rows are features; columns are samples)
  • Merge matrices
  • Transpose
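
The merge, transpose, NA-filter, and randomize steps above can be sketched in pandas. This is a minimal illustration, not the repository's actual script: the function name `build_dataset`, the 10% NA cutoff, and the class labels and miRNA names in the usage example are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def build_dataset(per_class, max_na_fraction=0.1, seed=0):
    """Merge per-class matrices (features x samples) into shuffled X, y.

    per_class maps a class label to a DataFrame whose rows are features
    and whose columns are sample IDs. The NA cutoff is an assumed value.
    """
    frames = list(per_class.values())
    # Merge: align on feature names (rows) and place samples side by side.
    merged = pd.concat(frames, axis=1, join="outer")
    # One label per sample column, in the same order as the concatenation.
    labels = pd.Series(
        [label for label, m in per_class.items() for _ in m.columns],
        index=merged.columns,
        name="label",
    )
    # Transpose: rows become samples, columns become features.
    X = merged.T
    # Keep only features with a low fraction of missing values.
    X = X.loc[:, X.isna().mean(axis=0) <= max_na_fraction]
    # Randomize the sample order while keeping labels aligned.
    order = np.random.default_rng(seed).permutation(len(X))
    return X.iloc[order], labels.iloc[order]


# Hypothetical toy input: two classes, three features, three samples.
features = ["mir-1", "mir-2", "mir-3"]
a = pd.DataFrame({"s1": [1.0, 2.0, np.nan], "s2": [3.0, 4.0, np.nan]}, index=features)
b = pd.DataFrame({"s3": [5.0, 6.0, 7.0]}, index=features)
X, y = build_dataset({"BRCA": a, "LUAD": b})
```

Here `mir-3` is missing in two of three samples, so it is dropped by the NA filter, and `y` carries one class label per row of `X`.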
Jupyter notebook
  • Test random forest, knn, and svm baselines
  • Visualize keras tuning data from cluster
  • Attempt cross validation
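
As a rough sketch of the baseline comparison listed above, the snippet below cross-validates a random forest, KNN, and SVM with scikit-learn. The data is synthetic (three classes rather than the 17 cancer types), and the hyperparameters, the scaling step, and the 5-fold setup are assumptions for illustration. They are not taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the expression matrix (samples x features).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10,
    n_classes=3, random_state=0,
)

# Distance- and margin-based models get a scaling step; the forest does not need one.
baselines = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Stratified folds keep class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv).mean()
    for name, model in baselines.items()
}
for name, score in scores.items():
    print(f"{name}: mean accuracy {score:.3f}")
```

Wrapping the scaler in a pipeline means it is refit on the training portion of each fold, which avoids leaking test-fold statistics into the model.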

DNA Methylation

  • Download 27k Illumina samples from TCGA using TCGA2STAT
In R, on nano cluster
  • Get data from TCGA using tcga2stat.R
  • Select features based on low NA-values
  • Select for high variability (20th-80th percentile)
  • Merge samples into one data matrix
  • Randomize, split labels
In Python, on nano cluster
  • Baseline models to gauge accuracy before feature selection
  • Tune nnet hyperparameters
Jupyter Notebook
  • Visualize tuning data
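
The two feature-selection steps in the DNA methylation workflow (low NA-values, then a variability band) could look roughly like this in pandas. The repository does this step in R, so the code is only an illustrative Python equivalent. It assumes that "20-80 percentile" means keeping probes whose variance falls between the 20th and 80th percentile of all probe variances. The function name, the 10% NA cutoff, and the probe names are hypothetical.

```python
import numpy as np
import pandas as pd


def select_features(X, max_na_fraction=0.1, low_pct=20, high_pct=80):
    """Filter columns of X (samples x features) by missingness, then variance."""
    # Step 1: drop features with too many missing values (cutoff is assumed).
    X = X.loc[:, X.isna().mean(axis=0) <= max_na_fraction]
    # Step 2: keep features whose variance lies inside the percentile band.
    variances = X.var(axis=0, skipna=True)
    lo, hi = np.percentile(variances, [low_pct, high_pct])
    return X.loc[:, (variances >= lo) & (variances <= hi)]


# Synthetic example: 100 samples, 10 probes with increasing spread.
rng = np.random.default_rng(0)
scales = np.arange(1, 11)
X = pd.DataFrame(
    rng.normal(size=(100, 10)) * scales,
    columns=[f"probe_{i:02d}" for i in range(10)],
)
X.iloc[:50, 0] = np.nan  # probe_00 is missing in half of the samples
selected = select_features(X)
print(selected.columns.tolist())
```

In this toy run `probe_00` is removed by the NA filter, and the variance band then discards the least and most variable of the remaining probes.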

Todo:

  • Does feature selection improve random forest model?
  • Does feature selection improve NNet model?
  • Scaling (0,1)
  • Try KNN, SVM, baselines
  • High variability feature selection
  • Process methylation data
  • Import additional metastatic datasets
  • Attempt on non-TCGA datasets

References

This work utilizes resources supported by the National Science Foundation's Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign.
