Code for analyzing single-cell RNA-seq data from Li et al, Cell Stem Cell 2017
fastqdump_demux_kallisto.snakefile is a snakemake file that downloads the raw data from Li et al, Cell Stem Cell 2017 and pseudoaligns it with kallisto. It writes a separate gene x cell matrix file for each run (each run corresponds to one dissected/sorted embryo); however, the downstream analyses need a single combined matrix. The combined matrix is provided at: http://pagelab.wi.mit.edu/page/papers/Nicholls_et_al_2017/alldat.noembryo.countmat.txt
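For readers who want to rebuild a combined matrix from the per-run files rather than download it, the following is a minimal sketch of one way to do it. It is not the script that produced the file linked above. It assumes each per-run file is tab-delimited, with a header row whose first field labels the gene column and whose remaining fields are cell IDs, followed by one row per gene. The function names, the run-name prefix on cell IDs, and the choice to fill genes absent from a run with 0 are all hypothetical.

```python
import csv
import os


def read_matrix(path):
    """Read one tab-delimited gene x cell matrix (assumed layout, see above)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        cells = header[1:]  # assumption: first header field labels the gene column
        counts = {row[0]: row[1:] for row in reader if row}
    return cells, counts


def combine_matrices(paths):
    """Column-bind per-run matrices on gene ID, filling absent genes with "0"."""
    all_cells, per_run, genes, seen = [], [], [], set()
    for path in paths:
        # Hypothetical convention: prefix cell IDs with the file's base name
        # so that cells from different runs cannot collide.
        run = os.path.splitext(os.path.basename(path))[0]
        cells, counts = read_matrix(path)
        all_cells.extend(f"{run}_{cell}" for cell in cells)
        per_run.append((len(cells), counts))
        for gene in counts:
            if gene not in seen:  # keep genes in first-seen order
                seen.add(gene)
                genes.append(gene)
    combined = {}
    for gene in genes:
        row = []
        for n_cells, counts in per_run:
            row.extend(counts.get(gene, ["0"] * n_cells))
        combined[gene] = row
    return all_cells, combined


def write_matrix(path, cells, combined):
    """Write the combined matrix in the same tab-delimited layout."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["gene"] + cells)
        for gene, row in combined.items():
            writer.writerow([gene] + row)
```

Calling `combine_matrices` on the list of per-run files and passing the result to `write_matrix` yields one genes x cells file. Check the column naming against what define_clusters_de.Rmd expects before using the output in place of the provided matrix.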
define_clusters_de.Rmd takes the full genes x cells count matrix and performs filtering, normalization, and clustering of single cells using Seurat. It then uses these clusters to perform differential expression (using SCDE) between male and female germ cells pre- and post-entry into the genital ridge. It also compares the latter group (post-entry into the genital ridge but earlier in development) to late gonadal cells, again separately in males and females. Finally, it calculates germ cell-specificity for all genes based on the estimates of cluster expression from SCDE. Two additional files must be in the directory from which it is run: gencode.v24.annotation.basic_ccds_nopar.gene_tx_annotable.txt (needed only if filtering for protein-coding genes is desired) and run_info.short.txt (used to annotate the age of the embryos used for each scRNA-seq run).
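Before running define_clusters_de.Rmd, it can help to confirm that the two auxiliary files are in the working directory. The snippet below is a small sketch of such a check and is not part of this repository. The helper name `missing_inputs` is hypothetical, and only the two file names come from the description above.

```python
import os

# File names taken from the description above; the annotation table is
# only needed when filtering for protein-coding genes.
REQUIRED_FILES = [
    "gencode.v24.annotation.basic_ccds_nopar.gene_tx_annotable.txt",
    "run_info.short.txt",
]


def missing_inputs(directory=".", required=REQUIRED_FILES):
    """Return the required file names that are not present in `directory`."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files: " + ", ".join(missing))
    else:
        print("All auxiliary input files found.")
```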