Update: a new function performs faster and more robust basal component identification with a single command
list <- SharedBasalComponents(data_list,feature,reduct.dim = 30,W.top=2.5)
We applied the algorithm to pancreatic islet datasets. Compared with the original ICAnet (first row), the newest version captures more biological variation (second row)
Users can download the newest version of ICAnet with
devtools::install_github("WWXKenmo/ICAnet")
Independent Component Analysis based gene co-expression Network inference (ICAnet) to decipher functional modules for better single cell clustering and batch integration
We introduced independent component analysis (ICA) into single-cell clustering to decompose the gene expression matrix into a number of independent components. Each independent component is characterized by a co-expression pattern and is associated with a meaningful biological pathway. This concept enables ICAnet to identify shared gene co-expression modules across different batches of datasets. Different batches of scRNA-seq datasets derived from the same cell type do not exhibit exactly the same gene expression patterns, but their key co-expression modules tend to remain similar. Based on this idea, ICAnet pairs the same sub-population of cells across batches, regardless of library type, sequencing platform or other influences. These features make ICAnet perform better in cell clustering and in integrative analysis of scRNA-seq datasets from different batches.
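To make the decomposition step concrete, here is a minimal sketch using the fastICA package (one of the CRAN dependencies of ICAnet) on a synthetic genes x cells matrix. It is not ICAnet's internal implementation: the matrix sizes, the number of components and the cutoff of 2.5 standard deviations used to pick module genes are illustrative assumptions.

```r
library(fastICA)

set.seed(1)
n_genes <- 200
n_cells <- 50

# Synthetic expression matrix: genes in rows, cells in columns (illustrative only).
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))

# Decompose into 3 components: S holds gene weights (genes x components),
# A holds the activity of each component in each cell (components x cells).
ica <- fastICA(expr, n.comp = 3)

gene_weights <- ica$S
rownames(gene_weights) <- rownames(expr)

# Hypothetical module definition: genes whose weight on a component lies more
# than 2.5 standard deviations from that component's mean weight.
z <- scale(gene_weights)
modules <- lapply(seq_len(ncol(z)), function(k) rownames(z)[abs(z[, k]) > 2.5])
```

On random data the resulting modules carry no biological meaning; the sketch only shows how one component yields one weighted gene list.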
For usage examples and guided walkthroughs, check the vignettes
directory of the repo.
Packages from Bioconductor: AUCell, RcisTarget, MineICA, STRINGdb, isva, clusterProfiler (required by RSCORE), genesorteR (required by RSCORE)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
## Required
BiocManager::install(c("AUCell", "STRINGdb", "MineICA","RcisTarget","isva","clusterProfiler","remotes"))
BiocManager::install("mahmoudibrahim/genesorteR")
Packages from CRAN: Seurat, cluster, coop, fastICA, ica, igraph, isva, pheatmap, rARPACK, networkD3, doMC, propr (required by RSCORE), network (required by RSCORE), intergraph (required by RSCORE)
install.packages(c("Seurat", "cluster", "coop", "fastICA", "ica", "doParallel","igraph", "isva", "pheatmap", "rARPACK", "networkD3"))
install.packages("doMC", repos="http://R-Forge.R-project.org")
install.packages(c("propr", "network","intergraph"))
Packages from Github: RSCORE
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("wycwycpku/RSCORE")
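After running the commands above, it can help to confirm that nothing is still missing before installing ICAnet itself. The following is a small sketch in base R; `missing_packages` is a hypothetical helper written for this example and is not part of ICAnet or RSCORE. The package names are copied from the dependency lists above.

```r
# Package names copied from the dependency lists above.
required_pkgs <- c(
  "AUCell", "RcisTarget", "MineICA", "STRINGdb", "isva", "clusterProfiler",
  "genesorteR", "Seurat", "cluster", "coop", "fastICA", "ica", "igraph",
  "pheatmap", "rARPACK", "networkD3", "doMC", "propr", "network",
  "intergraph", "RSCORE"
)

# Return the subset of `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

still_missing <- missing_packages(required_pkgs)
if (length(still_missing) > 0) {
  message("Still missing: ", paste(still_missing, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```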
- Install R (>= 3.6)
- Install Rstudio (recommended)
- Install all the packages required (see Dependency section)
- Download ICAnet.tar.gz
- Use the following R commands
install.packages("ICAnet.tar.gz",repos=NULL, type="source",INSTALL_opts=c("--no-multiarch"))
We note that this installation workflow has been tested successfully on machines with fresh environments (R version 4.0.3, Ubuntu 18.04.5 LTS; R version 3.6.1, Red Hat Enterprise Linux). If you cannot install the package, please do not hesitate to open an issue.
ICAnet requires a PPI network or a cisTarget feather file as input.
The PPI network can be downloaded with getPPI_String/getPPI_Biogrid in RSCORE
library(RSCORE)
PPI.network_STRING <- getPPI_String(data, species=9606) #(PPI network of STRING Database)
PPI.network_BioGrid <- getPPI_Biogrid(data, species=9606) #(PPI network of BioGRID Database)
Here, 9606 is the NCBI taxon ID for Homo sapiens; for the taxon IDs of other species, see https://string-db.org/cgi/input.pl?input_page_active_form=organisms . The data can be a Seurat object or a matrix with gene symbols as row names.
ICAnetTF requires TF-motif binding information and a motif gene annotation matrix (.feather), both provided by RcisTarget. Users can download the feather file from https://resources.aertslab.org/cistarget/ with the following command
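When the input is a plain matrix, a quick sanity check of its row names can catch problems before the PPI lookup. The sketch below is an assumption-laden example rather than part of RSCORE: `check_expression_matrix` is a hypothetical helper, it only handles dense base R matrices, and the uniqueness check is a conservative choice, not a documented requirement.

```r
# Check that `mat` looks like a genes x cells matrix with usable row names.
check_expression_matrix <- function(mat) {
  stopifnot(is.matrix(mat), is.numeric(mat))
  genes <- rownames(mat)
  if (is.null(genes)) stop("Row names (gene symbols) are missing.")
  if (anyNA(genes) || any(genes == "")) stop("Some row names are empty.")
  if (anyDuplicated(genes) > 0) stop("Gene symbols must be unique.")
  invisible(TRUE)
}

# Toy example with three human gene symbols and two cells.
toy <- matrix(c(5, 0, 2, 1, 3, 0), nrow = 3,
              dimnames = list(c("INS", "GCG", "SST"), c("cell1", "cell2")))
check_expression_matrix(toy)
```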
wget https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc8nr/gene_based/mm9-500bp-upstream-10species.mc8nr.feather
The normalized single-cell cell line dataset can be downloaded from https://github.com/WWXkenmo/MouseGerm/blob/master/cell_line_exp.RDS, and its annotation from https://github.com/WWXkenmo/MouseGerm/blob/master/ID_for_cell_line.txt . The PPI used to run ICAnet can be downloaded from https://github.com/WWXkenmo/ICAnet_external_data/blob/master/PPI_for_cell_line.RDS
Three pancreas datasets can be downloaded from https://hemberg-lab.github.io/scRNA.seq.datasets/ : baron-human.rds, muraro.rds and segerstolpe.rds
The mouse brain gene expression dataset can be downloaded from GEO (GSE60361), the motif annotation dataset (feather file) from https://resources.aertslab.org/cistarget/, and the cell annotation dataset from https://github.com/WWXkenmo/ICAnet_external_data/blob/master/mouse_brain_cell_annotation.csv
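The GitHub links above point to "blob" pages, which return HTML rather than the file itself when fetched from R. A possible workaround, assuming the files are stored directly in the repository (not via Git LFS), is to rewrite the link to its raw.githubusercontent.com form first. `github_raw_url` is a hypothetical helper written for this example.

```r
# Convert a GitHub "blob" page URL into its raw-content URL.
github_raw_url <- function(blob_url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/", blob_url)
}

ppi_url <- github_raw_url(
  "https://github.com/WWXkenmo/ICAnet_external_data/blob/master/PPI_for_cell_line.RDS"
)

# Uncomment to download and load the file (requires network access).
# mode = "wb" keeps the binary .RDS file intact on Windows.
# download.file(ppi_url, destfile = "PPI_for_cell_line.RDS", mode = "wb")
# PPI <- readRDS("PPI_for_cell_line.RDS")
```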
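Datasets from different studies rarely share exactly the same gene set. One common preprocessing step, shown here as a sketch and not necessarily required by ICAnet, is to restrict every batch to the genes they have in common. The example assumes each dataset has already been converted to a genes x cells matrix; the toy matrices and the helper names are made up for illustration.

```r
# Keep only the genes present in every matrix of `mats`, in a common order.
restrict_to_shared_genes <- function(mats) {
  shared <- Reduce(intersect, lapply(mats, rownames))
  lapply(mats, function(m) m[shared, , drop = FALSE])
}

# Toy stand-ins for three batches (not the real pancreas data).
make_toy <- function(genes, n_cells) {
  matrix(seq_len(length(genes) * n_cells), nrow = length(genes),
         dimnames = list(genes, paste0("c", seq_len(n_cells))))
}
batches <- list(
  baron = make_toy(c("INS", "GCG", "SST", "PPY"), 3),
  muraro = make_toy(c("GCG", "INS", "KRT19"), 2),
  segerstolpe = make_toy(c("SST", "INS", "GCG"), 4)
)
aligned <- restrict_to_shared_genes(batches)
```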