scClustEval: Single Cell Clustering Evaluation and Optimization Framework
Source: R/scClustEval-package.R
A comprehensive framework for evaluating and optimizing single-cell RNA-seq clustering results using self-projection machine learning approaches.
Details
The scClustEval package provides tools for:
Clustering Assessment: Evaluate the quality of cell clustering using self-projection with various machine learning classifiers
Clustering Optimization: Iteratively merge poorly discriminated clusters to achieve robust cell type identification
Visualization: ROC curves, confusion matrices, Sankey diagrams, and comprehensive assessment plots
Seurat Integration: Seamless workflow with Seurat objects
The core algorithm works by:
Training a classifier to distinguish between clusters
Evaluating prediction accuracy via cross-validation and hold-out testing
Identifying cluster pairs that are difficult to discriminate
Merging confused clusters and iterating until target accuracy is reached
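The four steps above can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the package's implementation: a nearest-centroid classifier stands in for the classifiers the package supports, a single hold-out split replaces cross-validation, and every function name here (nearest_centroid_predict, self_projection, most_confused_pair) is hypothetical rather than part of the scClustEval API.

```r
set.seed(1)

# Toy data: three well-separated groups of cells in 2-D. The third group is
# split at random into labels "C" and "D" to mimic over-clustering.
n <- 60
x <- rbind(
  matrix(rnorm(n * 2, mean = 0), ncol = 2),
  matrix(rnorm(n * 2, mean = 6), ncol = 2),
  matrix(rnorm(n * 2, mean = 12), ncol = 2)
)
labels <- c(rep("A", n), rep("B", n), sample(c("C", "D"), n, replace = TRUE))

# Step 1 (stand-in classifier): assign each test cell to the nearest
# training centroid.
nearest_centroid_predict <- function(train_x, train_y, test_x) {
  classes <- sort(unique(train_y))
  centroids <- t(sapply(classes, function(k) {
    colMeans(train_x[train_y == k, , drop = FALSE])
  }))
  # Squared Euclidean distance from every test cell to every centroid
  d <- sapply(seq_along(classes), function(j) {
    rowSums(sweep(test_x, 2, centroids[j, ])^2)
  })
  d <- matrix(d, nrow = nrow(test_x))
  classes[max.col(-d, ties.method = "first")]
}

# Step 2: train on one part of the data, predict the held-out part, and
# summarize accuracy and the confusion matrix.
self_projection <- function(x, labels, test_frac = 0.5) {
  test_idx <- sample(nrow(x), size = floor(test_frac * nrow(x)))
  pred <- nearest_centroid_predict(
    x[-test_idx, , drop = FALSE], labels[-test_idx],
    x[test_idx, , drop = FALSE]
  )
  truth <- labels[test_idx]
  lv <- sort(unique(labels))
  confusion <- table(truth = factor(truth, lv), predicted = factor(pred, lv))
  list(accuracy = mean(pred == truth), confusion = confusion)
}

# Step 3: find the pair of clusters with the highest mutual confusion rate.
most_confused_pair <- function(confusion) {
  counts <- unclass(confusion)
  rates <- counts / pmax(rowSums(counts), 1)  # row-normalized error rates
  sym <- rates + t(rates)
  diag(sym) <- 0
  idx <- which(sym == max(sym), arr.ind = TRUE)[1, ]
  unname(rownames(counts)[idx])
}

# Step 4: merge the most confused pair and repeat until the target
# hold-out accuracy is reached (or only two clusters remain).
target <- 0.95
repeat {
  res <- self_projection(x, labels)
  if (res$accuracy >= target || length(unique(labels)) <= 2) break
  pair <- most_confused_pair(res$confusion)
  labels[labels %in% pair] <- paste(sort(pair), collapse = "+")
}

print(sort(unique(labels)))
print(res$accuracy)
```

In this toy setup the artificial split between "C" and "D" cannot be learned, so that pair dominates the off-diagonal confusion and is the natural candidate for merging.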
Main Functions
sc_assessment: Core function for clustering assessment
sc_optimize: Single round of clustering optimization
sc_optimize_all: Full iterative optimization pipeline
RunAssessment: Seurat-style assessment function
RunOptimization: Seurat-style optimization function
Classifiers
The package supports multiple classifiers:
LR: Logistic Regression (L1/L2 regularization)
RF: Random Forest
SVM: Support Vector Machine
NB: Naive Bayes
DT: Decision Tree
XGB: XGBoost (if installed)
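Because XGB depends on an optional package, calling code may want to check for it before requesting that classifier. The sketch below shows one common R idiom for this, using requireNamespace(). Both helper names and the fallback to LR are assumptions made for illustration; they do not describe how scClustEval itself handles a missing xgboost installation.

```r
# Hypothetical helper (not part of scClustEval): TRUE if the xgboost
# package can be loaded in this R session.
xgb_available <- function() {
  requireNamespace("xgboost", quietly = TRUE)
}

# Hypothetical helper: return the requested classifier code, falling back
# to "LR" (an assumed default) when "XGB" is requested but unavailable.
choose_classifier <- function(requested = "LR") {
  supported <- c("LR", "RF", "SVM", "NB", "DT", "XGB")
  if (!requested %in% supported) {
    stop("Unknown classifier: ", requested)
  }
  if (requested == "XGB" && !xgb_available()) {
    warning("xgboost is not installed; falling back to LR")
    return("LR")
  }
  requested
}

print(choose_classifier("RF"))
```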
References
This package is an R implementation inspired by the SCCAF Python package: https://github.com/SCCAF/sccaf
Miao, Z., et al. (2020). Putative cell type discovery from single-cell gene expression data. Nature Methods.
Author
Zaoqu Liu <liuzaoqu@163.com>