Documentation: https://zaoqu-liu.github.io/CellOracleR/
Overview
CellOracleR is a comprehensive R implementation of the CellOracle framework for in silico gene perturbation analysis in single-cell RNA sequencing data. This package enables systematic prediction of cell state transitions following transcription factor (TF) perturbations by integrating gene regulatory network (GRN) inference with single-cell trajectory analysis.
Scientific Background
Understanding how transcription factors regulate cell fate decisions is fundamental to developmental biology and regenerative medicine. CellOracleR leverages the mathematical framework of GRN-based signal propagation to simulate the transcriptomic consequences of TF knockouts or overexpression, enabling researchers to:
- Predict perturbation outcomes before conducting experiments
- Identify key regulators of cell fate transitions
- Dissect regulatory mechanisms underlying cellular differentiation
- Prioritize targets for functional validation studies
Methodological Framework
The CellOracleR workflow comprises four interconnected analytical modules:
┌─────────────────────────────────────────────────────────────────────┐
│                        CellOracleR Pipeline                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │   Base GRN   │───▶│ GRN Fitting  │───▶│     Perturbation     │   │
│  │ Construction │    │ (Ridge Reg.) │    │      Simulation      │   │
│  └──────────────┘    └──────────────┘    └──────────────────────┘   │
│         │                   │                       │               │
│         ▼                   ▼                       ▼               │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │    Motif     │    │   Cluster-   │    │      Transition      │   │
│  │   Scanning   │    │ specific GRN │    │     Probability      │   │
│  └──────────────┘    └──────────────┘    └──────────────────────┘   │
│                                                     │               │
│                                                     ▼               │
│                                          ┌──────────────────────┐   │
│                                          │      Cell Fate       │   │
│                                          │      Prediction      │   │
│                                          └──────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
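The last two stages of the pipeline, transition probability and cell fate prediction, turn a simulated expression shift into a direction on the embedding. One common way to do this in velocity-style analyses is to correlate each cell's simulated shift with the expression differences to other cells, convert the correlations to probabilities with a softmax kernel, and take the probability-weighted displacement in the embedding. The sketch below follows that idea on toy data in base R. The function name `transition_shift`, the temperature `sigma`, the uniform baseline, and the use of all cells as candidate neighbours are assumptions made for illustration and may differ from what CellOracleR does internally.

```r
# Sketch (assumed approach, not CellOracleR internals): turn a simulated
# expression shift into transition probabilities and an embedding shift.
transition_shift <- function(expr, delta, embedding, sigma = 0.05) {
  n <- nrow(expr)
  prob <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # Expression difference from cell i to every cell (cells x genes).
    diff_to_others <- sweep(expr, 2, expr[i, ])
    # Correlation between each difference and the simulated shift of cell i;
    # defined as 0 when either vector is constant.
    cors <- apply(diff_to_others, 1, function(d) {
      if (sd(d) == 0 || sd(delta[i, ]) == 0) 0 else cor(d, delta[i, ])
    })
    w <- exp(cors / sigma)   # softmax kernel with temperature sigma
    w[i] <- 0                # exclude self-transitions
    prob[i, ] <- w / sum(w)
  }
  # Probability-weighted displacement relative to a uniform expectation,
  # so a cell with no preferred direction gets a zero shift.
  baseline <- matrix(1 / (n - 1), n, n)
  diag(baseline) <- 0
  shift <- (prob - baseline) %*% embedding
  list(prob = prob, shift = shift)
}

# Toy trajectory: five cells ordered along UMAP_1; gene g1 rises, g2 falls.
expr <- cbind(g1 = 1:5, g2 = 5:1, g3 = rep(2, 5))
embedding <- cbind(UMAP_1 = 1:5, UMAP_2 = rep(0, 5))
# Simulated shift that lowers g1 and raises g2 in every cell.
delta <- matrix(c(-1, 1, 0), nrow = 5, ncol = 3, byrow = TRUE,
                dimnames = list(NULL, colnames(expr)))

res <- transition_shift(expr, delta, embedding)
print(round(res$shift, 2))
```

In this toy example the interior cells (2 to 4) receive a negative UMAP_1 shift, meaning they point back toward the start of the trajectory. The two end cells have no preferred direction among the remaining cells, so the baseline subtraction leaves them with a zero shift.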
Installation
From R-universe (Recommended)
# Install from R-universe
install.packages("CellOracleR", repos = "https://zaoqu-liu.r-universe.dev")
From GitHub
# Install development version from GitHub
if (!requireNamespace("remotes", quietly = TRUE))
  install.packages("remotes")
remotes::install_github("Zaoqu-Liu/CellOracleR")
System Requirements
- R ≥ 4.0.0
- C++ compiler with C++17 support
- Dependencies: Seurat (V4/V5), glmnet, igraph, Matrix, R6
For motif analysis functionality, Bioconductor packages are required:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("TFBSTools", "motifmatchr", "JASPAR2020",
                       "BSgenome", "GenomicRanges"))
Quick Start
Basic Workflow
library(CellOracleR)
library(Seurat)
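# 1. Create Oracle object from Seurat
# (seurat_obj is an existing Seurat object with a "cell_type" metadata
#  column and a "umap" reduction)
oracle <- create_oracle(
  seurat_obj,
  cluster_column = "cell_type",
  embedding_name = "umap"
)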
# 2. Import TF-target gene regulatory information
oracle$import_TF_data(TFdict = tf_target_dictionary)
# 3. Perform dimensionality reduction and imputation
oracle$perform_PCA(n_components = 50)
oracle$knn_imputation(k = 30)
# 4. Fit cluster-specific GRNs for simulation
oracle$fit_GRN_for_simulation(
  GRN_unit = "cluster",
  alpha = 10
)
# 5. Simulate TF knockout
oracle$simulate_shift(
  perturb_condition = list(GATA1 = 0),  # Knockout GATA1
  n_propagation = 3
)
# 6. Estimate transition probabilities, embedding shift, and grid arrows
oracle$estimate_transition_prob()
oracle$calculate_embedding_shift()
oracle$calculate_grid_arrows(n_grid = 40)
# 7. Visualize perturbation effects
plot_simulation_flow(oracle)
Network Analysis
# Extract GRN as Links object for network analysis
links <- oracle$get_links(
  alpha = 10,
  bagging_number = 200
)
# Filter to significant regulatory edges
links$filter_links(p = 0.001, threshold_number = 2000)
# Compute network centrality metrics
links$get_network_score()
# Identify hub transcription factors
hubs <- identify_hubs(links, top_n = 20, method = "degree")
# Visualize regulatory network
plot_network_graph(links, cluster = "Progenitor")
Key Features
GRN Inference
- Ridge regression with L2 regularization for robust coefficient estimation
- Bootstrap aggregation (bagging) for variance reduction
- Cluster-specific or whole-dataset GRN fitting
- Parallel processing via the future framework
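As a rough illustration of the ridge regression and bagging items above, the sketch below fits a ridge model for a single target gene on bootstrap resamples of cells and averages the coefficients across resamples. It uses the closed-form ridge solution in base R on simulated data. The helper names `ridge_coef` and `ridge_bagging`, the simulated matrices, and the defaults are hypothetical and are not CellOracleR's internal implementation, which lists glmnet as a dependency.

```r
# Sketch (illustrative assumptions, not CellOracleR internals):
# ridge regression with bootstrap aggregation for one target gene.
set.seed(1)

# Closed-form ridge estimate (X'X + alpha * I)^-1 X'y, no intercept;
# X and y are expected to be centered.
ridge_coef <- function(X, y, alpha) {
  p <- ncol(X)
  unname(solve(crossprod(X) + alpha * diag(p), crossprod(X, y))[, 1])
}

ridge_bagging <- function(X, y, alpha = 10, n_bags = 200) {
  n <- nrow(X)
  coefs <- vapply(seq_len(n_bags), function(b) {
    idx <- sample.int(n, n, replace = TRUE)          # resample cells
    Xb <- scale(X[idx, , drop = FALSE], center = TRUE, scale = FALSE)
    yb <- y[idx] - mean(y[idx])
    ridge_coef(Xb, yb, alpha)
  }, numeric(ncol(X)))
  rownames(coefs) <- colnames(X)
  # Mean coefficient per regulator and its spread across resamples.
  list(mean = rowMeans(coefs), sd = apply(coefs, 1, sd))
}

# Simulated data: TF1 activates and TF2 represses the target; TF3 is inert.
n_cells <- 500
X <- matrix(rnorm(n_cells * 3), n_cells, 3,
            dimnames = list(NULL, c("TF1", "TF2", "TF3")))
y <- 2 * X[, "TF1"] - 1 * X[, "TF2"] + rnorm(n_cells, sd = 0.5)

fit <- ridge_bagging(X, y, alpha = 10, n_bags = 200)
print(round(fit$mean, 2))
print(round(fit$sd, 3))
```

The spread across resamples gives a simple handle on which edges are stable, which is the kind of information a later filtering step on p-values or edge counts could use.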
Perturbation Simulation
- Signal propagation through regulatory networks
- Support for knockouts, overexpression, and partial perturbations
- Out-of-distribution detection and clipping
- Efficient C++ backend via RcppArmadillo
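To make the idea of signal propagation concrete, the following minimal sketch holds a perturbed TF at its new value, pushes the resulting expression change through a coefficient matrix for a few rounds, and clips each gene to its observed range. The helper name `propagate_shift`, the toy coefficients, and the clipping rule are assumptions for illustration only and are not taken from CellOracleR's source.

```r
# Sketch (illustrative assumptions, not CellOracleR internals):
# propagate a TF perturbation through a linear GRN and clip the result.
# coef[i, j] is the assumed effect of regulator i on target j.
propagate_shift <- function(expr, coef, perturb, n_propagation = 3) {
  genes <- colnames(expr)
  coef <- coef[genes, genes]
  # Change imposed directly on the perturbed genes (0 = knockout).
  imposed <- matrix(0, nrow(expr), ncol(expr), dimnames = dimnames(expr))
  for (g in names(perturb)) imposed[, g] <- perturb[[g]] - expr[, g]
  delta <- imposed
  for (step in seq_len(n_propagation)) {
    # Change in each target implied by the current change in its regulators.
    delta <- delta %*% coef
    # Hold the perturbed genes at their imposed values.
    for (g in names(perturb)) delta[, g] <- imposed[, g]
  }
  simulated <- expr + delta
  # Clip each gene to its observed range to avoid out-of-distribution values.
  lower <- apply(expr, 2, min)
  upper <- apply(expr, 2, max)
  t(pmin(pmax(t(simulated), lower), upper))
}

genes <- c("GATA1", "KLF1", "HBB")
coef <- matrix(0, 3, 3, dimnames = list(genes, genes))
coef["GATA1", "KLF1"] <- 0.8   # GATA1 activates KLF1 (toy value)
coef["KLF1", "HBB"] <- 0.5     # KLF1 activates HBB (toy value)

expr <- rbind(cell1 = c(2, 2.0, 1.5),
              cell2 = c(1, 0.5, 1.0),
              cell3 = c(0, 0.2, 0.1))
colnames(expr) <- genes

sim <- propagate_shift(expr, coef, perturb = list(GATA1 = 0))
print(sim)
```

With a two-step chain, the knockout reaches HBB only from the second propagation round onward, which is why more than one round is needed. Clipping to the observed range would also cap an overexpression above the observed maximum, so a fuller version would likely treat the perturbed genes separately.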
Seurat Compatibility
CellOracleR is designed for seamless integration with the Seurat ecosystem:
| Feature | Seurat V4 | Seurat V5 |
|---|---|---|
| Data import | ✓ | ✓ |
| Assay handling | ✓ | ✓ |
| Layer access | ✓ | ✓ |
| Reduction extraction | ✓ | ✓ |
| Metadata integration | ✓ | ✓ |
Performance
CellOracleR achieves high computational efficiency through:
- Vectorized R operations for data manipulation
- Rcpp/RcppArmadillo for performance-critical functions
- Sparse matrix support via the Matrix package
- Parallel computation using the future framework
Typical runtime for a dataset of 10,000 cells × 3,000 genes:
- GRN fitting (200 bootstrap iterations): ~5-10 minutes
- Perturbation simulation: ~30 seconds
- Transition probability estimation: ~1 minute
Citation
If you use CellOracleR in your research, please cite both the original CellOracle paper and this R implementation:
Original CellOracle:
> Kamimoto, K., Stringa, B., Hoffmann, C.M., Jindal, K., Solnica-Krezel, L., & Morris, S.A. (2023). Dissecting cell identity via network inference and in silico gene perturbation. Nature, 614, 742–751. https://doi.org/10.1038/s41586-022-05688-9
CellOracleR (R implementation):
> Liu, Z. (2025). CellOracleR: An R implementation of CellOracle for in silico gene perturbation analysis. GitHub repository, https://github.com/Zaoqu-Liu/CellOracleR
Related Resources
- Original CellOracle (Python): https://github.com/morris-lab/CellOracle
- CellOracle documentation: https://morris-lab.github.io/CellOracle.documentation/
- Seurat: https://satijalab.org/seurat/
- Tutorial vignette: vignette("tutorial", package = "CellOracleR")
License
CellOracleR is released under the Apache License 2.0.
Contact
- Author: Zaoqu Liu
- Email: liuzaoqu@163.com
- GitHub: https://github.com/Zaoqu-Liu
- Issues: https://github.com/Zaoqu-Liu/CellOracleR/issues
Deciphering cell fate through computational perturbation
