Getting Started with CellODE
Zaoqu Liu
2026-01-26
Source:vignettes/CellODE-quickstart.Rmd
CellODE-quickstart.RmdIntroduction
CellODE is an R package for inferring cellular dynamics from single-cell RNA sequencing data using deep generative models. The package combines: - Variational Autoencoders (VAE) for dimensionality reduction - Neural Ordinary Differential Equations (Neural ODE) for continuous dynamics modeling
Key capabilities include:
- Pseudotime estimation without specifying root cells
- Vector field inference representing cellular velocity
- Latent space learning capturing cellular states
- Prediction on query datasets
Installation
From R-universe (Recommended)
install.packages("CellODE", repos = "https://zaoqu-liu.r-universe.dev")From GitHub
# Install dependencies
install.packages(c("torch", "R6", "Matrix", "ggplot2", "Rcpp", "RcppArmadillo"))
# Install torch backend (run once)
torch::install_torch()
# Install CellODE
remotes::install_github("Zaoqu-Liu/CellODE")Basic Workflow
2. Prepare Data
CellODE works with Seurat objects. Your data should have: - Raw UMI
counts in counts slot (for
nb/zinb modes) - Or log-normalized data in
data slot (for mse mode)
# Load your Seurat object
seurat_obj <- readRDS("your_data.rds")
# Standard preprocessing (if not already done)
seurat_obj <- NormalizeData(seurat_obj)
seurat_obj <- FindVariableFeatures(seurat_obj, nfeatures = 2000)
seurat_obj <- ScaleData(seurat_obj)
seurat_obj <- RunPCA(seurat_obj)
seurat_obj <- RunUMAP(seurat_obj, dims = 1:30)3. Create and Train Model
# Create trainer
trainer <- Trainer$new(
seurat_obj = seurat_obj,
n_latent = 5, # Latent space dimensions
n_ode_hidden = 25, # ODE network hidden units
n_vae_hidden = 128, # VAE hidden units
loss_mode = "nb", # Negative binomial for UMI counts
nepoch = 100, # Training epochs
batch_size = 1024 # Batch size
)
# Train the model
trainer$train()
# Plot training history
plot_training_history(trainer)4. Extract Pseudotime
# Get pseudotime
pseudotime <- trainer$get_time()
# Add to Seurat object
seurat_obj$pseudotime <- pseudotime
# Visualize
plot_pseudotime(seurat_obj, time_key = "pseudotime", embedding_key = "umap")5. Get Latent Space
# Extract latent representations
latent <- trainer$get_latentsp(alpha_z = 0.5, alpha_predz = 0.5)
# Store in Seurat object
seurat_obj@misc$X_zs <- latent$mix_zs6. Compute Vector Field
# Get vector field
vf <- trainer$get_vector_field(pseudotime, latent$mix_zs)
seurat_obj@misc$X_VF <- vf
# Visualize
plot_vector_field(seurat_obj, zs_key = "X_zs", vf_key = "X_VF")Quick Example with Synthetic Data
Here’s a minimal working example:
library(CellODE)
library(torch)
# Create synthetic data
set.seed(42)
n_cells <- 200
n_genes <- 50
# Simulate expression with pseudo-trajectory
true_time <- runif(n_cells)
X <- matrix(0, n_cells, n_genes)
for (i in 1:n_cells) {
X[i, 1:25] <- rpois(25, lambda = 20 * (1 - true_time[i]) + 2)
X[i, 26:50] <- rpois(25, lambda = 20 * true_time[i] + 2)
}
# Create model directly (without Seurat)
model <- TNODE(
n_int = n_genes,
n_latent = 5L,
n_ode_hidden = 25L,
n_vae_hidden = 64L,
loss_mode = "nb"
)
# Prepare tensors
X_log <- log1p(X)
X_tensor <- torch::torch_tensor(X_log, dtype = torch::torch_float())
y_tensor <- X_tensor$sum(dim = 2)
# Train
optimizer <- torch::optim_adam(model$parameters, lr = 0.001)
for (epoch in 1:50) {
optimizer$zero_grad()
result <- model(X_tensor, y_tensor)
result$loss$backward()
optimizer$step()
if (epoch %% 10 == 0) {
cat(sprintf("Epoch %d: loss = %.4f\n", epoch, result$loss$item()))
}
}
# Get pseudotime
model$eval()
torch::with_no_grad({
enc_out <- model$encoder(X_tensor)
pseudotime <- as.numeric(enc_out$t$squeeze(-1)$cpu())
})
# Check correlation with true time
cat(sprintf("Correlation: %.4f\n", abs(cor(pseudotime, true_time))))Parameter Guidelines
Next Steps
- Algorithm Theory: Learn about the mathematical foundations in the Algorithm Theory vignette
-
Vector Field Analysis: Deep dive into velocity
analysis in the Vector Field
Analysis vignette
- Advanced Usage: Explore hyperparameter tuning in the Advanced Usage vignette