📖 Documentation: https://zaoqu-liu.github.io/CytoSPACER/
Overview
CytoSPACER is an R implementation of CytoSPACE (Vahid et al., Nature Biotechnology, 2023), a computational framework for high-resolution alignment of single-cell transcriptomes to spatial transcriptomics (ST) data. The algorithm formulates cell-to-spot assignment as a linear assignment problem (LAP) and solves it by minimizing a correlation-based cost function using the Jonker-Volgenant algorithm.
This package provides a native R implementation with high-performance C++ backends via Rcpp, enabling seamless integration with the R/Bioconductor ecosystem and Seurat workflows.
Algorithm
CytoSPACER performs spatial mapping through the following steps:
- Cell Type Deconvolution: Estimate cell type fractions per ST spot using reference-based deconvolution (via Seurat’s TransferData)
- Cell Count Estimation: Infer the number of cells per spot based on total RNA content
- Reference Sampling: Sample single cells from the scRNA-seq reference to match the estimated spatial composition
- Cost Matrix Construction: Compute pairwise dissimilarity between single cells and ST spots (Pearson correlation by default; Spearman correlation and Euclidean distance are also supported)
- Optimal Assignment: Solve the LAP using the Jonker-Volgenant algorithm to find the globally optimal cell-to-spot mapping
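The cell count estimation and reference sampling steps can be illustrated with a small base R sketch. It is not CytoSPACER's internal code: the linear scaling of cell number with total RNA, the use of global rather than per-spot type fractions, and all object names (`spot_umi`, `type_fraction`, `ref_types`) are assumptions made for illustration.

```r
set.seed(42)

# Toy input: total RNA (UMI) counts for four spots.
spot_umi <- c(s1 = 1200, s2 = 3000, s3 = 600, s4 = 2400)
mean_cells_per_spot <- 5

# Assumption: cell number scales linearly with a spot's share of total RNA,
# rescaled so that the average spot holds `mean_cells_per_spot` cells.
raw <- spot_umi / mean(spot_umi) * mean_cells_per_spot
cells_per_spot <- pmax(round(raw), 1)  # every spot keeps at least one cell

# Assumed global cell type fractions (these would come from deconvolution).
type_fraction <- c(Tcell = 0.5, Bcell = 0.3, Myeloid = 0.2)
n_per_type <- round(type_fraction * sum(cells_per_spot))

# Toy reference annotation; the Myeloid pool is smaller than the demand.
ref_types <- c(rep("Tcell", 30), rep("Bcell", 10), rep("Myeloid", 3))
names(ref_types) <- paste0("cell", seq_along(ref_types))

# Draw the required number of cells per type, reusing cells only when
# the reference pool is too small.
sampled <- unlist(lapply(names(n_per_type), function(ct) {
  pool <- names(ref_types)[ref_types == ct]
  need <- n_per_type[[ct]]
  sample(pool, need, replace = need > length(pool))
}))
```

The sampled cell IDs then have the same type composition as the estimated tissue, which is what allows the later assignment step to be posed as a one-to-one matching.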
The Jonker-Volgenant algorithm provides an efficient O(n³) solution with excellent practical performance for dense cost matrices.
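To make the last two steps concrete, the sketch below builds a correlation-based cost matrix and finds the assignment with the smallest total cost. It deliberately swaps the Jonker-Volgenant solver for brute-force enumeration of all permutations, which is only feasible for a handful of cells but makes "globally optimal" easy to verify. The toy data and the helper `permutations()` are hypothetical, and the sketch assumes one cell per spot slot.

```r
set.seed(1)
n_genes <- 50
n <- 4

# Toy spot profiles (genes x spots). Each cell is a noisy copy of one spot,
# so the correct pairing is known: cell i belongs to spot truth[i].
spots <- matrix(rexp(n_genes * n), n_genes, n)
truth <- c(3, 1, 4, 2)
cells <- spots[, truth] + matrix(rnorm(n_genes * n, sd = 0.1), n_genes, n)

# Cost = 1 - Pearson correlation; rows are cells, columns are spots.
cost <- 1 - cor(cells, spots)

# Enumerate every permutation of a vector (exponential, tiny n only).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Total cost of assigning cell i to spot p[i], for every permutation p.
perms <- permutations(seq_len(n))
total_cost <- vapply(
  perms,
  function(p) sum(cost[cbind(seq_len(n), p)]),
  numeric(1)
)
best <- perms[[which.min(total_cost)]]
```

With noise this low, `best` should reproduce `truth`. A dedicated LAP solver reaches the same optimum without enumerating all n! permutations, which is what makes the approach usable at realistic dataset sizes.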
Features
| Feature | Description |
|---|---|
| High Performance | C++ implementation of LAP solver and correlation computation |
| Cross-Platform | Native support for Windows, macOS, and Linux |
| Parallel Processing | Multi-core support via the future framework |
| Seurat Integration | Direct compatibility with Seurat v4/v5 objects |
| Flexible Input | Support for CSV, TSV, sparse MTX, and SpaceRanger output |
| Multiple Metrics | Pearson correlation, Spearman correlation, Euclidean distance |
Installation
From R-universe (Recommended)
install.packages("CytoSPACER", repos = "https://zaoqu-liu.r-universe.dev")
From GitHub
# Install remotes if not available
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
# Install CytoSPACER
remotes::install_github("Zaoqu-Liu/CytoSPACER")
Dependencies
Core dependencies (automatically installed):
- Rcpp, data.table, Matrix, future, future.apply, progressr, ggplot2
Optional (for extended functionality):
- Seurat (≥ 4.0.0): Seurat object integration and cell type fraction estimation
- viridis: Additional color palettes for visualization
Quick Start
Standard Workflow
library(CytoSPACER)
# Load input data
sc_expr <- read_cytospace_input("scRNA_counts.csv")
st_expr <- read_cytospace_input("ST_counts.csv")
coords <- read.csv("coordinates.csv", row.names = 1)
# Prepare cell type annotations
cell_labels <- read.csv("cell_types.csv", row.names = 1)
cell_types <- setNames(cell_labels$CellType, rownames(cell_labels))
# Run CytoSPACER
results <- run_cytospace(
  sc_data = sc_expr,
  cell_types = cell_types,
  st_data = st_expr,
  coordinates = coords,
  mean_cells_per_spot = 5,
  distance_metric = "pearson",
  seed = 42
)
# Export results
write_cytospace_results(results, output_dir = "cytospace_output/")
# Visualize spatial distribution
plot_cytospace(results, type = "cell_types")
Seurat Integration
library(CytoSPACER)
library(Seurat)
# Load Seurat objects
sc_seurat <- readRDS("scRNA_seurat.rds")
st_seurat <- readRDS("visium_seurat.rds")
# Run analysis directly from Seurat objects
results <- run_cytospace_seurat(
  sc_seurat = sc_seurat,
  st_seurat = st_seurat,
  cell_type_col = "celltype"
)
# Add results to spatial Seurat object
st_seurat <- add_cytospace_to_seurat(st_seurat, results)
# Visualize with Seurat
SpatialDimPlot(st_seurat, group.by = "dominant_celltype_cytospace")
Input Data Format
Advanced Usage
Distance Metrics
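As background for the options in this section, the following sketch shows one plausible way each metric can be turned into a cost matrix for the assignment step. The function `cost_matrix()` is hypothetical and is not part of the CytoSPACER API; in particular, the `1 - r` transform for correlations is an assumption about how a similarity becomes a cost.

```r
set.seed(7)
cells <- matrix(rpois(200 * 3, lambda = 5), nrow = 200)   # genes x cells
spots <- matrix(rpois(200 * 2, lambda = 20), nrow = 200)  # genes x spots

cost_matrix <- function(cells, spots,
                        metric = c("pearson", "spearman", "euclidean")) {
  metric <- match.arg(metric)
  if (metric == "euclidean") {
    # Pairwise distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    sq <- outer(colSums(cells^2), colSums(spots^2), "+") -
      2 * crossprod(cells, spots)
    return(sqrt(pmax(sq, 0)))  # clamp tiny negatives from rounding error
  }
  # Correlation is a similarity, so 1 - r maps it to a cost in [0, 2].
  1 - cor(cells, spots, method = metric)
}

pearson_cost   <- cost_matrix(cells, spots, "pearson")
spearman_cost  <- cost_matrix(cells, spots, "spearman")
euclidean_cost <- cost_matrix(cells, spots, "euclidean")
```

All three calls return a cells-by-spots matrix, so the downstream solver does not need to know which metric produced it. Note that Euclidean distance is sensitive to the scale difference between single cells and multi-cell spots, whereas the correlation-based costs are not.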
# Pearson correlation (default, recommended for most cases)
results <- run_cytospace(..., distance_metric = "pearson")
# Spearman correlation (robust to outliers)
results <- run_cytospace(..., distance_metric = "spearman")
# Euclidean distance
results <- run_cytospace(..., distance_metric = "euclidean")
Sampling Strategies
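The difference between reusing cells and generating synthetic ones can be seen on a toy matrix. This is a sketch under assumptions: "gene-wise sampling" is interpreted here as drawing each gene's value independently from a randomly chosen reference cell of the same type, and the object names are hypothetical rather than taken from the package.

```r
set.seed(3)

# Toy reference for one cell type: 5 genes x 3 cells, but 5 cells are needed.
ref <- matrix(
  rpois(15, lambda = 4), nrow = 5,
  dimnames = list(paste0("g", 1:5), paste0("c", 1:3))
)
need <- 5

# Duplicates: draw whole cells with replacement, so some columns repeat.
dup_idx <- sample(ncol(ref), need, replace = TRUE)
dup_cells <- ref[, dup_idx, drop = FALSE]

# Synthetic (assumed interpretation): for every gene, copy the value from an
# independently chosen donor cell, yielding new combinations of observed values.
synth_cells <- sapply(seq_len(need), function(i) {
  donor <- sample(ncol(ref), nrow(ref), replace = TRUE)  # one donor per gene
  ref[cbind(seq_len(nrow(ref)), donor)]
})
rownames(synth_cells) <- rownames(ref)
```

Duplicated cells preserve real gene-gene covariance but add exact copies to the assignment problem; synthetic cells avoid exact copies at the price of breaking within-cell covariance.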
# Duplicates method (default): reuse cells when reference is insufficient
results <- run_cytospace(..., sampling_method = "duplicates")
# Synthetic method: generate synthetic cells via gene-wise sampling
results <- run_cytospace(..., sampling_method = "synthetic")
Single-Cell Spatial Data
For single-cell resolution platforms (MERFISH, seqFISH, Xenium, CosMx):
results <- run_cytospace(
  ...,
  single_cell = TRUE,
  st_cell_types = spatial_cell_labels # Optional prior cell type information
)
Parallel Processing
# Automatic parallelization
results <- run_cytospace(..., n_workers = 8)
# Custom future plan
library(future)
plan(multisession, workers = 8)
results <- run_cytospace(...)
Output
CytoSPACER generates the following outputs:
| File | Description |
|---|---|
| assigned_locations.csv | Cell-to-spot assignments with spatial coordinates |
| cell_type_assignments_by_spot.csv | Cell type counts per spot |
| fractional_abundances_by_spot.csv | Cell type proportions per spot |
| assigned_expression/ | Expression matrix for assigned cells |
| log.txt | Analysis log with parameters and runtime |
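The per-spot count and proportion tables are both summaries of the cell-to-spot assignment table. The sketch below shows that relationship on a hypothetical assignment table; the column names `CellID`, `SpotID`, and `CellType` are assumptions and may not match the actual CSV headers.

```r
# Hypothetical stand-in for the contents of assigned_locations.csv.
assigned <- data.frame(
  CellID   = paste0("cell", 1:6),
  SpotID   = c("s1", "s1", "s1", "s2", "s2", "s3"),
  CellType = c("Tcell", "Tcell", "Bcell", "Bcell", "Myeloid", "Tcell")
)

# Cell type counts per spot (rows: spots, columns: cell types).
counts_by_spot <- table(assigned$SpotID, assigned$CellType)

# Cell type proportions per spot: each row is normalized to sum to 1.
fractions_by_spot <- prop.table(counts_by_spot, margin = 1)

# Most abundant cell type per spot (the first type wins ties).
dominant <- colnames(counts_by_spot)[apply(counts_by_spot, 1, which.max)]
names(dominant) <- rownames(counts_by_spot)
```

Recomputing these summaries from the assignment table is a quick consistency check after filtering or subsetting the results.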
Visualization
# Spatial cell type distribution
plot_cytospace(results, type = "cell_types")
# Add jitter for overlapping points
plot_cytospace(results, type = "cell_types", jitter = 0.3)
# Cell counts per spot (faceted)
plot_cytospace(results, type = "by_spot", ncol = 4)
# Cell type composition
plot_composition(results, type = "global")
plot_composition(results, type = "per_spot", top_spots = 20)
# Save publication-quality figures
p <- plot_cytospace(results)
save_cytospace_plot(p, "figures/", formats = c("png", "pdf"), dpi = 300)
Performance Considerations
- Memory: For large datasets (>50,000 cells), use the chunk_size parameter to control memory usage
- Speed: Enable parallel processing with n_workers for datasets with >10,000 spots
- Sparse data: CytoSPACER automatically handles sparse matrices efficiently
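The idea behind chunking can be sketched in base R: the cost matrix is computed for a block of cells at a time, so intermediate objects never exceed the chunk size. How `chunk_size` is actually used inside CytoSPACER is not documented here, so the function `chunked_cost()` below is a hypothetical illustration of the general technique only.

```r
set.seed(11)
cells <- matrix(rnorm(100 * 23), nrow = 100)  # genes x cells
spots <- matrix(rnorm(100 * 6), nrow = 100)   # genes x spots

chunked_cost <- function(cells, spots, chunk_size = 5) {
  starts <- seq(1, ncol(cells), by = chunk_size)
  blocks <- lapply(starts, function(s) {
    # Column indices of the current block; the last block may be shorter.
    idx <- s:min(s + chunk_size - 1, ncol(cells))
    1 - cor(cells[, idx, drop = FALSE], spots)
  })
  do.call(rbind, blocks)  # stack blocks back into a cells x spots matrix
}

cost <- chunked_cost(cells, spots, chunk_size = 5)
```

The result is identical to computing `1 - cor(cells, spots)` in one call. In this sketch the full matrix is still assembled at the end, so the savings are limited to intermediates; a memory-bound implementation would more likely process or write out each block before moving on.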
# Optimized for large datasets
results <- run_cytospace(
  ...,
  chunk_size = 5000,
  # Leave one core free, but never request fewer than one worker
  n_workers = max(1, parallel::detectCores() - 1, na.rm = TRUE),
  downsample = TRUE,
  downsample_target = 1500
)
Citation
If you use CytoSPACER in your research, please cite:
Vahid MR, Brown EL, Steen CB, Zhang W, Jeon HS, Kang M, Gentles AJ, Newman AM. High-resolution alignment of single-cell and spatial transcriptomes with CytoSPACE. Nature Biotechnology 41, 1543–1548 (2023). https://doi.org/10.1038/s41587-023-01697-9
References
- CytoSPACE: Vahid et al. (2023). Nature Biotechnology. DOI: 10.1038/s41587-023-01697-9
- Jonker-Volgenant Algorithm: Jonker R, Volgenant A. (1987). A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing 38(4):325-340.
- Original Implementation: https://github.com/digitalcytometry/cytospace
License
This project is licensed under the MIT License. See LICENSE for details.
Contributing
Contributions are welcome! Please submit issues and pull requests on GitHub.
CytoSPACER: Bridging single-cell and spatial transcriptomics through optimal transport