Detect spatially variable genes using nnSVG, a method based on nearest-neighbor Gaussian processes for scalable spatial modeling.
nnSVG uses nearest-neighbor Gaussian processes (NNGP) to model spatial correlation structure in gene expression. It performs likelihood ratio tests comparing spatial vs. non-spatial models to identify SVGs.
Arguments
- expr_matrix
Numeric matrix of gene expression values.
Rows: genes
Columns: spatial locations (spots/cells)
Values: log-normalized counts (e.g., from scran::logNormCounts)
- spatial_coords
Numeric matrix of spatial coordinates.
Rows: spatial locations (must match columns of expr_matrix)
Columns: x, y coordinates
- X
Optional numeric matrix of covariates to regress out.
Rows: spatial locations (same order as spatial_coords)
Columns: covariates (e.g., batch, cell type indicators)
Default is NULL (intercept-only model).
- n_neighbors
Integer. Number of nearest neighbors for NNGP model. Default is 10.
5-10: Faster, captures local patterns
15-20: Better likelihood estimates, slower
Values > 15 rarely improve results but increase computation time.
- order
Character string specifying coordinate ordering scheme.
"AMMD" (default): Approximate Maximum Minimum Distance. Better for most datasets. Requires >= 65 spots.
"Sum_coords": Order by sum of coordinates. Use for very small datasets (< 65 spots).
- cov_model
Character string specifying the covariance function. Default is "exponential".
"exponential": Most commonly used, computationally stable
"gaussian": Smoother patterns, requires stabilization
"spherical": Finite range correlation
"matern": Flexible smoothness (includes additional nu parameter)
- adjust_method
Character string for p-value adjustment. Default is "BH" (Benjamini-Hochberg).
- n_threads
Integer. Number of parallel threads. Default is 1. Set to number of available cores for faster computation.
- verbose
Logical. Print progress messages. Default is FALSE.
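The shape requirements above can be checked before fitting. The following is a minimal sketch in base R; `check_svg_inputs` is a hypothetical helper written for illustration and is not part of this package. It verifies that the columns of `expr_matrix` line up with the rows of `spatial_coords` and suggests an `order` value based on the 65-spot threshold stated above.

```r
# Hypothetical helper (not part of the package): validate inputs and
# suggest an ordering scheme before calling CalSVG_nnSVG().
check_svg_inputs <- function(expr_matrix, spatial_coords, X = NULL) {
  stopifnot(is.matrix(expr_matrix), is.numeric(expr_matrix))
  stopifnot(is.matrix(spatial_coords), is.numeric(spatial_coords))
  # Columns of expr_matrix are spots; rows of spatial_coords are the same spots.
  stopifnot(ncol(expr_matrix) == nrow(spatial_coords))
  stopifnot(ncol(spatial_coords) == 2)
  if (!is.null(X)) {
    stopifnot(is.matrix(X), nrow(X) == nrow(spatial_coords))
  }
  # AMMD ordering needs at least 65 spots; otherwise fall back to Sum_coords.
  if (nrow(spatial_coords) >= 65) "AMMD" else "Sum_coords"
}

# Simulated inputs, for illustration only.
set.seed(1)
n_spots <- 100
coords <- cbind(x = runif(n_spots), y = runif(n_spots))
expr <- matrix(rnorm(5 * n_spots), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))

order_choice <- check_svg_inputs(expr, coords)
small_choice <- check_svg_inputs(expr[, 1:30], coords[1:30, ])
```

The returned string could then be passed as the `order` argument.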
Value
A data.frame with SVG detection results. Columns:
gene: Gene identifier
sigma.sq: Spatial variance estimate (sigma^2)
tau.sq: Nonspatial variance estimate (tau^2, nugget)
phi: Range parameter estimate (controls spatial correlation decay)
prop_sv: Proportion of spatial variance = sigma.sq / (sigma.sq + tau.sq)
loglik: Log-likelihood of spatial model
loglik_lm: Log-likelihood of non-spatial model (linear model)
LR_stat: Likelihood ratio test statistic = -2 * (loglik_lm - loglik)
rank: Rank by LR statistic (1 = highest)
p.value: P-value from chi-squared distribution (df = 2)
p.adj: Adjusted p-value
runtime: Computation time per gene (seconds)
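To make the column definitions concrete, here is a small sketch in base R showing how the derived columns relate to the fitted quantities. The numbers are made up for illustration and are not output from any real fit; only the formulas are taken from the column descriptions above.

```r
# Illustrative (made-up) variance estimates and log-likelihoods for three genes.
fits <- data.frame(
  gene      = c("geneA", "geneB", "geneC"),
  sigma.sq  = c(0.80, 0.05, 0.30),
  tau.sq    = c(0.20, 0.95, 0.30),
  loglik    = c(-120.0, -149.5, -135.0),
  loglik_lm = c(-150.0, -150.0, -145.0)
)

# Effect size: proportion of total variance attributed to the spatial term.
fits$prop_sv <- fits$sigma.sq / (fits$sigma.sq + fits$tau.sq)

# Likelihood ratio statistic and its rank (1 = largest statistic).
fits$LR_stat <- -2 * (fits$loglik_lm - fits$loglik)
fits$rank    <- rank(-fits$LR_stat)

# Upper-tail chi-squared p-value with df = 2, then BH adjustment.
fits$p.value <- pchisq(fits$LR_stat, df = 2, lower.tail = FALSE)
fits$p.adj   <- p.adjust(fits$p.value, method = "BH")

fits[order(fits$rank), c("gene", "prop_sv", "LR_stat", "rank", "p.adj")]
```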
Details
Method Overview:
nnSVG models gene expression as a Gaussian process: $$y = X\beta + \omega + \epsilon$$
where:
y = expression vector
X = covariate matrix, beta = coefficients
omega ~ GP(0, sigma^2 * C(phi)) = spatial random effect
epsilon ~ N(0, tau^2) = non-spatial noise
C(phi) = covariance function with range phi
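As a rough illustration of this model, the sketch below simulates one gene from it in base R. It assumes an intercept-only X and an exponential correlation of the form exp(-phi * d), where d is the Euclidean distance between spots; that parameterization and all parameter values are assumptions chosen for the example.

```r
set.seed(42)
n <- 200
coords <- cbind(x = runif(n), y = runif(n))

# Assumed parameter values, for illustration only.
sigma.sq <- 1.0    # spatial variance
tau.sq   <- 0.25   # non-spatial (nugget) variance
phi      <- 5      # decay parameter of the exponential correlation
beta0    <- 2      # intercept (X is a column of ones)

# Exponential correlation matrix C(phi) from pairwise distances.
D <- as.matrix(dist(coords))
C <- exp(-phi * D)

# Draw omega ~ N(0, sigma.sq * C) via a Cholesky factor.
# chol() returns upper-triangular U with t(U) %*% U = sigma.sq * C.
U <- chol(sigma.sq * C)
omega <- drop(t(U) %*% rnorm(n))

# Independent noise and the simulated expression vector.
epsilon <- rnorm(n, sd = sqrt(tau.sq))
y <- beta0 + omega + epsilon

# Proportion of spatial variance implied by the chosen parameters.
prop_sv_true <- sigma.sq / (sigma.sq + tau.sq)
```

This dense simulation costs O(n^3) because of the Cholesky factorization, which is the cost the nearest-neighbor approximation is designed to avoid.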
Nearest-Neighbor Approximation: A full GP has O(n^3) complexity per likelihood evaluation. NNGP conditions each location on at most k nearest neighbors, reducing complexity to O(n * k^3), which is linear in n for fixed k.
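The neighbor-set idea can be sketched in base R as follows. Locations are first ordered (here by the sum of coordinates, mirroring the "Sum_coords" option), and each location then conditions on at most k nearest neighbors among the locations that precede it in that ordering. This is a brute-force illustration that builds the full distance matrix for clarity; it is not how BRISC implements the neighbor search.

```r
set.seed(7)
n <- 50
k <- 10
coords <- cbind(x = runif(n), y = runif(n))

# Order locations by the sum of their coordinates.
ord <- order(coords[, 1] + coords[, 2])
coords_ord <- coords[ord, ]

# Full pairwise distances (O(n^2); used here only for readability).
D <- as.matrix(dist(coords_ord))

# For location i, keep the (at most) k nearest among locations 1..(i - 1).
neighbor_sets <- lapply(seq_len(n), function(i) {
  if (i == 1) return(integer(0))
  prev <- seq_len(i - 1)
  prev[order(D[i, prev])][seq_len(min(k, i - 1))]
})

# Each conditional density involves a system of size at most k x k,
# so the total cost scales as n * k^3 rather than n^3.
sizes <- lengths(neighbor_sets)
```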
Statistical Test: Likelihood ratio test comparing:
H0 (null): y = X*beta + epsilon (no spatial effect)
H1 (alternative): y = X*beta + omega + epsilon (with spatial effect)
The LR statistic is compared against a chi-squared distribution with df = 2, corresponding to the two additional spatial parameters (sigma.sq and phi) in H1.
Effect Size: Proportion of spatial variance (prop_sv) measures effect size:
prop_sv near 1: Strong spatial pattern
prop_sv near 0: Little spatial structure
Computational Notes:
Requires BRISC package for NNGP fitting
O(n * k^3) complexity per gene with NNGP approximation (linear in n for fixed k)
Parallelization over genes provides good speedup
Memory: O(n * k) per gene
References
Weber, L.M. et al. (2023) nnSVG for the scalable identification of spatially variable genes using nearest-neighbor Gaussian processes. Nature Communications.
Datta, A. et al. (2016) Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets. JASA.
See also
CalSVG, BRISC package documentation
Examples
# Load example data
data(example_svg_data)
expr <- example_svg_data$logcounts[1:10, ] # Small subset
coords <- example_svg_data$spatial_coords
# \donttest{
# Basic usage (requires BRISC package)
if (requireNamespace("BRISC", quietly = TRUE)) {
  results <- CalSVG_nnSVG(expr, coords, verbose = FALSE)
  head(results)
}
# }