
Detect spatially variable genes using methods implemented in Seurat, including Moran's I with inverse distance weights and Mark Variogram.

Identifies spatially variable genes (SVGs) using Moran's I with distance-based weights. The default, inverse squared distance weighting, follows Seurat's FindSpatiallyVariableFeatures function.

Usage

CalSVG_Seurat(
  expr_matrix,
  spatial_coords,
  weight_scheme = c("inverse_squared", "inverse", "gaussian"),
  bandwidth = NULL,
  adjust_method = "BH",
  n_threads = 1L,
  verbose = TRUE
)

Arguments

expr_matrix

Numeric matrix of gene expression values.

  • Rows: genes

  • Columns: spatial locations (spots/cells)

  • Values: scaled/normalized expression (Seurat typically uses scale.data)

spatial_coords

Numeric matrix of spatial coordinates.

  • Rows: spatial locations (must match columns of expr_matrix)

  • Columns: x, y coordinates

weight_scheme

Character string specifying the distance-based weighting.

  • "inverse_squared" (default): w_ij = 1 / d_ij^2 (Seurat default, emphasizes local neighbors)

  • "inverse": w_ij = 1 / d_ij (weights decay more slowly with distance, so distant locations contribute more than under "inverse_squared")

  • "gaussian": w_ij = exp(-d_ij^2 / (2 * bandwidth^2)) (controlled by bandwidth parameter)

bandwidth

Numeric. Bandwidth for Gaussian weighting. If NULL (default), it is set to the median pairwise distance between locations. Only used when weight_scheme = "gaussian".
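
The three weighting formulas and the bandwidth default can be sketched in base R as follows. This is only an illustration of the documented formulas: `make_weights` is a hypothetical helper, not a function of this package, and setting the diagonal to zero (no self-weights) is an assumption.

```r
# Hypothetical helper (not part of the package): builds an n x n weight
# matrix from an n x 2 coordinate matrix using the documented formulas.
make_weights <- function(coords,
                         scheme = c("inverse_squared", "inverse", "gaussian"),
                         bandwidth = NULL) {
  scheme <- match.arg(scheme)
  d <- as.matrix(dist(coords))              # Euclidean distances d_ij
  if (scheme == "gaussian") {
    if (is.null(bandwidth)) {
      bandwidth <- median(d[upper.tri(d)])  # median pairwise distance
    }
    w <- exp(-d^2 / (2 * bandwidth^2))
  } else if (scheme == "inverse_squared") {
    w <- 1 / d^2
  } else {
    w <- 1 / d
  }
  diag(w) <- 0                              # assumption: no self-weights
  w
}

# Three collinear toy locations at x = 0, 1, 3
coords <- cbind(x = c(0, 1, 3), y = c(0, 0, 0))
w_sq  <- make_weights(coords, "inverse_squared")
w_inv <- make_weights(coords, "inverse")
w_gau <- make_weights(coords, "gaussian", bandwidth = 1)
```

Note that duplicated coordinates would give d_ij = 0 and therefore infinite inverse weights; the sketch does not handle that case.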

adjust_method

Character string for p-value adjustment. Default is "BH" (Benjamini-Hochberg).
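
As a small illustration of what the "BH" adjustment does, the sketch below applies base R's `p.adjust` to four made-up raw p-values. It assumes that `adjust_method` is passed on to `p.adjust`, which this page does not state explicitly.

```r
# Toy raw p-values (illustrative only, not output of CalSVG_Seurat)
p_raw <- c(0.001, 0.01, 0.03, 0.5)

# Benjamini-Hochberg adjustment: each p-value is scaled by n / rank,
# then made monotone from the largest p-value downwards
p_adj <- p.adjust(p_raw, method = "BH")
p_adj
#> [1] 0.004 0.020 0.040 0.500
```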

n_threads

Integer. Number of parallel threads. Default is 1.

verbose

Logical. Print progress messages. Default is TRUE.

Value

A data.frame with SVG detection results. Columns:

  • gene: Gene identifier

  • observed: Observed Moran's I statistic

  • expected: Expected Moran's I under the null hypothesis of no spatial autocorrelation

  • sd: Standard deviation of Moran's I under the null hypothesis

  • p.value: Raw p-value

  • p.adj: Adjusted p-value

  • rank: Rank by p-value (ascending)
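
A typical next step is to keep genes with a small adjusted p-value and positive autocorrelation. The sketch below uses a mock data.frame with the documented column names; the gene names, values, and the 0.05 cutoff are placeholders chosen for illustration.

```r
# Mock result with the documented columns (placeholder values)
results <- data.frame(
  gene     = c("gene_a", "gene_b", "gene_c"),
  observed = c(0.42, 0.08, -0.01),
  p.adj    = c(0.0006, 0.06, 0.6),
  rank     = 1:3,
  stringsAsFactors = FALSE
)

# Keep genes that pass the cutoff and show positive spatial autocorrelation
sig <- results[results$p.adj < 0.05 & results$observed > 0, ]
sig$gene
#> [1] "gene_a"
```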

Details

Method Overview:

This function replicates Seurat's FindSpatiallyVariableFeatures with selection.method = "moransi". The key difference from other Moran's I implementations is the weighting scheme:

$$w_{ij} = \frac{1}{d_{ij}^2}$$

where d_ij is the Euclidean distance between locations i and j.
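
For reference, the textbook Moran's I statistic for a gene with values x_i at n locations is

$$I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i}(x_i-\bar{x})^2}$$

with expected value -1 / (n - 1) under the null. The sketch below computes both on a toy grid. `morans_i` is a hypothetical helper written for this illustration; the package or Seurat may differ in details, for example by row-normalizing the weights. Under this formula, the expected value of -0.002004008 shown in the Examples section equals -1/499, which would correspond to 500 locations.

```r
# Hypothetical helper: textbook Moran's I for one expression vector x
# and a weight matrix w with zero diagonal
morans_i <- function(x, w) {
  n <- length(x)
  z <- x - mean(x)                                   # centered values
  observed <- (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
  c(observed = observed, expected = -1 / (n - 1))
}

# Toy 5 x 5 grid with a left-to-right expression gradient
coords <- as.matrix(expand.grid(x = 1:5, y = 1:5))
d <- as.matrix(dist(coords))
w <- 1 / d^2          # inverse squared distance weights
diag(w) <- 0          # assumption: no self-weights
x <- coords[, "x"]    # expression equals the x coordinate

res <- morans_i(x, w)
res
```

On this smooth gradient the observed value is positive and well above the null expectation of -1/24, as expected for a spatially structured pattern.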

Interpretation:

  • Uses continuous distance-based weights (not binary network)

  • Emphasizes local spatial relationships

  • Higher weights for closer neighbors

Comparison with MERINGUE:

  • MERINGUE: Binary adjacency (neighbors = 1, others = 0)

  • Seurat: Continuous weights (1/distance^2)

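  • Because the weights decay continuously with distance and have no neighbor cutoff, the Seurat scheme gives the nearest locations the greatest influence and is therefore more sensitive to local patterns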
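
The difference between the two weighting styles can be seen on a single focal location with three others at increasing distance. In this sketch the binary weights use a simple distance cutoff as a stand-in; how MERINGUE actually defines neighbors is not described on this page, so the cutoff rule and its value are assumptions.

```r
# Distances from one focal location to three other locations
d <- c(1, 2, 3)

# Binary adjacency via a hypothetical cutoff: neighbor if d <= 1.5
w_binary <- as.numeric(d <= 1.5)

# Continuous inverse squared distance weights
w_continuous <- 1 / d^2

rbind(binary = w_binary, continuous = w_continuous)
```

With binary weights only the closest location contributes, and it would contribute equally to any other location inside the cutoff. With continuous weights every location contributes, but the contribution at distance 3 is one ninth of that at distance 1.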

References

Hao, Y. et al. (2021) Integrated analysis of multimodal single-cell data. Cell.

Stuart, T. et al. (2019) Comprehensive Integration of Single-Cell Data. Cell.

Examples

# Load example data
data(example_svg_data)
expr <- example_svg_data$logcounts[1:20, ]
coords <- example_svg_data$spatial_coords

# \donttest{
# Basic usage
results <- CalSVG_Seurat(expr, coords, verbose = FALSE)
head(results)
#>      gene  observed     expected          sd p.value p.adj rank
#> 1  gene_7 0.6772728 -0.002004008 0.009605248       0     0    1
#> 2 gene_17 0.5345832 -0.002004008 0.009597744       0     0    2
#> 3 gene_19 0.5149447 -0.002004008 0.009594016       0     0    3
#> 4  gene_9 0.5090348 -0.002004008 0.009574655       0     0    4
#> 5 gene_12 0.4738920 -0.002004008 0.009579223       0     0    5
#> 6  gene_5 0.4564260 -0.002004008 0.009576880       0     0    6
# }