Detect spatially variable genes using SPARK-X, a non-parametric method that tests for spatial expression patterns using multiple kernels.

SPARK-X is a scalable non-parametric method for identifying spatially variable genes. It uses variance component score tests with multiple spatial kernels (projection, Gaussian, and cosine) to detect various types of spatial expression patterns.

Usage

CalSVG_SPARKX(
  expr_matrix,
  spatial_coords,
  kernel_option = c("mixture", "single"),
  adjust_method = "BY",
  n_threads = 1L,
  verbose = TRUE
)

Arguments

expr_matrix

Numeric matrix of gene expression values.

  • Rows: genes

  • Columns: spatial locations (spots/cells)

  • Values: raw counts or normalized counts (NOT log-transformed)

Note: SPARK-X works best with count data, not log-transformed data.

spatial_coords

Numeric matrix of spatial coordinates.

  • Rows: spatial locations (must match columns of expr_matrix)

  • Columns: x, y coordinates
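The orientation requirements above (genes in rows of expr_matrix, locations in rows of spatial_coords) are easy to get wrong, so a pre-flight check can help. The sketch below is illustrative only: check_svg_inputs is a hypothetical helper, not a function of this package, and the toy data are simulated. It checks only that the dimensions agree, not that spot names match.

```r
# Hypothetical helper (not part of the package): sanity-check the inputs
check_svg_inputs <- function(expr_matrix, spatial_coords) {
  stopifnot(is.matrix(expr_matrix), is.numeric(expr_matrix))
  stopifnot(is.matrix(spatial_coords), is.numeric(spatial_coords))
  # Columns of expr_matrix and rows of spatial_coords are both locations
  if (ncol(expr_matrix) != nrow(spatial_coords)) {
    stop("ncol(expr_matrix) must equal nrow(spatial_coords)")
  }
  # Counts cannot be negative; negative values hint at scaled or log data
  if (any(expr_matrix < 0, na.rm = TRUE)) {
    warning("negative values found; data may be log-transformed or scaled")
  }
  invisible(TRUE)
}

# Simulated toy data: 5 genes x 30 spots
set.seed(1)
toy_expr <- matrix(rpois(5 * 30, lambda = 3), nrow = 5,
                   dimnames = list(paste0("gene_", 1:5), paste0("spot_", 1:30)))
toy_coords <- cbind(x = runif(30), y = runif(30))
rownames(toy_coords) <- colnames(toy_expr)

ok <- check_svg_inputs(toy_expr, toy_coords)
```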

kernel_option

Character string specifying which kernels to use.

  • "mixture" (default): Test with all 11 kernels: 1 projection + 5 Gaussian + 5 cosine. Most comprehensive but slower. Recommended for detecting diverse spatial patterns.

  • "single": Test with projection kernel only. Faster but may miss some pattern types.

adjust_method

Character string for p-value adjustment. Default is "BY" (Benjamini-Yekutieli), which is more conservative and appropriate when tests may be correlated. Other options: "BH", "bonferroni", "holm", "none".
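The option names match those of stats::p.adjust, so the effect of choosing "BY" over "BH" can be previewed with base R. The p-values below are made up for illustration, and whether the function calls p.adjust internally is an assumption.

```r
# Made-up p-values, for illustration only
p <- c(1e-6, 0.001, 0.01, 0.03, 0.2, 0.5)

adj <- data.frame(
  p.value = p,
  BH = p.adjust(p, method = "BH"),  # Benjamini-Hochberg
  BY = p.adjust(p, method = "BY")   # Benjamini-Yekutieli
)
print(adj)
```

BY adjusted values are never smaller than the BH values, which is the sense in which BY is the more conservative choice.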

n_threads

Integer. Number of parallel threads. Default is 1. Values above 1 can substantially reduce run time on large datasets.

verbose

Logical. Print progress messages. Default is TRUE.

Value

A data.frame with SVG detection results. Columns:

  • gene: Gene identifier

  • p.value: Combined p-value across all kernels (ACAT method)

  • p.adj: Multiple testing adjusted p-value

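  • stat_*, pval_*: Per-kernel statistics and p-values. With kernel_option = "single" these are the projection kernel columns (stat_linear and pval_linear in the example output); with kernel_option = "mixture" there are additional columns for the other kernels.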

Details

Method Overview:

SPARK-X uses a variance component score test framework:

$$T_g = \frac{n \cdot y_g^T K y_g}{\|y_g\|^2}$$

where:

  • y_g = expression vector for gene g

  • K = spatial kernel matrix (derived from coordinates)

  • n = number of spatial locations
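As a rough illustration of the statistic, the sketch below evaluates it for a projection kernel K = X (X'X)^{-1} X' built from centered coordinates X. It assumes that y_g is mean-centered and that this is the form of the kernel; the package's internal scaling may differ, and sparkx_stat_sketch is a hypothetical name. K y is computed through the two-column matrix X, so the n x n kernel is never formed.

```r
# Sketch of T_g = n * y' K y / ||y||^2 with a projection kernel.
# Assumption: y and the coordinates are mean-centered.
sparkx_stat_sketch <- function(y, coords) {
  n <- length(y)
  y <- y - mean(y)
  X <- scale(coords, center = TRUE, scale = FALSE)
  # K y for K = X (X'X)^{-1} X', without building the n x n matrix
  Ky <- X %*% solve(crossprod(X), crossprod(X, y))
  n * sum(y * Ky) / sum(y^2)
}

# Simulated data: one gene with a trend along x, one without spatial signal
set.seed(42)
n <- 200
coords <- cbind(x = runif(n), y = runif(n))
gradient_gene <- rpois(n, lambda = 1 + 10 * coords[, "x"])
noise_gene <- rpois(n, lambda = 5)

stat_gradient <- sparkx_stat_sketch(gradient_gene, coords)
stat_noise <- sparkx_stat_sketch(noise_gene, coords)
```

Under these assumptions the statistic equals n times the R-squared of a linear regression of expression on the coordinates, so it lies between 0 and n and is much larger for the gradient gene than for the noise gene.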

Kernel Types:

  • Projection kernel: Linear kernel based on scaled coordinates. Detects gradients and linear spatial trends.

  • Gaussian kernels: Multiple bandwidth Gaussian RBF kernels. Detect localized hotspots of different sizes.

  • Cosine kernels: Multiple frequency periodic kernels. Detect periodic/oscillating spatial patterns.
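One way to picture the three kernel families is as transformations of the coordinates followed by the same linear (projection) test. The sketch below assumes a Gaussian transform exp(-x^2 / (2 l^2)) and a cosine transform cos(2 pi x / l); the function names and the value of l (scale_param) are hypothetical, and the bandwidths and periods the package actually uses are chosen internally.

```r
# Sketch: coordinate transforms behind the three kernel families (assumed forms)
transform_coords <- function(coords,
                             type = c("projection", "gaussian", "cosine"),
                             scale_param = 1) {
  type <- match.arg(type)
  X <- scale(coords, center = TRUE, scale = FALSE)
  switch(type,
    projection = X,
    gaussian = exp(-X^2 / (2 * scale_param^2)),
    cosine = cos(2 * pi * X / scale_param)
  )
}

# R-squared of expression regressed on the (transformed) coordinates
r2 <- function(y, Z) summary(lm(y ~ Z))$r.squared

# Simulated gene with a single hotspot in the middle of the tissue
set.seed(7)
n <- 500
coords <- cbind(x = runif(n, -1, 1), y = runif(n, -1, 1))
hotspot_gene <- rpois(n, lambda = 1 + 10 * exp(-rowSums(coords^2) / (2 * 0.3^2)))

r2_projection <- r2(hotspot_gene, transform_coords(coords, "projection"))
r2_gaussian <- r2(hotspot_gene, transform_coords(coords, "gaussian", scale_param = 0.3))
```

A centered hotspot has no linear trend, so the projection fit explains almost nothing while the Gaussian transform picks it up. This is the kind of pattern the "single" option may miss.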

P-value Computation:

  • Individual kernel p-values: Davies' method for quadratic forms

  • Combined p-value: ACAT (Aggregated Cauchy Association Test)
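The Cauchy combination behind ACAT is short enough to sketch directly. The version below uses equal weights and assumes all p-values lie strictly between 0 and 1; acat_sketch is a hypothetical name, and the package's own implementation may weight kernels differently and handle extreme p-values specially.

```r
# Minimal sketch of the Cauchy combination (ACAT) with equal weights.
# Assumption: all p-values lie strictly in (0, 1).
acat_sketch <- function(pvals) {
  stopifnot(all(pvals > 0 & pvals < 1))
  # Map each p-value to a standard Cauchy variate and average
  t_stat <- mean(tan((0.5 - pvals) * pi))
  # Upper tail probability of the standard Cauchy distribution
  0.5 - atan(t_stat) / pi
}

acat_same <- acat_sketch(c(0.05, 0.05, 0.05))   # identical inputs
acat_mixed <- acat_sketch(c(1e-4, 0.3, 0.8))    # one strong signal
```

Identical inputs are returned unchanged, and a single very small p-value dominates the combination, so a gene detected by only one kernel is still flagged.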

Advantages:

  • Non-parametric: No distributional assumptions

  • Scalable: Computation is O(n) in the number of spatial locations n, so datasets with millions of cells are feasible

  • Multiple kernels: Detects diverse pattern types

  • Robust: ACAT combination handles correlated tests

Computational Considerations:

  • mixture option: ~11x slower than single

  • Memory: O(n) per gene, efficient for large datasets

  • Parallelization (n_threads > 1) can provide near-linear speedup

References

Zhu, J., Sun, S., & Zhou, X. (2021). SPARK-X: non-parametric modeling enables scalable and robust detection of spatial expression patterns for large spatial transcriptomic studies. Genome Biology, 22, 184.

Examples

# Load example data
data(example_svg_data)
expr <- example_svg_data$counts[1:20, ]  # Use counts (not log)
coords <- example_svg_data$spatial_coords

# Fast mode with single kernel (no extra dependencies)
results <- CalSVG_SPARKX(expr, coords,
                         kernel_option = "single",
                         verbose = FALSE)
head(results)
#>      gene       p.value         p.adj stat_linear   pval_linear
#> 1  gene_4 2.326245e-170 1.673845e-168   207.02459 2.326245e-170
#> 2 gene_12 6.624465e-154 2.383310e-152   283.48558 6.624465e-154
#> 3  gene_5 4.116361e-153 9.873063e-152   262.55227 4.116361e-153
#> 4 gene_10 1.820755e-149 3.275302e-148   183.99255 1.820755e-149
#> 5  gene_7 5.268820e-105 7.582337e-104   294.45147 5.268820e-105
#> 6 gene_14  1.151904e-51  1.381416e-50    88.57613  1.151904e-51