
Projects query data onto reference gene expression programs (spectra) using non-negative least squares (NNLS). For each cell, solves the optimization:

$$\min_{w_i \geq 0} ||x_i - H^T w_i||_2^2$$

where \(x_i\) is the expression vector (length: genes) for cell \(i\), \(H\) is the reference spectra matrix (programs x genes), and \(w_i\) is the non-negative usage vector (length: programs) for that cell.
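To make the objective concrete, here is a minimal base R sketch of a coordinate descent NNLS solve for a single cell. It is an illustration under stated assumptions, not the package's implementation: `nnls_cd` is a hypothetical helper, and the zero starting point and stopping rule (largest coordinate change below `tol`) are choices made for this example only.

```r
# Hypothetical helper for illustration; not part of the documented API.
# Minimizes ||x - t(H) %*% w||^2 subject to w >= 0 for one cell.
nnls_cd <- function(x, H, max_iter = 1000L, tol = 1e-8) {
  A <- t(H)                  # genes x programs
  w <- numeric(ncol(A))      # start from w = 0, which is feasible
  r <- x                     # residual x - A %*% w
  col_sq <- colSums(A^2)     # squared norm of each program's spectrum
  for (iter in seq_len(max_iter)) {
    max_delta <- 0
    for (k in seq_along(w)) {
      if (col_sq[k] == 0) next
      # Exact minimizer along coordinate k, clipped at zero
      w_new <- max(0, w[k] + sum(A[, k] * r) / col_sq[k])
      delta <- w_new - w[k]
      if (delta != 0) {
        r <- r - A[, k] * delta   # keep the residual in sync with w
        w[k] <- w_new
        max_delta <- max(max_delta, abs(delta))
      }
    }
    if (max_delta < tol) break    # assumed stopping rule for this sketch
  }
  w
}

# Toy check: a noiseless cell built from known non-negative usages
set.seed(1)
H <- matrix(runif(3 * 20), nrow = 3)   # 3 programs x 20 genes
w_true <- c(2, 0, 0.5)
x <- as.vector(t(H) %*% w_true)
w_hat <- nnls_cd(x, H)
```

Because the toy cell is an exact non-negative combination of the rows of `H`, `w_hat` should recover `w_true` to within the tolerance.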

Usage

fit_usage(
  X,
  H,
  method = c("cd", "active_set"),
  max_iter = 1000L,
  tol = 1e-08,
  n_workers = 1L,
  verbose = TRUE
)

Arguments

X

Query matrix (cells x genes), should be pre-processed

H

Reference spectra matrix (programs x genes)

method

Solver method: "cd" for coordinate descent (default), "active_set" for Lawson-Hanson active set algorithm

max_iter

Maximum iterations per cell (default: 1000)

tol

Convergence tolerance (default: 1e-8)

n_workers

Number of parallel workers for batch processing (default: 1)

verbose

Print progress messages (default: TRUE)

Value

Usage matrix W (cells x programs)

Details

Coordinate descent is generally faster for problems of this kind, while the active set method (Lawson-Hanson) is guaranteed to terminate in a finite number of iterations. Both methods solve the same convex problem, so their results should agree up to the convergence tolerance.
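The following sketch shows how a call might look on simulated data, using only the arguments listed under Usage. The data dimensions and the exponential simulation are assumptions made for this example, and the call is guarded with `exists()` so that the snippet also runs when the package providing `fit_usage` is not attached.

```r
set.seed(42)
n_cells <- 50
n_genes <- 200
n_programs <- 5

# Simulated reference spectra (programs x genes) and true usages (cells x programs)
H <- matrix(rexp(n_programs * n_genes), nrow = n_programs)
W_true <- matrix(rexp(n_cells * n_programs), nrow = n_cells)

# Query matrix (cells x genes) built as a non-negative mixture of the programs
X <- W_true %*% H

if (exists("fit_usage")) {
  W <- fit_usage(
    X, H,
    method = "cd",
    max_iter = 1000L,
    tol = 1e-08,
    n_workers = 1L,
    verbose = FALSE
  )
  # Per the Value section, W should be cells x programs
  stopifnot(all(dim(W) == c(n_cells, n_programs)))
}
```

Setting `method = "active_set"` in the same call is one way to compare the two solvers on identical input.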