Run Complete TorchDecon Workflow (RunTorchDecon)

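A convenience function that runs the complete TorchDecon workflow in one call: simulate bulk samples from the annotated Seurat object, process the simulated data, train a model (an ensemble by default, or a single model), and, if bulk_data is supplied, predict cell fractions for it.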

Usage

RunTorchDecon(
  seurat_object,
  bulk_data = NULL,
  celltype_col = NULL,
  assay = NULL,
  n_samples = 1000L,
  cells_per_sample = 100L,
  sparse_fraction = 0.5,
  unknown_celltypes = NULL,
  num_steps = 1000L,
  batch_size = 128L,
  learning_rate = 1e-04,
  validation_split = 0,
  early_stopping = FALSE,
  patience = 100L,
  var_cutoff = 0.1,
  scaling = "log_min_max",
  model_type = c("ensemble", "single"),
  architecture = c("m512", "m256", "m1024"),
  device = "auto",
  save_model = NULL,
  seed = 42L,
  verbose = TRUE,
  n_cores = 1L
)

Arguments

seurat_object: A Seurat object with cell type annotations.
bulk_data: Matrix of bulk RNA-seq data for prediction (genes x samples). If NULL, only training is performed.
celltype_col: Character. Name of the metadata column that holds the cell type labels. Default is NULL.
assay: Character. Assay to use from the Seurat object. Default is NULL (use the object's default assay).
n_samples: Integer. Number of bulk samples to simulate. Default is 1000.
cells_per_sample: Integer. Cells per simulated sample. Default is 100.
sparse_fraction: Numeric. Fraction of sparse samples (0-1). Default is 0.5.
unknown_celltypes: Character vector. Cell types to merge into "Unknown". Default is NULL.
num_steps: Integer. Training steps per model. Default is 1000 (matches the Python implementation).
batch_size: Integer. Training batch size. Default is 128.
learning_rate: Numeric. Learning rate. Default is 0.0001.
validation_split: Numeric. Fraction of the training data held out for validation (0-1). Default is 0 (no validation set).
early_stopping: Logical. Enable early stopping. Default is FALSE.
patience: Integer. Early stopping patience. Default is 100.
var_cutoff: Numeric. Variance cutoff for gene filtering. Default is 0.1.
scaling: Character. Scaling method: "log_min_max", "log_zscore", or "none". Default is "log_min_max".
model_type: Character. "ensemble" or "single". Default is "ensemble".
architecture: Character. Architecture for single model: "m256", "m512", "m1024". Default is "m512".
device: Character. "auto", "cpu", or "cuda". Default is "auto".
save_model: Character. Path to save trained model. Default is NULL (don't save).
seed: Integer. Random seed. Default is 42.
verbose: Logical. Print progress. Default is TRUE.
n_cores: Integer. Cores for parallel simulation. Default is 1.
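The var_cutoff and scaling arguments describe a preprocessing step that is easier to picture with a small example. The following is a minimal sketch only, under stated assumptions: it assumes the variance filter keeps genes whose variance across samples is at least var_cutoff, and that "log_min_max" means log2(x + 1) followed by rescaling each sample to [0, 1]. The helpers filter_genes and scale_log_min_max are hypothetical and are not TorchDecon functions; the package's actual processing may differ.

```r
# Toy genes x samples matrix with made-up expression values.
bulk <- matrix(
  c(0, 10, 100, 7,    # sample s1
    5, 20,  50, 7,    # sample s2
    3, 40,  10, 7),   # sample s3
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:3))
)

# Assumption: keep genes whose variance across samples is >= var_cutoff.
filter_genes <- function(x, var_cutoff = 0.1) {
  gene_var <- apply(x, 1, var)
  x[gene_var >= var_cutoff, , drop = FALSE]
}

# Assumption: "log_min_max" = log2(x + 1), then min-max scale each
# sample (column) to the range [0, 1].
scale_log_min_max <- function(x) {
  logged <- log2(x + 1)
  out <- apply(logged, 2, function(col) {
    rng <- range(col)
    if (rng[1] == rng[2]) {
      return(rep(0, length(col)))  # constant column: avoid division by zero
    }
    (col - rng[1]) / (rng[2] - rng[1])
  })
  dimnames(out) <- dimnames(x)
  out
}

filtered <- filter_genes(bulk, var_cutoff = 0.1)  # gene4 is constant and is dropped
scaled <- scale_log_min_max(filtered)
print(round(scaled, 3))
```

In this sketch the same two steps would have to be applied to both the simulated training data and the real bulk data, so that the model sees inputs on the same scale at training and prediction time.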

Value

A list containing:

model: The trained TorchDeconModel or TorchDeconEnsemble
predictions: Predicted cell fractions (if bulk_data provided)
simulation: The simulation object
processed: The processed training data
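Because predictions is only filled in when bulk_data is provided, downstream code may want to check for it before use. The sketch below illustrates one way to do that; get_predictions is a hypothetical helper, not part of TorchDecon, and the mock lists only imitate the element names listed above. The samples x cell types layout of the mock predictions is likewise an assumption.

```r
# Hypothetical helper (not a TorchDecon function): return the predictions,
# or stop with a clear message when the run was training-only.
get_predictions <- function(result) {
  if (is.null(result$predictions)) {
    stop("No predictions in result: was bulk_data supplied?", call. = FALSE)
  }
  result$predictions
}

# Mock results standing in for RunTorchDecon() output (values are made up).
with_bulk <- list(
  model = NULL,
  predictions = matrix(
    c(0.6, 0.4),
    nrow = 1,
    dimnames = list("sample1", c("T cell", "B cell"))
  ),
  simulation = NULL,
  processed = NULL
)
train_only <- list(model = NULL, simulation = NULL, processed = NULL)

print(get_predictions(with_bulk))

# A training-only result raises an error instead of silently returning NULL.
outcome <- try(get_predictions(train_only), silent = TRUE)
print(inherits(outcome, "try-error"))
```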

Examples

if (FALSE) { # not run
# Complete workflow with ensemble (default)
result <- RunTorchDecon(
  seurat_object = my_seurat,
  bulk_data = bulk_expression,
  celltype_col = "cell_type",
  n_samples = 2000,
  num_steps = 1000
)

# Single model with early stopping
result <- RunTorchDecon(
  seurat_object = my_seurat,
  bulk_data = bulk_expression,
  model_type = "single",
  architecture = "m1024",
  validation_split = 0.1,
  early_stopping = TRUE
)

# Get predictions
predictions <- result$predictions
} # end not run
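A training-only run, which the bulk_data argument description allows by leaving bulk_data at NULL, might look like the sketch below. It uses only arguments from the Usage section; my_seurat and the "cell_type" column are placeholders as in the examples above, and model_path is a hypothetical location. This page does not state whether save_model expects a file or a directory, so the path is illustrative only.

```r
# Hypothetical output location for the trained model.
model_path <- file.path(tempdir(), "torchdecon_model")

if (FALSE) { # not run: requires TorchDecon and an annotated Seurat object
  # bulk_data is left at its default (NULL), so only training is performed.
  trained <- RunTorchDecon(
    seurat_object = my_seurat,
    celltype_col = "cell_type",
    device = "cpu",
    save_model = model_path,
    seed = 42L
  )

  # The trained model is returned in the result list.
  trained$model
}
```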