
A convenience function that runs the complete TorchDecon workflow: it simulates bulk data, processes it, trains a model (an ensemble by default), and optionally predicts cell fractions for the supplied bulk data.

Usage

RunTorchDecon(
  seurat_object,
  bulk_data = NULL,
  celltype_col = NULL,
  assay = NULL,
  n_samples = 1000L,
  cells_per_sample = 100L,
  sparse_fraction = 0.5,
  unknown_celltypes = NULL,
  num_steps = 1000L,
  batch_size = 128L,
  learning_rate = 1e-04,
  validation_split = 0,
  early_stopping = FALSE,
  patience = 100L,
  var_cutoff = 0.1,
  scaling = "log_min_max",
  model_type = c("ensemble", "single"),
  architecture = c("m512", "m256", "m1024"),
  device = "auto",
  save_model = NULL,
  seed = 42L,
  verbose = TRUE,
  n_cores = 1L
)

Arguments

seurat_object

A Seurat object with cell type annotations.

bulk_data

Matrix of bulk RNA-seq data for prediction (genes x samples). If NULL, only training is performed.

celltype_col

Character. Metadata column with cell type labels.

assay

Character. Assay to use from Seurat object. Default is NULL (default assay).

n_samples

Integer. Number of bulk samples to simulate. Default is 1000.

cells_per_sample

Integer. Cells per simulated sample. Default is 100.

sparse_fraction

Numeric. Fraction of sparse samples (0-1). Default is 0.5.

unknown_celltypes

Character vector. Cell types to merge into "Unknown". Default is NULL.

num_steps

Integer. Training steps per model. Default is 1000 (matches the default of the Python implementation).

batch_size

Integer. Training batch size. Default is 128.

learning_rate

Numeric. Learning rate. Default is 0.0001.

validation_split

Numeric. Fraction of the training data held out for validation (0-1). Default is 0 (no validation set).

early_stopping

Logical. Enable early stopping. Default is FALSE.

patience

Integer. Early stopping patience. Default is 100.

var_cutoff

Numeric. Variance cutoff for gene filtering. Default is 0.1.

scaling

Character. Scaling method: "log_min_max", "log_zscore", or "none". Default is "log_min_max".

model_type

Character. "ensemble" or "single". Default is "ensemble".

architecture

Character. Architecture used when model_type is "single": "m256", "m512", or "m1024". Default is "m512".

device

Character. "auto", "cpu", or "cuda". Default is "auto".

save_model

Character. Path to save trained model. Default is NULL (don't save).

seed

Integer. Random seed. Default is 42.

verbose

Logical. Print progress. Default is TRUE.

n_cores

Integer. Cores for parallel simulation. Default is 1.
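
The var_cutoff and scaling arguments describe preprocessing that can be pictured with a small base R sketch. This is not the package's implementation: the helper names `filter_by_variance` and `log_min_max` are hypothetical, and the sketch assumes that "log_min_max" means log2(x + 1) followed by per-sample min-max scaling and that the variance cutoff is applied per gene across samples. The actual order and details inside TorchDecon may differ.

```r
# Hypothetical helper: keep genes whose variance across samples is >= cutoff.
filter_by_variance <- function(x, cutoff = 0.1) {
  gene_var <- apply(x, 1, var)
  x[gene_var >= cutoff, , drop = FALSE]
}

# Hypothetical helper: one plausible reading of scaling = "log_min_max".
# x is a numeric matrix with genes in rows and samples in columns.
log_min_max <- function(x) {
  x <- log2(x + 1)
  apply(x, 2, function(col) {
    rng <- range(col)
    if (rng[2] == rng[1]) return(col * 0)  # constant column: map to 0
    (col - rng[1]) / (rng[2] - rng[1])
  })
}

# Made-up toy matrix (genes x samples); gene4 is constant across samples.
toy <- rbind(
  gene1 = c(0, 10, 20, 30),
  gene2 = c(5, 5, 6, 100),
  gene3 = c(50, 40, 30, 20),
  gene4 = c(7, 7, 7, 7)
)
colnames(toy) <- paste0("sample", 1:4)

filtered <- filter_by_variance(toy, cutoff = 0.1)  # drops gene4
scaled <- log_min_max(filtered)                    # each column now spans 0 to 1
```

Under these assumptions, every sample is rescaled independently, so samples with different sequencing depth end up on a comparable 0 to 1 range.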

Value

A list containing:

model

The trained TorchDeconModel or TorchDeconEnsemble

predictions

Predicted cell fractions (if bulk_data provided)

simulation

The simulation object

processed

The processed training data
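
Because predictions is only populated when bulk_data is provided, downstream code may want to guard against its absence. The sketch below uses a made-up stand-in for the returned list so that it runs without the package. It assumes that predictions is NULL after a training-only run and that the prediction matrix has samples in rows and cell types in columns; neither is confirmed by this page.

```r
# Toy stand-in for the list returned by RunTorchDecon(); all values are made up.
result <- list(
  model = "placeholder for a trained model",
  predictions = matrix(
    c(0.6, 0.3, 0.1,
      0.2, 0.5, 0.3),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("sample_A", "sample_B"),
                    c("T_cell", "B_cell", "Monocyte"))
  ),
  simulation = "placeholder for the simulation object",
  processed = "placeholder for the processed training data"
)

# Guard against a training-only run (assumed to leave predictions as NULL).
if (!is.null(result$predictions)) {
  fractions <- result$predictions
  # Assumption: each row holds the cell type fractions of one sample.
  row_totals <- rowSums(fractions)
  # Most abundant predicted cell type per sample.
  dominant <- colnames(fractions)[apply(fractions, 1, which.max)]
}
```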

Examples

if (FALSE) { # \dontrun{
# Complete workflow with ensemble (default)
result <- RunTorchDecon(
  seurat_object = my_seurat,
  bulk_data = bulk_expression,
  celltype_col = "cell_type",
  n_samples = 2000,
  num_steps = 1000
)

# Single model with early stopping
result <- RunTorchDecon(
  seurat_object = my_seurat,
  bulk_data = bulk_expression,
  model_type = "single",
  architecture = "m1024",
  validation_split = 0.1,
  early_stopping = TRUE
)

# Get predictions
predictions <- result$predictions
} # }
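
The bulk_data argument expects a matrix with genes in rows and samples in columns. When bulk expression arrives as a data frame with a gene identifier column, a conversion along the following lines may help. The gene names and counts here are made-up toy values, and `bulk_df` is a hypothetical input.

```r
# Made-up toy counts: one gene column plus one column per bulk sample.
bulk_df <- data.frame(
  gene = c("CD3E", "MS4A1", "LYZ"),
  sample_A = c(120, 15, 300),
  sample_B = c(40, 220, 180)
)

# Move gene identifiers into row names and keep only the numeric columns.
bulk_expression <- as.matrix(bulk_df[, -1])
rownames(bulk_expression) <- bulk_df$gene

# Basic sanity checks before passing the matrix as bulk_data.
stopifnot(is.numeric(bulk_expression))
stopifnot(!anyDuplicated(rownames(bulk_expression)))
```

The resulting `bulk_expression` has one row per gene and one column per sample, which is the orientation described for bulk_data.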