Converts continuous gene expression values to binary (0/1) using various methods. Used by the binSpect method.
Usage
binarize_expression(
  expr_matrix,
  method = c("kmeans", "rank", "median", "mean"),
  rank_percent = 30,
  n_threads = 1L,
  verbose = FALSE
)
Arguments
- expr_matrix
Numeric matrix of gene expression. Rows are genes, columns are spots/cells.
- method
Character string specifying binarization method.
"kmeans" (default): Use k-means clustering (k=2) to separate high and low expression.
"rank": Binarize based on expression rank percentile.
"median": Values above the median are set to 1.
"mean": Values above the mean are set to 1.
- rank_percent
Numeric. For method = "rank", the percentile threshold (0-100). Values in the top rank_percent percent are set to 1. Default is 30.
- n_threads
Integer. Number of threads for parallel computation. Default is 1.
- verbose
Logical. Whether to print progress. Default is FALSE.
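The "median" and "mean" options listed under method can be pictured as a simple threshold. The sketch below is illustrative only and is not the package's internal code: binarize_threshold is a hypothetical helper, and it assumes the threshold is computed separately for each gene (row) and that values equal to the threshold are set to 0.

```r
# Hypothetical helper: threshold each gene (row) at its own median or mean.
# Assumption: thresholds are per gene, and the comparison is strictly "above".
binarize_threshold <- function(expr_matrix, stat = c("median", "mean")) {
  stat <- match.arg(stat)
  fun <- if (stat == "median") stats::median else mean
  # One threshold per row; a vector of length nrow() recycles down each column,
  # so element [i, j] is compared against thr[i].
  thr <- apply(expr_matrix, 1, fun)
  bin <- expr_matrix > thr
  storage.mode(bin) <- "integer"  # TRUE/FALSE -> 1/0, keeping dim and dimnames
  bin
}

m <- matrix(c(1, 2, 3, 10,
              4, 4, 4, 4), nrow = 2, byrow = TRUE)
binarize_threshold(m, "median")
binarize_threshold(m, "mean")
```

Under these assumptions a gene with constant expression is all 0, because no value lies strictly above its own median or mean.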
Details
K-means method: For each gene, k-means clustering with k=2 is applied. The cluster with higher mean expression is labeled as 1, the other as 0.
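As a rough illustration of the k-means step for a single gene, the following sketch uses stats::kmeans from base R. binarize_gene_kmeans is a hypothetical name, and the handling of constant genes and the choice of nstart are assumptions rather than documented behavior of binarize_expression.

```r
# Hypothetical helper: binarize one gene's expression vector with k-means (k = 2).
binarize_gene_kmeans <- function(x) {
  # Assumption: a gene with fewer than two distinct values cannot be split
  # into two clusters, so every spot is labeled 0.
  if (length(unique(x)) < 2L) {
    return(integer(length(x)))
  }
  km <- stats::kmeans(x, centers = 2, nstart = 5)
  # The cluster whose center (mean expression) is higher is labeled 1.
  high <- which.max(km$centers[, 1])
  as.integer(km$cluster == high)
}

set.seed(1)
x <- c(rpois(50, lambda = 2), rpois(50, lambda = 20))
table(binarize_gene_kmeans(x))
```

Applying such a helper to every row, for example with t(apply(expr, 1, binarize_gene_kmeans)), would give a binary matrix with the same shape as the input.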
Rank method: For each gene, spots are ranked by expression. The top rank_percent percent are labeled as 1.
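The rank rule can be sketched for one gene as follows. binarize_gene_rank is a hypothetical helper; rounding the cutoff up with ceiling() and breaking ties by position are assumptions made for this illustration, since the documentation does not specify either.

```r
# Hypothetical helper: label the top rank_percent percent of spots as 1.
binarize_gene_rank <- function(x, rank_percent = 30) {
  # Number of spots to label 1 (assumption: round up).
  n_top <- ceiling(length(x) * rank_percent / 100)
  # Rank 1 is the highest expression; ties are broken by position (assumption).
  r <- rank(-x, ties.method = "first")
  as.integer(r <= n_top)
}

x <- c(5, 1, 9, 3, 7, 2, 8, 4, 6, 0)
binarize_gene_rank(x, rank_percent = 20)
# 0 0 1 0 0 0 1 0 0 0  (the two highest values, 9 and 8, are labeled 1)
```

Unlike the k-means rule, this always labels the same number of spots per gene, regardless of how the expression values are distributed.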
Examples
# Create example expression matrix
expr <- matrix(rpois(1000, lambda = 10), nrow = 10, ncol = 100)
rownames(expr) <- paste0("gene_", 1:10)
# Binarize using k-means
bin_kmeans <- binarize_expression(expr, method = "kmeans")
# Binarize using rank (top 20%)
bin_rank <- binarize_expression(expr, method = "rank", rank_percent = 20)