📖 Documentation: https://zaoqu-liu.github.io/MAGICR
Overview
MAGICR is a native R implementation of the MAGIC (Markov Affinity-based Graph Imputation of Cells) algorithm for denoising and imputation of single-cell RNA sequencing (scRNA-seq) data.
Single-cell RNA sequencing enables transcriptomic profiling at single-cell resolution, but technical limitations result in sparse count matrices with prevalent dropout events. MAGIC addresses this challenge by leveraging the underlying manifold structure of the data through diffusion geometry, effectively recovering missing transcripts while preserving biological signal.
Algorithm
MAGIC performs data imputation through the following steps:
- Dimensionality Reduction: Optional PCA to reduce computational complexity
- Graph Construction: Build a k-nearest neighbor (kNN) graph based on cell-cell distances
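- Kernel Computation: Construct an α-decaying kernel with adaptive bandwidth:
  $$K(i, j) = \exp\left(-\left(\frac{d(i, j)}{\sigma_i}\right)^{\alpha}\right)$$
  where $\sigma_i$ is the distance to the $k$-th nearest neighbor of cell $i$.
- Diffusion Operator: Compute the row-stochastic Markov transition matrix:
  $$M = D^{-1} K, \quad D_{ii} = \sum_j K(i, j)$$
- Imputation: Apply powered diffusion:
  $$X_{\text{imputed}} = M^{t} X$$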
The diffusion time $t$ controls the degree of smoothing. When `t = "auto"`, $t$ is chosen as the first iteration at which the Procrustes disparity between successive iterations falls below a threshold.
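The steps above can be sketched in a few lines of base R. This is a minimal illustration, not MAGICR's actual implementation: the function name `magic_sketch` and its argument `t_diff` are hypothetical, the kernel is dense (no PCA step and no `knn_max` truncation), and the `K + t(K)` symmetrization is an assumption carried over from the original MAGIC description.

```r
# Illustrative sketch only; names and simplifications are assumptions, not the MAGICR API.
magic_sketch <- function(X, knn = 5, decay = 1, t_diff = 3) {
  # X: cells x genes numeric matrix (assumed already normalized)
  D <- as.matrix(dist(X))                        # pairwise Euclidean cell-cell distances
  # Adaptive bandwidth: distance to the knn-th nearest neighbor
  # (position 1 of the sorted row is the cell itself, hence knn + 1)
  sigma <- apply(D, 1, function(d) sort(d)[knn + 1])
  K <- exp(-(D / sigma)^decay)                   # row i is divided by sigma[i]
  K <- K + t(K)                                  # symmetrize (assumption)
  M <- K / rowSums(K)                            # row-stochastic Markov matrix
  X_imp <- X
  for (s in seq_len(t_diff)) X_imp <- M %*% X_imp  # M^t X by repeated multiplication
  dimnames(X_imp) <- dimnames(X)
  list(imputed = X_imp, operator = M)
}

set.seed(1)
X <- matrix(rnorm(30 * 4), nrow = 30)            # toy data: 30 cells, 4 genes
res <- magic_sketch(X, knn = 5, decay = 1, t_diff = 3)
range(rowSums(res$operator))                     # every row of M sums to 1
```

Because each row of the operator is a set of non-negative weights summing to one, every imputed value is a weighted average of the observed values for that gene, which is why larger diffusion times smooth more aggressively.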
Installation
From R-universe (Recommended)
install.packages("MAGICR", repos = "https://zaoqu-liu.r-universe.dev")
From GitHub
# install.packages("remotes")
remotes::install_github("Zaoqu-Liu/MAGICR")
Usage
Parameter Tuning
| Parameter | Default | Description |
|---|---|---|
| `knn` | 5 | Number of nearest neighbors for bandwidth estimation |
| `knn_max` | 15 | Maximum neighbors for graph construction |
| `decay` | 1 | α parameter controlling kernel sharpness |
| `t` | 3 | Diffusion time (integer or `"auto"`) |
| `npca` | 100 | PCA components for distance computation |
| `solver` | `"exact"` | `"exact"` or `"approximate"` (PCA-space) |
| `distance` | `"euclidean"` | Distance metric: `"euclidean"`, `"cosine"`, `"correlation"` |
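To build intuition for `decay`, the sketch below evaluates an α-decaying kernel weight of the form exp(-(d/σ)^α) at a few distance-to-bandwidth ratios. It assumes this standard form of the kernel; the helper name `kernel_weight` is hypothetical and not part of MAGICR.

```r
# Kernel weight as a function of the ratio r = d / sigma and the decay exponent alpha
kernel_weight <- function(r, alpha) exp(-r^alpha)

ratios <- c(0.5, 1, 2)     # half the bandwidth, exactly the bandwidth, twice the bandwidth
alphas <- c(1, 2, 10)
w <- sapply(alphas, function(a) kernel_weight(ratios, a))
dimnames(w) <- list(paste0("d/sigma=", ratios), paste0("alpha=", alphas))
round(w, 3)
```

At d = σ the weight equals exp(-1) for every α. Larger α pushes weights inside the bandwidth toward 1 and weights outside it toward 0, so the kernel approaches a hard neighborhood cutoff.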
Methodological Details
Adaptive Bandwidth
The α-decaying kernel uses a cell-specific bandwidth $\sigma_i$, defined as the distance from cell $i$ to its $k$-th nearest neighbor. This provides:
- Local density adaptation
- Robustness to varying cell densities
- Preservation of rare cell populations
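A small simulation makes the density adaptation concrete. The sketch below is illustrative only and uses simulated 2-D points rather than real expression data: it computes the bandwidth as the distance to the 5th nearest neighbor for a tight cluster and for a small, spread-out cluster standing in for a rare population.

```r
set.seed(42)
dense  <- matrix(rnorm(40 * 2, mean = 0, sd = 0.1), ncol = 2)  # 40 tightly packed cells
sparse <- matrix(rnorm(10 * 2, mean = 5, sd = 1),   ncol = 2)  # 10 spread-out cells
X <- rbind(dense, sparse)

knn <- 5
D <- as.matrix(dist(X))
# Position 1 of each sorted row is the cell itself (distance 0), hence knn + 1
sigma <- apply(D, 1, function(d) sort(d)[knn + 1])

c(dense = mean(sigma[1:40]), sparse = mean(sigma[41:50]))
```

Cells in the sparse group receive a larger bandwidth, so they still obtain meaningful affinities to their own neighbors instead of being left almost disconnected, as they would be under a single global bandwidth tuned to the dense group.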
Diffusion Time Selection
When `t = "auto"`, MAGIC iteratively applies the diffusion operator and monitors convergence using the Procrustes disparity between the imputed matrices of successive iterations.
The algorithm terminates when the disparity falls below a threshold (default: 0.001).
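One plausible reading of this stopping rule can be sketched in base R. The helpers `procrustes_disparity` and `auto_t` are hypothetical names, and `t_max = 20` is an assumed cap, not a documented MAGICR default. The disparity here follows the usual Procrustes recipe: center both matrices, scale them to unit Frobenius norm, and measure the residual after the best rotation and scaling, which equals one minus the squared sum of the singular values of A'B.

```r
# Procrustes disparity between two matrices of the same shape (0 = identical up to
# translation, scaling, and rotation). Illustrative helper, not MAGICR code.
procrustes_disparity <- function(A, B) {
  A <- scale(A, center = TRUE, scale = FALSE)
  B <- scale(B, center = TRUE, scale = FALSE)
  A <- A / sqrt(sum(A^2))
  B <- B / sqrt(sum(B^2))
  s <- sum(svd(crossprod(A, B))$d)     # sum of singular values of A'B
  max(0, 1 - s^2)                      # clamp tiny negative rounding errors
}

# Apply M repeatedly until successive results stop changing shape
auto_t <- function(M, X, t_max = 20, threshold = 0.001) {
  X_prev <- X
  disparity <- numeric(0)
  for (step in seq_len(t_max)) {
    X_curr <- M %*% X_prev
    disparity[step] <- procrustes_disparity(X_prev, X_curr)
    X_prev <- X_curr
    if (disparity[step] < threshold) break
  }
  list(t = step, disparity = disparity, imputed = X_prev)
}

# Toy diffusion operator built as in the Algorithm section (dense kernel, decay = 1)
set.seed(7)
X <- matrix(rnorm(50 * 5), nrow = 50)
D <- as.matrix(dist(X))
sigma <- apply(D, 1, function(d) sort(d)[6])   # 5th nearest neighbor, skipping self
K <- exp(-(D / sigma)); K <- K + t(K)
M <- K / rowSums(K)

res <- auto_t(M, X)
res$t
round(res$disparity, 4)
```

The disparity is insensitive to global rotation and scaling, so it tracks changes in the geometry of the imputed data rather than the overall shrinkage that every diffusion step produces.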
Computational Considerations
- Exact solver: Computes imputation in full gene space. Recommended for smaller datasets or when preserving all gene relationships is critical.
- Approximate solver: Operates in PCA space and projects back. Faster for large datasets but may introduce minor approximation errors.
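The trade-off between the two solvers can be illustrated with a base-R sketch. It assumes that "operates in PCA space and projects back" means diffusing the principal component scores and mapping them back through the loadings; the helper `approx_impute` is hypothetical and MAGICR's internals may differ.

```r
set.seed(3)
X <- matrix(rnorm(40 * 6), nrow = 40)            # toy data: 40 cells, 6 genes
D <- as.matrix(dist(X))
sigma <- apply(D, 1, function(d) sort(d)[6])     # 5th nearest neighbor, skipping self
K <- exp(-(D / sigma)); K <- K + t(K)
M <- K / rowSums(K)
M3 <- M %*% M %*% M                              # diffusion operator for t = 3

# "Exact": diffuse every gene directly
exact <- M3 %*% X

# "Approximate": diffuse the top npca principal component scores, then project back
approx_impute <- function(M_t, X, npca) {
  pca <- prcomp(X, center = TRUE, scale. = FALSE, rank. = npca)
  scores_smooth <- M_t %*% pca$x
  sweep(scores_smooth %*% t(pca$rotation), 2, pca$center, "+")  # restore gene means
}

err_full <- max(abs(approx_impute(M3, X, npca = 6) - exact))   # all components kept
err_low  <- max(abs(approx_impute(M3, X, npca = 2) - exact))   # truncated to 2 components
c(full = err_full, truncated = err_low)
```

With all components retained the two routes agree up to floating-point error, because the operator is linear and its rows sum to one. Truncating the components is what introduces the approximation error, in exchange for multiplying a much narrower matrix.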
Citation
If you use MAGICR in your research, please cite:
van Dijk, D., Sharma, R., Nainys, J., Yim, K., Kathail, P., Carr, A.J., Burdziak, C., Moon, K.R., Chaffer, C.L., Pattabiraman, D., Bierie, B., Mazutis, L., Wolf, G., Krishnaswamy, S., & Pe’er, D. (2018). Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. Cell, 174(3), 716-729.e27. https://doi.org/10.1016/j.cell.2018.05.061
Related Resources
- Original Python MAGIC - Krishnaswamy Lab implementation
- PHATE - Dimensionality reduction using similar diffusion principles
- graphtools - Python library for graph-based data analysis
Acknowledgments
This implementation is based on the original MAGIC algorithm developed by the Krishnaswamy Lab at Yale University. The MATLAB implementation served as the primary reference for this native R port.