Overview
iTALK is a computational framework for characterizing, comparing, and visualizing intercellular communication networks from transcriptomic data. The package integrates a curated ligand-receptor interaction database with statistical methods for identifying significant cell-cell communication events in both bulk and single-cell RNA sequencing datasets.
This enhanced version introduces automatic cross-species gene conversion, enabling seamless analysis of non-human species data (e.g., mouse) through ortholog mapping via Ensembl BioMart.
Installation
From R-Universe (Recommended)
install.packages("iTALK", repos = "https://zaoqu-liu.r-universe.dev")
From GitHub
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("Zaoqu-Liu/iTALK")
Key Features
1. Ligand-Receptor Interaction Database
iTALK incorporates a comprehensive database of 2,649 ligand-receptor pairs categorized into:
- Cytokines (inflammatory signaling)
- Growth factors (proliferation/differentiation)
- Checkpoint molecules (immune regulation)
- Other signaling molecules
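To make the idea of category-filtered ligand-receptor matching concrete, here is a minimal base R sketch. It does not use the package's database or internals: the four-row `lr_db` table, its column names, the `expressed` table, and the helper `match_pairs()` are all hypothetical stand-ins chosen for illustration.

```r
# Toy ligand-receptor table. The real iTALK database is far larger, and its
# column names may differ; everything here is an assumption for illustration.
lr_db <- data.frame(
  ligand   = c("IL6", "TNF", "TGFB1", "CD274"),
  receptor = c("IL6R", "TNFRSF1A", "TGFBR1", "PDCD1"),
  category = c("cytokine", "cytokine", "growth factor", "checkpoint"),
  stringsAsFactors = FALSE
)

# Hypothetical highly expressed genes per cell type.
expressed <- data.frame(
  gene      = c("IL6", "CD274", "IL6R", "PDCD1", "TGFBR1"),
  cell_type = c("Macrophage", "Tumor", "T_cell", "T_cell", "Fibroblast"),
  stringsAsFactors = FALSE
)

# Keep one category, then join ligands and receptors to the cell types
# that express them.
match_pairs <- function(lr_db, expressed, category) {
  db <- lr_db[lr_db$category == category, c("ligand", "receptor")]
  with_ligand <- merge(db, expressed, by.x = "ligand", by.y = "gene")
  names(with_ligand)[names(with_ligand) == "cell_type"] <- "cell_from"
  pairs <- merge(with_ligand, expressed, by.x = "receptor", by.y = "gene")
  names(pairs)[names(pairs) == "cell_type"] <- "cell_to"
  pairs[, c("ligand", "receptor", "cell_from", "cell_to")]
}

cytokine_pairs   <- match_pairs(lr_db, expressed, "cytokine")
checkpoint_pairs <- match_pairs(lr_db, expressed, "checkpoint")
print(cytokine_pairs)
```

A pair is reported only when both the ligand and the receptor appear among the expressed genes, which is why `TNF` (no expressing cell type in the toy data) drops out.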
2. Differential Expression Analysis
Supports multiple statistical methods for identifying differentially expressed genes:
| Method | Description | Use Case |
|---|---|---|
| `Wilcox` | Wilcoxon rank-sum test | General purpose, robust |
| `DESeq2` | Negative binomial model | Bulk RNA-seq |
| `edgeR` | Negative binomial with empirical Bayes | Bulk RNA-seq |
| `MAST` | Two-part hurdle model | scRNA-seq (recommended) |
| `monocle` | Census-based analysis | scRNA-seq trajectory |
| `SCDE` | Bayesian differential expression | scRNA-seq |
| `DEsingle` | Zero-inflated model | scRNA-seq with dropouts |
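As a rough illustration of what a per-gene Wilcoxon rank-sum comparison involves, the sketch below uses only base R on simulated data. It is not the package's `DEG()` implementation: the simulated matrix, the Benjamini-Hochberg adjustment, and the 0.05 cutoff are assumptions made for the example.

```r
set.seed(1)

# Simulated log-scale expression for 3 genes in 20 control and 20 treated
# cells (toy data, not from any real experiment).
n <- 20
group <- rep(c("Control", "Treatment"), each = n)
expr <- rbind(
  GeneA = c(rnorm(n, mean = 2), rnorm(n, mean = 5)),  # strongly shifted
  GeneB = rnorm(2 * n, mean = 3),                      # no true difference
  GeneC = rnorm(2 * n, mean = 1)                       # no true difference
)

# One Wilcoxon rank-sum test per gene (row).
p_values <- apply(expr, 1, function(x) {
  wilcox.test(x[group == "Treatment"], x[group == "Control"])$p.value
})

# Multiple-testing correction (Benjamini-Hochberg is an assumption here).
q_values <- p.adjust(p_values, method = "BH")

# Difference of group means on the log scale, used as an effect size.
mean_diff <- rowMeans(expr[, group == "Treatment"]) -
  rowMeans(expr[, group == "Control"])

result <- data.frame(
  gene = rownames(expr),
  mean_diff = mean_diff,
  p_value = p_values,
  q_value = q_values
)
significant <- result[result$q_value < 0.05, ]
print(significant)
```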
3. Cross-Species Analysis
Automatic species detection and gene conversion:
- Identifies species from gene naming conventions (e.g., `TGFB1` = human, `Tgfb1` = mouse)
- Maps orthologs via Ensembl BioMart (85-95% mapping rate for mouse ↔ human)
- Intelligent caching for efficient repeated analyses (~15s initial query, <1s cached)
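The naming-convention idea in the first bullet can be sketched in a few lines of base R. This is a naive heuristic written for illustration, not the logic of `detect_species()`; the function name `guess_species()` and its voting rule are hypothetical.

```r
# Guess the species from the letter case of gene symbols (naive sketch).
guess_species <- function(genes) {
  genes <- genes[grepl("[A-Za-z]", genes)]  # ignore symbols without letters
  if (length(genes) == 0) {
    return(list(species = NA_character_, confidence = NA_real_))
  }
  # Human-style symbols contain no lowercase letters, e.g. TGFB1, CD8A.
  human_like <- !grepl("[a-z]", genes)
  # Mouse-style symbols start with a capital and continue in lowercase,
  # e.g. Tgfb1, Cd8a.
  mouse_like <- grepl("^[A-Z][a-z0-9.-]*$", genes) & grepl("[a-z]", genes)
  votes <- c(Homo_sapiens = sum(human_like), Mus_musculus = sum(mouse_like))
  list(
    species = names(votes)[which.max(votes)],
    confidence = max(votes) / length(genes)  # fraction of symbols agreeing
  )
}

mouse_guess <- guess_species(c("Tgfb1", "Vegfa", "Ctnnb1"))
human_guess <- guess_species(c("TGFB1", "VEGFA", "CD8A"))
mixed_guess <- guess_species(c("TGFB1", "Vegfa", "Cd8a"))
```

A case-based rule like this cannot separate species that share a convention (mouse and rat symbols look alike), so any real detector needs more evidence than capitalization alone.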
Usage
Basic Workflow
library(iTALK)
# Load expression data (cells × genes matrix with 'cell_type' column)
data <- read.table("expression_data.txt", header = TRUE)
# Identify highly expressed genes per cell type
highly_expr <- rawParse(data, top_genes = 50, stats = "mean")
# Find ligand-receptor pairs
lr_pairs <- FindLR(
data_1 = highly_expr,
datatype = "mean count",
comm_type = "cytokine"
)
# Visualize interactions
LRPlot(lr_pairs, datatype = "mean count")
NetView(lr_pairs, col = cell_colors)
Differential Expression Workflow
# Add comparison groups to data
data$compare_group <- sample(c("Control", "Treatment"), nrow(data), replace = TRUE)
# Calculate differential expression (per cell type)
deg_results <- DEG(
data = subset(data, cell_type == "T_cell"),
method = "Wilcox",
q_cut = 0.05
)
# Find differentially expressed ligand-receptor pairs
lr_deg <- FindLR(
data_1 = deg_results,
datatype = "DEG",
comm_type = "checkpoint"
)
# Visualize with fold-change information
LRPlot(lr_deg, datatype = "DEG")
Cross-Species Analysis (Mouse Data)
# Mouse data with genes like Tgfb1, Vegfa, Cd8a
mouse_expr <- rawParse(mouse_data, top_genes = 50)
# Automatic detection and conversion
lr_pairs <- FindLR(
data_1 = mouse_expr,
datatype = "mean count",
comm_type = "growth factor"
)
# Output:
# Detected species: Mus_musculus (confidence: 95.2%)
# Converting mouse genes to human orthologs...
# Mapping complete: 847/1000 genes mapped (84.7%)
Manual Species Conversion
# Detect species
detection <- detect_species(c("Tgfb1", "Vegfa", "Ctnnb1"))
# $species: "Mus_musculus"
# $confidence: 1.0
# Convert genes
conversion <- convert_species_biomart(
genes = unique(mouse_data$gene),
from_species = "Mus_musculus",
to_species = "Homo_sapiens",
ensembl_version = 103, # Fixed version for reproducibility
cache = TRUE
)
# Access results
conversion$mapping # data.frame: from_gene, to_gene
conversion$unmapped # genes without orthologs
conversion$stats # mapping statistics
Function Reference
Core Functions
| Function | Description |
|---|---|
| `rawParse()` | Extract top expressed genes per cell type |
| `DEG()` | Differential expression analysis |
| `FindLR()` | Identify ligand-receptor pairs |
| `LRPlot()` | Circos visualization of interactions |
| `NetView()` | Network visualization of cell communication |
| `TimePlot()` | Time-series interaction dynamics |
Species Conversion Functions
| Function | Description |
|---|---|
| `detect_species()` | Auto-detect species from gene names |
| `convert_species_biomart()` | Convert genes between species |
| `convert_expression_matrix()` | Convert expression matrix |
| `convert_data_species()` | Convert iTALK data frames |
Performance
Benchmarks (typical workstation, 1000 genes):
| Operation | Time | Notes |
|---|---|---|
| Species detection | <0.1s | Pattern matching |
| BioMart query (first) | ~15s | Network dependent |
| BioMart query (cached) | <1s | Local cache |
| `FindLR()` | <5s | Database matching |
| `LRPlot()` | <3s | Rendering |
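The gap between a first and a cached BioMart query comes from skipping the network round trip. The sketch below shows one way such a cache could work, using an in-memory environment keyed per gene. It is an illustration only: `slow_lookup()` is a hypothetical stand-in that sleeps and uppercases names instead of querying BioMart, and the package's own cache may be stored and keyed differently.

```r
# In-memory cache: one entry per (from, to, gene) combination.
ortholog_cache <- new.env(parent = emptyenv())

# Hypothetical stand-in for a slow remote query. It does NOT contact BioMart;
# uppercasing is a toy "mapping", not a real ortholog conversion.
slow_lookup <- function(genes) {
  Sys.sleep(0.2)
  toupper(genes)
}

cached_lookup <- function(genes, from = "Mus_musculus", to = "Homo_sapiens") {
  genes <- unique(genes)
  keys <- paste(from, to, genes, sep = "|")
  is_new <- !vapply(keys, exists, logical(1),
                    envir = ortholog_cache, inherits = FALSE)
  if (any(is_new)) {
    # Only genes that are not cached yet trigger the slow query.
    fetched <- slow_lookup(genes[is_new])
    new_keys <- keys[is_new]
    for (i in seq_along(new_keys)) {
      assign(new_keys[i], fetched[i], envir = ortholog_cache)
    }
  }
  result <- vapply(keys, get, character(1),
                   envir = ortholog_cache, inherits = FALSE, USE.NAMES = FALSE)
  setNames(result, genes)
}

genes <- c("Tgfb1", "Vegfa", "Cd8a")
first_time  <- system.time(first_result  <- cached_lookup(genes))[["elapsed"]]
second_time <- system.time(second_result <- cached_lookup(genes))[["elapsed"]]
```

Keying per gene rather than per query means a later call with a partly overlapping gene list only pays for the genes it has not seen before.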
Mapping Rates:
- Mouse → Human: 85-95%
- Rat → Human: 80-90%
- Other mammals: 70-85%
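A mapping rate is simply the share of input genes that received an ortholog. The sketch below computes it for a toy mapping table whose column names (`from_gene`, `to_gene`) follow the `conversion$mapping` layout shown earlier. The gene list is illustrative, and `Gm12345` is a made-up placeholder for a gene with no ortholog.

```r
# Input genes and a toy mapping table (illustrative values only).
input_genes <- c("Tgfb1", "Vegfa", "Cd8a", "Gm12345", "Ctnnb1")
mapping <- data.frame(
  from_gene = c("Tgfb1", "Vegfa", "Cd8a", "Ctnnb1"),
  to_gene   = c("TGFB1", "VEGFA", "CD8A", "CTNNB1"),
  stringsAsFactors = FALSE
)

# Genes with no row in the mapping table are unmapped.
unmapped <- setdiff(input_genes, mapping$from_gene)

n_total  <- length(unique(input_genes))
n_mapped <- length(unique(mapping$from_gene))
mapping_rate <- 100 * n_mapped / n_total

summary_line <- sprintf("Mapping complete: %d/%d genes mapped (%.1f%%)",
                        n_mapped, n_total, mapping_rate)
print(summary_line)
```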
Citation
If you use iTALK in your research, please cite:
Wang Y, Wang R, Zhang S, Song S, Jiang C, Han G, Wang M, Ajani J, Futreal A, Wang L. iTALK: an R Package to Characterize and Illustrate Intercellular Communication. bioRxiv 507871; doi: https://doi.org/10.1101/507871
For cross-species analysis, please also cite:
Cunningham F, Allen JE, Allen J, et al. Ensembl 2022. Nucleic Acids Research. 2022;50(D1):D988-D995.
Authors
Current Maintainer:
Zaoqu Liu, Ph.D. (liuzaoqu@163.com)
Original Authors:
Yuanxin Wang (MD Anderson Cancer Center)
Changelog
See NEWS.md for detailed version history.
v0.1.1 (2026-01-23)
- Fixed critical loop bug in DEG() affecting multi-group comparisons
- Fixed gene-result alignment issues in all DE methods
- Added robust handling for zero-expression values in logFC calculation
- Updated deprecated functions for ggplot2/igraph compatibility
- Added tibble dependency
- Performance optimizations for vectorized operations