Overview

iTALK is a computational framework for characterizing, comparing, and visualizing intercellular communication networks from transcriptomic data. The package integrates a curated ligand-receptor interaction database with statistical methods for identifying significant cell-cell communication events in both bulk and single-cell RNA sequencing datasets.

This enhanced version introduces automatic cross-species gene conversion, enabling seamless analysis of non-human species data (e.g., mouse) through ortholog mapping via Ensembl BioMart.


Installation

install.packages("iTALK", repos = "https://zaoqu-liu.r-universe.dev")

From GitHub

if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")

remotes::install_github("Zaoqu-Liu/iTALK")

Dependencies

Core dependencies are installed automatically:

Package    Purpose
biomaRt    Ensembl ortholog queries
circlize   Circular visualization
igraph     Network analysis
ggplot2    Time-series plotting
dplyr      Data manipulation

Key Features

1. Ligand-Receptor Interaction Database

iTALK incorporates a comprehensive database of 2,649 ligand-receptor pairs categorized into:

  • Cytokines (inflammatory signaling)
  • Growth factors (proliferation/differentiation)
  • Checkpoint molecules (immune regulation)
  • Other signaling molecules
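
As a rough illustration of how a categorized ligand-receptor table can be queried, the sketch below builds a five-row toy table and filters it by category. The column names, the "other" label, and the category assigned to each toy pair are assumptions made for this example; they are not the package's actual database schema or contents.

```r
# Toy stand-in for a categorized ligand-receptor table.
# Column names and category labels are assumptions for illustration only.
lr_db <- data.frame(
  ligand   = c("TGFB1", "VEGFA", "CD274", "IL6", "WNT5A"),
  receptor = c("TGFBR1", "KDR", "PDCD1", "IL6R", "FZD5"),
  category = c("growth factor", "growth factor", "checkpoint", "cytokine", "other"),
  stringsAsFactors = FALSE
)

# Count the pairs in each category
category_counts <- table(lr_db$category)

# Keep only the pairs belonging to one communication type
checkpoint_pairs <- lr_db[lr_db$category == "checkpoint", ]

print(category_counts)
print(checkpoint_pairs)
```

In the package itself, the comm_type argument of FindLR() appears to play the role of this filter.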

2. Differential Expression Analysis

Supports multiple statistical methods for identifying differentially expressed genes:

Method     Description                              Use Case
Wilcox     Wilcoxon rank-sum test                   General purpose, robust
DESeq2     Negative binomial model                  Bulk RNA-seq
edgeR      Negative binomial with empirical Bayes   Bulk RNA-seq
MAST       Two-part hurdle model                    scRNA-seq (recommended)
monocle    Census-based analysis                    scRNA-seq trajectory
SCDE       Bayesian differential expression         scRNA-seq
DEsingle   Zero-inflated model                      scRNA-seq with dropouts
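
To make the simplest row of the table concrete, here is a minimal base-R sketch of a per-gene Wilcoxon rank-sum test followed by multiple-testing correction. It is not the package's DEG() implementation: the simulated data, the Benjamini-Hochberg adjustment, and the 0.05 cutoff are assumptions chosen for illustration.

```r
set.seed(1)
n_per_group <- 20
group <- rep(c("Control", "Treatment"), each = n_per_group)

# Simulated expression matrix (3 genes x 40 cells).
# Only geneA is shifted upward in the Treatment group.
expr <- rbind(
  geneA = c(rnorm(n_per_group, mean = 1), rnorm(n_per_group, mean = 4)),
  geneB = rnorm(2 * n_per_group, mean = 2),
  geneC = rnorm(2 * n_per_group, mean = 2)
)

# One rank-sum test per gene (row)
p_values <- apply(expr, 1, function(x) {
  wilcox.test(x[group == "Treatment"], x[group == "Control"])$p.value
})

# Benjamini-Hochberg adjustment (an assumption; other corrections exist)
q_values <- p.adjust(p_values, method = "BH")

# Genes passing the adjusted cutoff
significant <- names(q_values)[q_values < 0.05]
print(round(q_values, 4))
```

The rank-sum test makes no distributional assumption, which is why the table lists it as the general-purpose choice; the model-based methods can be more powerful when their assumptions hold.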

3. Cross-Species Analysis

Automatic species detection and gene conversion:

  • Identifies species from gene naming conventions (e.g., TGFB1 = human, Tgfb1 = mouse)
  • Maps orthologs via Ensembl BioMart (85-95% mapping rate for mouse ↔︎ human)
  • Intelligent caching for efficient repeated analyses (~15s initial query, <1s cached)
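
The naming-convention idea can be sketched with a few regular expressions. The helper below is hypothetical and much cruder than a real detector: it only distinguishes all-uppercase symbols (typical for human) from capitalized symbols with lowercase letters (typical for mouse), and reports the winning fraction as a confidence score. It is not the package's detect_species().

```r
# Hypothetical helper (not the package's detect_species()):
# guesses the species from the letter case of gene symbols.
guess_species_by_case <- function(genes) {
  genes <- genes[grepl("[A-Za-z]", genes)]
  stopifnot(length(genes) > 0)

  # Human-style symbols contain no lowercase letters, e.g. TGFB1
  human_like <- !grepl("[a-z]", genes)

  # Mouse-style symbols start with one capital letter followed only by
  # lowercase letters and digits, e.g. Tgfb1
  mouse_like <- grepl("^[A-Z][a-z0-9]+$", genes) & grepl("[a-z]", genes)

  frac <- c(Homo_sapiens = mean(human_like), Mus_musculus = mean(mouse_like))
  best <- names(frac)[which.max(frac)]
  list(species = best, confidence = unname(frac[best]))
}

mouse_guess <- guess_species_by_case(c("Tgfb1", "Vegfa", "Ctnnb1"))
human_guess <- guess_species_by_case(c("TGFB1", "VEGFA", "CD8A"))
print(mouse_guess$species)
print(human_guess$species)
```

A heuristic like this misclassifies some symbols (human C1orf-style names contain lowercase letters, for example), which is one reason to report a confidence value rather than a hard call.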

4. Visualization

  • Circos plots (LRPlot): Directional ligand-receptor interactions
  • Network graphs (NetView): Cell-cell communication topology
  • Time-series plots (TimePlot): Dynamic interaction changes
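
A network view of this kind typically starts from a table of interactions collapsed into edge weights between cell types. The sketch below shows that aggregation step in base R on a toy table; the cell_from and cell_to column names and the rule of weighting an edge by its number of ligand-receptor pairs are assumptions here, not a description of NetView() internals.

```r
# Toy interaction table; column names are assumed for illustration
lr_toy <- data.frame(
  ligand    = c("TGFB1", "VEGFA", "IL6", "CD274"),
  receptor  = c("TGFBR1", "KDR", "IL6R", "PDCD1"),
  cell_from = c("Macrophage", "Macrophage", "T_cell", "Tumor"),
  cell_to   = c("T_cell", "T_cell", "Macrophage", "T_cell"),
  stringsAsFactors = FALSE
)

# Use the same set of levels on both axes so the matrix is square
cell_types <- sort(unique(c(lr_toy$cell_from, lr_toy$cell_to)))

# Directed adjacency matrix: rows send ligands, columns receive them
edge_weights <- table(
  from = factor(lr_toy$cell_from, levels = cell_types),
  to   = factor(lr_toy$cell_to, levels = cell_types)
)
print(edge_weights)
```

The matrix is asymmetric because ligand-to-receptor signaling is directional, which is also why the circos and network plots draw arrows rather than plain links.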

Usage

Basic Workflow

library(iTALK)

# Load expression data (cells × genes matrix with 'cell_type' column)
data <- read.table("expression_data.txt", header = TRUE)

# Identify highly expressed genes per cell type
highly_expr <- rawParse(data, top_genes = 50, stats = "mean")

# Find ligand-receptor pairs
lr_pairs <- FindLR(
  data_1 = highly_expr,
  datatype = "mean count",
  comm_type = "cytokine"
)

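# Visualize interactions
LRPlot(lr_pairs, datatype = "mean count")

# Assign one named color per cell type for the network plot
cell_types <- unique(data$cell_type)
cell_colors <- structure(rainbow(length(cell_types)), names = cell_types)
NetView(lr_pairs, col = cell_colors)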
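
To show what "highly expressed genes per cell type" means in practice, the following base-R sketch ranks genes by their mean expression within each cell type on a toy table. It mirrors what top_genes and stats = "mean" suggest, but the exact ranking rule used by rawParse() is an assumption here.

```r
# Toy cells x genes table with a 'cell_type' column
toy <- data.frame(
  TGFB1 = c(5, 7, 0, 1),
  VEGFA = c(1, 1, 9, 7),
  IL6   = c(3, 3, 2, 2),
  cell_type = c("T_cell", "T_cell", "Tumor", "Tumor"),
  stringsAsFactors = FALSE
)
gene_cols <- setdiff(names(toy), "cell_type")

# Mean expression of every gene within each cell type
mean_expr <- aggregate(toy[gene_cols], by = list(cell_type = toy$cell_type), FUN = mean)

# Keep the top_n genes with the highest mean in each cell type
top_n <- 2
top_genes <- lapply(split(mean_expr, mean_expr$cell_type), function(row) {
  means <- unlist(row[gene_cols])
  names(sort(means, decreasing = TRUE))[seq_len(top_n)]
})
print(top_genes)
# T_cell: TGFB1, IL6; Tumor: VEGFA, IL6
```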

Differential Expression Workflow

# Add comparison groups to data
# (random assignment for illustration only; use the real sample labels in practice)
data$compare_group <- sample(c("Control", "Treatment"), nrow(data), replace = TRUE)

# Calculate differential expression (per cell type)
deg_results <- DEG(
  data = subset(data, cell_type == "T_cell"),
  method = "Wilcox",
  q_cut = 0.05
)

# Find differentially expressed ligand-receptor pairs
lr_deg <- FindLR(
  data_1 = deg_results,
  datatype = "DEG",
  comm_type = "checkpoint"
)

# Visualize with fold-change information
LRPlot(lr_deg, datatype = "DEG")

Cross-Species Analysis (Mouse Data)

# Mouse data with genes like Tgfb1, Vegfa, Cd8a
mouse_expr <- rawParse(mouse_data, top_genes = 50)

# Automatic detection and conversion
lr_pairs <- FindLR(
  data_1 = mouse_expr,
  datatype = "mean count",
  comm_type = "growth factor"
)
# Output:
#   Detected species: Mus_musculus (confidence: 95.2%)
#   Converting mouse genes to human orthologs...
#   Mapping complete: 847/1000 genes mapped (84.7%)

Manual Species Conversion

# Detect species
detection <- detect_species(c("Tgfb1", "Vegfa", "Ctnnb1"))
# $species: "Mus_musculus"
# $confidence: 1.0

# Convert genes
conversion <- convert_species_biomart(
  genes = unique(mouse_expr$gene),  # gene column of the rawParse() output
  from_species = "Mus_musculus",
  to_species = "Homo_sapiens",
  ensembl_version = 103,  # Fixed version for reproducibility
  cache = TRUE
)

# Access results
conversion$mapping      # data.frame: from_gene, to_gene
conversion$unmapped     # genes without orthologs
conversion$stats        # mapping statistics
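
Once a mapping table with from_gene and to_gene columns is available, applying it is a lookup. The sketch below uses a hand-written toy mapping rather than a real BioMart result, and "Gm12345" is a made-up symbol standing in for a gene without an ortholog.

```r
# Toy mapping table with the same two columns as conversion$mapping
mapping <- data.frame(
  from_gene = c("Tgfb1", "Vegfa", "Cd8a"),
  to_gene   = c("TGFB1", "VEGFA", "CD8A"),
  stringsAsFactors = FALSE
)

# "Gm12345" is a hypothetical symbol with no entry in the toy table
mouse_genes <- c("Tgfb1", "Vegfa", "Cd8a", "Gm12345")

# match() returns the first matching row, or NA when there is none
human_genes <- mapping$to_gene[match(mouse_genes, mapping$from_gene)]

unmapped     <- mouse_genes[is.na(human_genes)]
mapping_rate <- mean(!is.na(human_genes))

print(human_genes)
print(unmapped)
print(mapping_rate)  # 0.75 for this toy input
```

Because match() keeps only the first hit, one-to-many orthologs would need an explicit rule (for example, keeping all rows with merge()); how the package resolves them is not stated here.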

Function Reference

Core Functions

Function     Description
rawParse()   Extract top expressed genes per cell type
DEG()        Differential expression analysis
FindLR()     Identify ligand-receptor pairs
LRPlot()     Circos visualization of interactions
NetView()    Network visualization of cell communication
TimePlot()   Time-series interaction dynamics

Species Conversion Functions

Function                      Description
detect_species()              Auto-detect species from gene names
convert_species_biomart()     Convert genes between species
convert_expression_matrix()   Convert expression matrix
convert_data_species()        Convert iTALK data frames

Performance

Benchmarks (typical workstation, 1000 genes):

Operation                Time    Notes
Species detection        <0.1s   Pattern matching
BioMart query (first)    ~15s    Network dependent
BioMart query (cached)   <1s     Local cache
FindLR()                 <5s     Database matching
LRPlot()                 <3s     Rendering
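
The gap between the first and the cached query comes from skipping the network round trip. A minimal in-memory version of that pattern is sketched below; the function names are hypothetical, the "slow" lookup is a placeholder that just upper-cases its input, and the package's actual cache (for example, whether it persists to disk) is not described here.

```r
# Hypothetical in-memory cache keyed by the requested gene set
ortholog_cache <- new.env()
query_stats <- new.env()
query_stats$calls <- 0

# Stand-in for a slow remote query; toupper() is only a placeholder
# result, not a real ortholog mapping
slow_lookup <- function(genes) {
  query_stats$calls <- query_stats$calls + 1
  toupper(genes)
}

cached_lookup <- function(genes) {
  key <- paste(genes, collapse = ",")
  if (!exists(key, envir = ortholog_cache, inherits = FALSE)) {
    # Cache miss: run the slow query once and store the result
    assign(key, slow_lookup(genes), envir = ortholog_cache)
  }
  get(key, envir = ortholog_cache, inherits = FALSE)
}

first  <- cached_lookup(c("Tgfb1", "Vegfa"))  # triggers the slow query
second <- cached_lookup(c("Tgfb1", "Vegfa"))  # served from the cache
print(query_stats$calls)  # 1
```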

Mapping Rates:

  • Mouse → Human: 85-95%
  • Rat → Human: 80-90%
  • Other mammals: 70-85%


Citation

If you use iTALK in your research, please cite:

Wang Y, Wang R, Zhang S, Song S, Jiang C, Han G, Wang M, Ajani J, Futreal A, Wang L. iTALK: an R Package to Characterize and Illustrate Intercellular Communication. bioRxiv 507871; doi: https://doi.org/10.1101/507871

For cross-species analysis, please also cite:

Cunningham F, Allen JE, Allen J, et al. Ensembl 2022. Nucleic Acids Research. 2022;50(D1):D988-D995.


Authors

Current Maintainer:
Zaoqu Liu, Ph.D.

Original Authors:
Yuanxin Wang (MD Anderson Cancer Center)


License

This package is licensed under CC BY-NC-SA 4.0.


Changelog

See NEWS.md for detailed version history.

v0.1.1 (2026-01-23)

  • Fixed critical loop bug in DEG() affecting multi-group comparisons
  • Fixed gene-result alignment issues in all DE methods
  • Added robust handling for zero-expression values in logFC calculation
  • Updated deprecated functions for ggplot2/igraph compatibility
  • Added tibble dependency
  • Performance optimizations for vectorized operations