This function takes the data as a data frame and the method as a string. It assumes that each row contains the gene expression profile of a single cell, and each column contains the expression of a single gene across cells. The data frame should also contain the cell type information in a column named 'cell_type', as well as group information in a column named 'compare_group'. Batch information in a column named 'batch' is optional. If included, users may want to use the raw count data for later analysis. Differentially expressed genes are called within each cell type by the method the user selects. For bulk RNA-seq, we provide edgeR and DESeq2. For scRNA-seq, popular methods from the packages scde, monocle, DEsingle and MAST are available.
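As a rough illustration of the input layout described above, the following sketch builds a toy data frame in base R. The gene names, cell type labels, group labels and counts are invented for the example and do not come from the package.

```r
# Toy input in the layout described above: rows are cells, columns are genes,
# plus the required 'cell_type' and 'compare_group' columns.
# All names and values here are made up for illustration.
set.seed(1)
n_cells <- 8
toy_data <- data.frame(
  GeneA = rpois(n_cells, lambda = 5),
  GeneB = rpois(n_cells, lambda = 2),
  GeneC = rpois(n_cells, lambda = 10),
  cell_type = rep(c("T_cell", "B_cell"), each = 4),
  compare_group = rep(c("control", "treated"), times = 4),
  stringsAsFactors = FALSE
)
# One row name per cell
rownames(toy_data) <- paste0("cell", seq_len(n_cells))
str(toy_data)
```

A data frame shaped like this would be passed as the data argument. An optional 'batch' column could be added in the same way.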
Usage
DEG(
data,
method,
min_gene_expressed = 0,
min_valid_cells = 0,
contrast = NULL,
q_cut = 0.05,
add = TRUE,
top = 50,
stats = "mean",
...
)
Arguments
- data
Input raw or normalized count data with columns 'cell_type' and 'compare_group'
- method
Method used to call differentially expressed genes. Available options are:
Wilcox: Wilcoxon rank sum test
DESeq2: Negative binomial model based differential analysis (Love et al, Genome Biology, 2014)
SCDE: Bayesian approach to single-cell differential expression analysis (Kharchenko et al, Nature Methods, 2014)
monocle: Census based differential analysis (Qiu et al, Nature Methods, 2017)
edgeR: Negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests based differential analysis (McCarthy et al, Nucleic Acids Research, 2012)
DESingle: Zero-Inflated Negative Binomial model to estimate the proportion of real and dropout zeros and to define and detect the 3 types of DE genes (Miao et al, Bioinformatics, 2018)
MAST: GLM framework that treats cellular detection rate as a covariate (Finak et al, Genome Biology, 2015)
- min_gene_expressed
Minimum number of cells in which a gene must be expressed
- min_valid_cells
Minimum number of genes that must be detected in a cell
- contrast
String vector specifying the contrast to be tested against the log2-fold-change threshold
- q_cut
Cut-off for q value
- add
Whether to add genes that are not differentially expressed but are highly expressed, for finding the significant pairs later
- top
Same as in function rawParse
- stats
Same as in function rawParse
- ...
Additional arguments passed to the specific differential expression test function
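To make the roles of min_gene_expressed and q_cut more concrete, here is a minimal base-R sketch of a per-cell-type Wilcoxon rank sum test. It is not the package's implementation. The helper name sketch_wilcox_deg, the toy data, the "count greater than zero" definition of an expressed gene, and the Benjamini-Hochberg adjustment used for q values are all assumptions made for illustration.

```r
# Illustrative only: a base-R approximation of a per-cell-type Wilcoxon test.
# NOT the package's code; the helper name, the "count > 0" expression rule
# and the BH adjustment are assumptions.
set.seed(42)
n_cells <- 40
toy_data <- data.frame(
  GeneA = c(rpois(20, 2), rpois(20, 20)),  # simulated shift between groups
  GeneB = rpois(n_cells, 5),               # simulated with no group effect
  cell_type = "T_cell",
  compare_group = rep(c("control", "treated"), each = 20),
  stringsAsFactors = FALSE
)

sketch_wilcox_deg <- function(data, min_gene_expressed = 0, q_cut = 0.05) {
  meta_cols <- c("cell_type", "compare_group", "batch")
  genes <- setdiff(colnames(data), meta_cols)
  # Test each cell type separately
  results <- lapply(split(data, data$cell_type), function(sub) {
    # Keep genes with a non-zero count in at least min_gene_expressed cells
    n_expr <- colSums(sub[, genes, drop = FALSE] > 0)
    kept <- genes[n_expr >= min_gene_expressed]
    groups <- unique(sub$compare_group)
    stopifnot(length(groups) == 2)
    # Two-sided Wilcoxon rank sum test per gene (tie warnings suppressed)
    p <- vapply(kept, function(g) {
      x <- sub[[g]][sub$compare_group == groups[1]]
      y <- sub[[g]][sub$compare_group == groups[2]]
      suppressWarnings(wilcox.test(x, y)$p.value)
    }, numeric(1))
    out <- data.frame(
      gene = kept,
      cell_type = rep(sub$cell_type[1], length(kept)),
      p.value = unname(p),
      q.value = unname(p.adjust(p, method = "BH")),
      stringsAsFactors = FALSE
    )
    # Report only genes passing the q-value cut-off
    out[out$q.value < q_cut, , drop = FALSE]
  })
  do.call(rbind, results)
}

deg <- sketch_wilcox_deg(toy_data, min_gene_expressed = 3, q_cut = 0.05)
print(deg)
```

The sketch handles only a two-group comparison and ignores the contrast, add, top and stats arguments. It is meant only to show how filtering, testing and the q-value cut-off relate to each other.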