`p.adjust()` function

Let’s start by looking at the help of the p.adjust function

?p.adjust

Singhania `voom-limma` analysis

Now (re)compute the raw p-values from the empircial Bayes moderated t-test performed by the voom-limmaanalysis pipeline on the RNA-seq data on Tuberculosis infection from the Berry cohort in London analyzed in Singhania et al..

library(dplyr)
library(edgeR)
load("SinghaniaTB_data.Rdata")
# removing genes never observed
gene_no0 <- rowSums(dplyr::select(expr_singhania_raw, -dplyr::starts_with("Gene"))) > 
    0
expr_singhania_raw_no0 <- dplyr::filter(expr_singhania_raw, gene_no0)
# filter only genes with enough cpm
mycpmfun <- function(x) {
    return((x/sum(x)) * 10^6)
}
cpm <- apply(dplyr::select(expr_singhania_raw_no0, -starts_with("Gene")), MARGIN = 2, 
    FUN = mycpmfun)
expr_singhania_raw_no0_cpm2p <- dplyr::filter(expr_singhania_raw_no0, rowSums(cpm > 
    2) > 5)
# filter only protein coding genes
gene_pc <- expr_singhania_raw_no0_cpm2p$Gene_biotype == "protein_coding"
expr_singhania_raw_no0_cpm2p_prot <- filter(expr_singhania_raw_no0_cpm2p, gene_pc)
# normalize the data
singhaniaDGE <- DGEList(counts = select(expr_singhania_raw_no0_cpm2p_prot, -starts_with("Gene")), 
    samples = infos_singhania, genes = select(expr_singhania_raw_no0_cpm2p_prot, 
        starts_with("Gene")))
singhaniaDGE_normfact <- edgeR::calcNormFactors(singhaniaDGE, method = "TMM")
# select only the london samples
singhaniaLondonDGE_normfact <- singhaniaDGE_normfact[, singhaniaDGE_normfact$samples$Location == 
    "London"]
# build the desired design matrix
design <- as.data.frame(model.matrix(~Status, data = infos_singhania[infos_singhania$Location == 
    "London", ]))
design <- dplyr::mutate(design, ActiveTB = 1 - (StatusLTBI + StatusControl))
design <- dplyr::select(design, -starts_with("Status"))

library(limma)
# compute the precision weights necessary to account for heteroskedasticity
v <- voom(singhaniaLondonDGE_normfact, design, plot = FALSE)
# fit the gaussian linear model with precision weights v
fit <- lmFit(v, design)
# perform an Empirical Bayes modified t-test
fit <- eBayes(fit, robust = TRUE)

Multiple testing adjustment

Retreive the raw p-values from the output of the limma::eBayes() function, and sort them in ascending order using the sort() function. Store them in a data frame alongside their rank.
Using the various multiple testing correction methods from the class with the p.adjust() function, compare the number of significant genes.
Display the different thresholds on a graph with all the pvalues on log-scale.
Comment on the differences. Which correction would you chose and why ?

TU OMICS: Multiple testing

Practicals

Boris Hejblum

October 15^th, 2019

`p.adjust()` function

Singhania `voom-limma` analysis

Multiple testing adjustment

TU OMICS: Multiple testing

Practicals

Boris Hejblum

October 15th, 2019

p.adjust() function

Singhania voom-limma analysis

Multiple testing adjustment

October 15^th, 2019

`p.adjust()` function

Singhania `voom-limma` analysis