In this document you will find a short example of the analysis using cox robust extimation in MOSClip R package.

We used some of the MOSClip functions to identify modules (module analysis) describing the pathway associated with Progression Free Survival (PFS) in Ovarian Cancer.

This tutorial implies that you have already read the main tutorial and moves quite fast to the point.

Prepare the multi-omic dataset and pathways

First of all, we load the necessary libraries.

library(org.Hs.eg.db)
library(pheatmap)
library(RColorBrewer)
library(EDASeq)
library(graphite)
library(houseOfClipUtility)
library(MOSClip)

You should already have the TCGA dataset. Otherwise download it from example datasets.

The provided dataset includes matrices, genes per patients, of methylation status, somatic mutations, CNVs, and transcript expression levels of the TCGA ovarian cancer samples.

Move the downloaded file inside a directory called downloadTCGA in your working directory.

Now, we can load the preprocessed data.

load("downloadTCGA/TCGA-OV-hg38-preprocessed-externalSurv.RData")

We create a directory to store the data results, checking if the directory already exists.

dirname = "MOSresults"
if (!file.exists(dirname)) {
    dir.create(dirname)
}

We move pretty fast to prepare ‘fup’ (short of followup) from the table by Liu et al, Cell, 2018 and all the matrices that we need to perform the analysis (for more detail please refer to the main tutorial).

survAnnotations <- fup$pfs

expression <- rnaSeq
row.names(expression) <- paste0("ENTREZID:", row.names(expression))
mutation <- mutations$data
row.names(mutation) <- paste0("ENTREZID:", row.names(mutation))
names(metClustValues$eMap) <- paste0("ENTREZID:", row.names(metClustValues$eMap))
row.names(cnv) <- paste0("ENTREZID:", row.names(cnv))

Now, we are about to intersect different datasets and annotations to select those patient samples having all the four omics profiled.

survAnnot <- na.omit(survAnnotations)
patients <- row.names(survAnnot)
patients <- intersect(patients, colnames(expression))
patients <- intersect(patients, colnames(metClustValues$MET_Cancer_Clustered))
patients <- intersect(patients, colnames(mutation))
patients <- intersect(patients, colnames(cnv))

survAnnot <- survAnnot[patients, , drop = F]

if (!file.exists("survAnnot.RData")) {
    file = paste0(dirname, "/survAnnot-", as.character(Sys.Date()), ".RData")
    link = "survAnnot.RData"
    save(survAnnot, file = file)
    file.symlink(file, link)
}

expression <- expression[, patients, drop = F]
keep = apply(expression >= 100, 1, any)
expNorm <- betweenLaneNormalization(expression[keep, , drop = F], which = "upper")
pseudoExpNorm <- log2(expNorm + 1)

methylation <- metClustValues
methylation$MET_Cancer_Clustered <- methylation$MET_Cancer_Clustered[, patients, 
    drop = F]

mutation <- mutation[, patients, drop = F]
cnv <- cnv[, patients, drop = F]

After preparing and filtering the matrices we prepare the pathway knowledge-base from graphite R package. As for the main tutorial we choose reactome and we transform the data for MOSClip.

if (file.exists("reactome-entrez.RData")) {
    load("reactome-entrez.RData")
} else {
    reactome <- pathways("hsapiens", "reactome")
    reactome <- convertIdentifiers(reactome, "entrez")
    file = paste0(dirname, "/reactome-entrez-", as.character(Sys.Date()), ".RData")
    link = "reactome-entrez.RData"
    save(reactome, file = file)
    file.symlink(file, link)
}

We selected the pathway dimension.

nodesLength <- sapply(reactome, function(g) {
    length(intersect(graphite::nodes(g), row.names(pseudoExpNorm)))
})
reactSmall <- reactome[nodesLength >= 20 & nodesLength <= 100]

Prepare the input data: the Omics object

At this step we are almost ready to run MOSClip analysis and we create the Omics object to wrap together expression, methylation, mutation and CNV matrices.

For each matrix we need to indicate also the data reduction strategy we want to apply.

We perform the same choices described in the main tutorial.

multiOmics <- Omics(data = list(expr = pseudoExpNorm, met = methylation$MET_Cancer_Clustered, 
    mut = mutation, cnv = cnv), methods = c("summarizeWithPca", "summarizeInCluster", 
    "summarizeToNumberOfEvents", "summarizeToNumberOfDirectionalEvents"), specificArgs = list(pcaArgs = list(name = "exp", 
    shrink = "FALSE", method = "sparse", maxPCs = 3), clusterArgs = list(name = "met", 
    dict = methylation$eMap, max_cluster_number = 3), countEvent = list(name = "mut", 
    min_prop = 0.05), cnvAgv = list(name = "cnv", min_prop = 0.05)))

MOSClip module analysis

We are going to perform the multi-omics survival analysis on pathway modules.

The analysis is quite long. For demonstration purpose we analyze the first 20 pathways.

genesToConsider <- row.names(pseudoExpNorm)

if (file.exists("multiOmicsReactome_robust.RData")) {
    load("multiOmicsReactome_robust.RData")
} else {
    multiOmicsReactome <- lapply(reactSmall[1:20], function(g) {
        # print(g@title) # uncomment this to see the progression along pathways
        set.seed(1234)
        fcl = multiOmicsSurvivalModuleTest(multiOmics, g, survAnnot, useThisGenes = genesToConsider, 
            robust = TRUE)
        fcl
    })
    file = paste0(dirname, "/multiOmicsReactome_robust-", as.character(Sys.Date()), 
        ".RData")
    link = "multiOmicsReactome_robust.RData"
    save(multiOmicsReactome, file = file)
    file.symlink(file, link)
}

The analysis is done now. We can perform the module summary now.

Module summary

Using the function multiPathwayModuleReport() we can plot the tabular summary of the top 6 modules selected by p-value of the Cox proportional hazard model.

moduleSummary <- multiPathwayModuleReport(multiOmicsReactome, priority_to = c("exp", 
    "met", "mut", "cnv"))
head(moduleSummary)
Top 6 modules by overall pvalue
pvalue expPC1 expPC2 expPC3 met2k2 met3k2 met3k3 mut cnvNEG cnvPOS
Activation of BH3-only proteins 0.06 0.2 0.41 0.03 0.04 NA NA NA 0.36 0.19
Opioid Signalling 0.06 0.39 NA NA NA 0.21 0 0.46 NA 0.79
G-protein mediated events 0.06 0.39 NA NA NA 0.21 0 0.46 NA 0.79
PLC beta mediated events 0.06 0.39 NA NA NA 0.21 0 0.46 NA 0.79
Intrinsic Pathway for Apoptosis 0.07 0.03 0.25 NA 0.23 NA NA NA 0.71 0.38
Intrinsic Pathway for Apoptosis 0.07 0.29 0.42 0.03 0.01 NA NA 0.72 0.33 0.36