Add msig as enrich db option #62

thomas-keller · 2025-01-10T14:55:55Z

Hi,
here is my initial idea for letting msig (or in principle other databases) be the database used in shake functions, and later downstream in ggs_graph, etc.

To avoid breaking changes, I've made the following changes

enhance_table checks if de object is deseq. If not, don't have a contrast in title
shake_gsvaResult() adds the msig full description as a column (gs_fulldesc)
In ggs_graph, use gs_fulldesc if present instead of pulling from GOTERM
in ggs_graph, if gs_fulldesc present, change link in tool tip to link to msig.

What are your thoughts? My focus was on the enhance_table and ggs_graph for my immediate research needs, but if other functions don't rely on goterm descriptions I think they will work as normal.

I'll work on an example limma/gsva prep on the macrophage data if this seems like a worthwhile addition.

thomas-keller · 2025-01-20T23:03:20Z

Hi @federicomarini , could you let me know when/if you have a chance to review the initial adjustments to the mentioned functions?

federicomarini · 2025-01-21T13:38:47Z

Hi there @thomas-keller - sorry, I did not have time yet to dig through this in detail, been a busy week.
I will give it a spin in the next days!
Thanks for prompting the idea - and the implementation 😉

thomas-keller · 2025-01-21T20:26:56Z

Oh yeah, absolutely no rush @federicomarini . I was just circling back after making an example with the macrophage data. I have never really messed with CI before, so I don't really know how to debug/fix stuff for the different builds.

I realized when making the example code below that I didn't roxygenise to update the exports/etc. However, I did run roxygenise on my test branch, but it didn't update anything except deprecated.Rd. If you have any advice or resources I can read up on, let me know. Helping out with libraries is pretty new to me.

Anyway, here is the example code that runs on the test branch (on my windows machine):

library(msigdbr)
library(GSVA)
library(limma)
library("macrophage")
library("DESeq2")
library(dplyr)
library(GeneTonic)
library(org.Hs.eg.db)
library(plotly)
library(visNetwork)

data("gse", package = "macrophage")

m_df<- msigdbr::msigdbr(species='Homo sapiens',category='H')

gse2<-gse
rownames(gse2)<-substr(rownames(gse2), 1, 15)
gse2=gse2[!duplicated(rownames(gse2)),]
ge <- AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db, keys = rownames(gse2),column = c('SYMBOL'),keytype = 'ENSEMBL')

gse2=gse2[!is.na(ge),]
ge=ge[!is.na(ge)]
gse2=gse2[!duplicated(ge),]
ens=rownames(gse2)
ens=ens[!duplicated(ge)]
ge=ge[!duplicated(ge)]

ge=ge[!duplicated(ens)]
gse2=gse2[!duplicated(ens),]
ens=ens[!duplicated(ens)]
rownames(gse2)=as.character(ge)
keep <- rowSums(assays(gse2)[[1]] >= 10) >= 6
gse2 <- gse2[keep, ]
ge=ge[keep]
ens=ens[keep]

#gse2 has rownames as symbols to match pathway info
dds_macrophage <- DESeqDataSet(gse2, design = ~line + condition)
vsd<-vst(dds_macrophage,blind=FALSE)

#same vst data with ensembl ids instead for limma/limma2df
vsde=vsd
rownames(vsde)=ens

pwys=split(as.character(m_df$gene_symbol), m_df$gs_name)
#m_t2gc2 <- msigdbr::msigdbr(species = "Homo sapiens", category = "C2")
pmaph=m_df %>% distinct(gs_name,.keep_all=T)
pmaph=pmaph[,c(3,15)]
colnames(pmaph)=c("pathway","description")

#gsva setup and call
gd=assay(vsd)
gp1=gsvaParam(gd,pwys,minSize=5,maxSize=500,kcdf='Gaussian')
gr1=gsva(gp1,verbose=FALSE)

#generic limma call
mod<-model.matrix(~vsd$line+vsd$condition)
gset=geneSets(gr1)

lfit <- lmFit(gr1, mod)
lfit <- eBayes(lfit)
lres <- decideTests(lfit, p.value=0.05)

#pull out the Infg vs naive contrast (coefficient 7)
tt <- topTable(lfit, coef=7, n=Inf)
depw <- tt[tt$adj.P.Val <= 0.05,]

#use limma for gene DE as example
lfitg <-lmFit(assay(vsde),mod)
lfitg <- eBayes(lfitg,trend=TRUE)
lgres <- decideTests(lfitg,p.value=0.05)
ttg <- topTable(lfitg,coef=7,n=Inf)

anno_dfm <- data.frame(
gene_id = rownames(vsde),
gene_name = as.character(mapIds(org.Hs.eg.db,
keys = rownames(vsde),
column = "SYMBOL",
keytype = "ENSEMBL"
)),
stringsAsFactors = FALSE,
row.names=rownames(vsde)
)

#new functions not exported...
#didn't roxygenise so not surprising
#however, did try running roxygenise on the msig branch, and namespace didn't update
#???

res_de=GeneTonic:::limma2df(ttg)
res_enrich=GeneTonic:::shake_gsvaResult(depw,res_de,m_df,gset,anno_dfm

em=enhance_table(res_enrich,res_de,anno_dfm,chars_limit=40)
ggplotly(em)
plot(em)

ggs=ggs_graph(res_enrich,res_de,anno_dfm)

ggs %>%
visIgraph() %>%
visOptions(highlightNearest = list(enabled = TRUE,
degree = 1,
hover = TRUE),
nodesIdSelection = TRUE)

thomas-keller added 2 commits January 9, 2025 20:23

add limma2df and shake_gsvaResult functions

d76e3a5

initial changes to allow for msig as enrich db

1c716e6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add msig as enrich db option #62

Add msig as enrich db option #62

thomas-keller commented Jan 10, 2025

thomas-keller commented Jan 20, 2025

federicomarini commented Jan 21, 2025

thomas-keller commented Jan 21, 2025 •

edited

Loading

Add msig as enrich db option #62

Are you sure you want to change the base?

Add msig as enrich db option #62

Conversation

thomas-keller commented Jan 10, 2025

thomas-keller commented Jan 20, 2025

federicomarini commented Jan 21, 2025

thomas-keller commented Jan 21, 2025 • edited Loading

thomas-keller commented Jan 21, 2025 •

edited

Loading