Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add msig as enrich db option #62

Open
wants to merge 2 commits into
base: devel
Choose a base branch
from

Conversation

thomas-keller
Copy link

Hi,
here is my initial idea for letting msig (or in principle other databases) be the database used in shake functions, and later downstream in ggs_graph, etc.

To avoid breaking changes, I've made the following changes

  1. enhance_table checks if de object is deseq. If not, don't have a contrast in title
  2. shake_gsvaResult() adds the msig full description as a column (gs_fulldesc)
  3. In ggs_graph, use gs_fulldesc if present instead of pulling from GOTERM
  4. in ggs_graph, if gs_fulldesc present, change link in tool tip to link to msig.

What are your thoughts? My focus was on the enhance_table and ggs_graph for my immediate research needs, but if other functions don't rely on goterm descriptions I think they will work as normal.

I'll work on an example limma/gsva prep on the macrophage data if this seems like a worthwhile addition.

@thomas-keller
Copy link
Author

Hi @federicomarini , could you let me know when/if you have a chance to review the initial adjustments to the mentioned functions?

@federicomarini
Copy link
Owner

Hi there @thomas-keller - sorry, I did not have time yet to dig through this in detail, been a busy week.
I will give it a spin in the next days!
Thanks for prompting the idea - and the implementation 😉

@thomas-keller
Copy link
Author

thomas-keller commented Jan 21, 2025

Oh yeah, absolutely no rush @federicomarini . I was just circling back after making an example with the macrophage data. I have never really messed with CI before, so I don't really know how to debug/fix stuff for the different builds.

I realized when making the example code below that I didn't roxygenise to update the exports/etc. However, I did run roxygenise on my test branch, but it didn't update anything except deprecated.Rd. If you have any advice or resources I can read up on, let me know. Helping out with libraries is pretty new to me.

Anyway, here is the example code that runs on the test branch (on my windows machine):

library(msigdbr)
library(GSVA)
library(limma)
library("macrophage")
library("DESeq2")
library(dplyr)
library(GeneTonic)
library(org.Hs.eg.db)
library(plotly)
library(visNetwork)

data("gse", package = "macrophage")

m_df<- msigdbr::msigdbr(species='Homo sapiens',category='H')

gse2<-gse
rownames(gse2)<-substr(rownames(gse2), 1, 15)
gse2=gse2[!duplicated(rownames(gse2)),]
ge <- AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db, keys = rownames(gse2),column = c('SYMBOL'),keytype = 'ENSEMBL')

gse2=gse2[!is.na(ge),]
ge=ge[!is.na(ge)]
gse2=gse2[!duplicated(ge),]
ens=rownames(gse2)
ens=ens[!duplicated(ge)]
ge=ge[!duplicated(ge)]

ge=ge[!duplicated(ens)]
gse2=gse2[!duplicated(ens),]
ens=ens[!duplicated(ens)]
rownames(gse2)=as.character(ge)
keep <- rowSums(assays(gse2)[[1]] >= 10) >= 6
gse2 <- gse2[keep, ]
ge=ge[keep]
ens=ens[keep]

#gse2 has rownames as symbols to match pathway info
dds_macrophage <- DESeqDataSet(gse2, design = ~line + condition)
vsd<-vst(dds_macrophage,blind=FALSE)

#same vst data with ensembl ids instead for limma/limma2df
vsde=vsd
rownames(vsde)=ens

pwys=split(as.character(m_df$gene_symbol), m_df$gs_name)
#m_t2gc2 <- msigdbr::msigdbr(species = "Homo sapiens", category = "C2")
pmaph=m_df %>% distinct(gs_name,.keep_all=T)
pmaph=pmaph[,c(3,15)]
colnames(pmaph)=c("pathway","description")

#gsva setup and call
gd=assay(vsd)
gp1=gsvaParam(gd,pwys,minSize=5,maxSize=500,kcdf='Gaussian')
gr1=gsva(gp1,verbose=FALSE)

#generic limma call
mod<-model.matrix(~vsd$line+vsd$condition)
gset=geneSets(gr1)

lfit <- lmFit(gr1, mod)
lfit <- eBayes(lfit)
lres <- decideTests(lfit, p.value=0.05)

#pull out the Infg vs naive contrast (coefficient 7)
tt <- topTable(lfit, coef=7, n=Inf)
depw <- tt[tt$adj.P.Val <= 0.05,]

#use limma for gene DE as example
lfitg <-lmFit(assay(vsde),mod)
lfitg <- eBayes(lfitg,trend=TRUE)
lgres <- decideTests(lfitg,p.value=0.05)
ttg <- topTable(lfitg,coef=7,n=Inf)

anno_dfm <- data.frame(
gene_id = rownames(vsde),
gene_name = as.character(mapIds(org.Hs.eg.db,
keys = rownames(vsde),
column = "SYMBOL",
keytype = "ENSEMBL"
)),
stringsAsFactors = FALSE,
row.names=rownames(vsde)
)

#new functions not exported...
#didn't roxygenise so not surprising
#however, did try running roxygenise on the msig branch, and namespace didn't update
#???

res_de=GeneTonic:::limma2df(ttg)
res_enrich=GeneTonic:::shake_gsvaResult(depw,res_de,m_df,gset,anno_dfm

em=enhance_table(res_enrich,res_de,anno_dfm,chars_limit=40)
ggplotly(em)
plot(em)

ggs=ggs_graph(res_enrich,res_de,anno_dfm)

ggs %>%
visIgraph() %>%
visOptions(highlightNearest = list(enabled = TRUE,
degree = 1,
hover = TRUE),
nodesIdSelection = TRUE)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants