pymCADRE enables the reconstruction of tissue-specific metabolic models in Python using transcriptomic data and information of the network topology.

draeger-lab/pymCADRE

pymCADRE

License: LGPL version 3

Authors: Nantia Leonidou

Publication

When using pymCADRE in a research work, please cite the following work:

Leonidou, N., Renz, A., Mostolizadeh, R., & Dräger, A. (2023). New workflow predicts drug targets against SARS-CoV-2 via metabolic changes in infected cells. PLOS Computational Biology, 19(3), e1010903. DOI

Overview

The pymCADRE tool is an advanced re-implementation of the metabolic Context-specificity Assessed by Deterministic Reaction Evaluation (mCADRE) algorithm in Python. It constructs tissue-specific metabolic models by leveraging gene expression data and literature-based evidence, along with network topology information.

The reactions in the generic global model are ranked, and those with the weakest supporting evidence for the tissue of interest are given the highest priority for removal:

GM, C, NC, P, Z, model_C = rank_reactions(model, G, U, confidence_scores, C_H_genes, method)
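As a rough illustration of the ranking idea only (this is not pymCADRE's implementation of rank_reactions), the sketch below splits reactions into core and non-core by an evidence score and orders the non-core ones so that the weakest evidence comes first. The function name order_for_removal, the score values, and the threshold of 0.9 are assumptions made for this example.

```python
def order_for_removal(evidence, core_threshold=0.9):
    """Split reactions into core / non-core and sort non-core by ascending evidence.

    `evidence` maps a reaction ID to a score in [0, 1]; both the scores and
    the threshold are hypothetical values chosen for illustration.
    """
    core = sorted(r for r, s in evidence.items() if s >= core_threshold)
    non_core = [r for r, s in evidence.items() if s < core_threshold]
    # Weakest evidence first; ties are broken by reaction ID for determinism.
    removal_order = sorted(non_core, key=lambda r: (evidence[r], r))
    return core, removal_order


evidence = {"R1": 0.95, "R2": 0.10, "R3": 0.40, "R4": 0.0}
core, removal_order = order_for_removal(evidence)
print(core)           # ['R1']
print(removal_order)  # ['R4', 'R2', 'R3']
```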

If the generic functionality test is passed, the model undergoes pruning, which results in a context-specific reconstruction:

PM, cRes = prune_model(GM, P, C, Z, eta, precursorMets, salvage_check, C_H_genes, method)
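The pruning step can be pictured as a greedy loop: try removing each candidate reaction in ranked order and keep the removal only if the model still passes a functionality check. The toy sketch below assumes a drastically simplified model (a dictionary from reaction ID to the set of metabolites it produces) and a hypothetical check that all required metabolites are still produced. It does not model eta, the salvage check, or any flux-based consistency test, and it is not the code behind prune_model.

```python
def prune(reactions, removal_order, core, required_mets):
    """Try removing reactions in order; keep a removal only if the check passes."""
    model = dict(reactions)  # reaction ID -> set of metabolites it produces

    def is_functional(m):
        # Toy check: every required metabolite is produced by some reaction.
        produced = set().union(*m.values()) if m else set()
        return required_mets <= produced

    for rxn in removal_order:
        if rxn in core:
            continue  # in this toy version, core reactions are never removed
        trial = {r: mets for r, mets in model.items() if r != rxn}
        if is_functional(trial):
            model = trial  # removal accepted
    return model


reactions = {"R1": {"atp"}, "R2": {"atp"}, "R3": {"nadh"}, "R4": {"h2o"}}
pruned = prune(reactions, ["R4", "R2", "R3"], core={"R1"},
               required_mets={"atp", "nadh"})
print(sorted(pruned))  # ['R1', 'R3']
```

R3 survives because removing it would leave no producer of nadh, while R2 is redundant with the core reaction R1.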

Installation

pip install pymcadre

Import module and sub-modules

import pymCADRE
# sub-module example
from pymCADRE.rank import *

Prerequisites

This tool has the following dependencies:

python >=3.8.5

Packages:

  • pandas
  • numpy
  • cobra
  • requests
  • os (part of the Python standard library; no separate installation needed)
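A quick way to check these prerequisites before running the tool is sketched below. It uses only the standard library; the helper name missing_packages is an assumption for this example, and the check only tests whether each package can be found, not which version is installed.

```python
import importlib.util
import sys

# Third-party packages listed above (os ships with Python itself).
REQUIRED = ["pandas", "numpy", "cobra", "requests"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


python_ok = sys.version_info >= (3, 8, 5)
missing = missing_packages(REQUIRED)
print("Python version OK:", python_ok)
print("Missing packages:", missing or "none")
```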

Input data

  • model: COBRA model structure for the metabolic model of interest
  • precursorMets: list of precursor (key) metabolites, provided as a .txt file
  • confidence_scores: literature/experimental-based confidence assigned to reactions in model
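The layout of the precursor metabolite file is not described here. The sketch below assumes one metabolite ID per line and ignores blank lines; the helper name read_precursor_mets and the example IDs are hypothetical.

```python
import tempfile
from pathlib import Path


def read_precursor_mets(path):
    """Read metabolite IDs from a text file, assuming one ID per line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines if ln.strip()]


# Demonstration with a temporary file standing in for the real input.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "precursorMets.txt"
    example.write_text("atp[c]\n\nnadh[c]\n", encoding="utf-8")
    precursor_mets = read_precursor_mets(example)

print(precursor_mets)  # ['atp[c]', 'nadh[c]']
```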

Tissue-specific expression evidence:

  • G: list of Entrez IDs for all genes in model
  • U: list of ubiquity scores calculated for all genes in model

Optional inputs:

  • salvageCheck: flag whether to perform a functional check for the nucleotide salvage pathway (1) or not (0)
  • C_H_genes: list with Entrez IDs for genes with particularly strong evidence of activity in the tissue of interest
  • method: method used for internal optimizations, either (1) flux variability analysis or (2) FASTCC
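To make the G and U inputs concrete, the sketch below derives them from binarized expression calls, assuming a ubiquity score is taken as the fraction of samples in which a gene is called expressed. The Entrez IDs and expression calls are made up for illustration, and ubiquity_scores is a hypothetical helper, not part of pymCADRE.

```python
def ubiquity_scores(binary_expression):
    """Map each gene to the fraction of samples in which it is expressed (1)."""
    return {gene: sum(calls) / len(calls)
            for gene, calls in binary_expression.items()}


# Hypothetical Entrez IDs -> binarized expression across four samples.
binary_expression = {
    "1001": [1, 1, 1, 1],
    "1002": [1, 0, 0, 1],
    "1003": [0, 0, 0, 0],
}

scores = ubiquity_scores(binary_expression)
G = list(scores)            # gene IDs, in a fixed order
U = [scores[g] for g in G]  # matching ubiquity scores
zero_genes = [g for g, u in scores.items() if u == 0]

print(G)           # ['1001', '1002', '1003']
print(U)           # [1.0, 0.5, 0.0]
print(zero_genes)  # ['1003']
```

Genes with a score of zero are never expressed in any sample, which is the kind of evidence the Z output described under Outputs reflects at the reaction level.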

Outputs

  • PM: pruned COBRA tissue-specific model
  • GM: COBRA model after removing blocked reactions from the input global model
  • C: core reactions in GM
  • NC: non-core reactions in GM
  • Z: reactions with zero expression across all samples after binarization
  • model_C: core reactions in the generic model (including blocked reactions)
  • pruneTime: total reaction pruning time
  • cRes: result of model checks (consistency/function) during pruning

Usage

To run pymCADRE, execute the notebook main_pymcadre.ipynb or the Python script pymcadre.py. Both can be edited to set the preferred parameters and input files. Jupyter notebooks with test runs and test scripts are also provided as reference points.

Additional material

PREDICATE (Prediction of Antiviral Targets):

Steps:

  • introduction of mutations in the reference sequence based on the protein sequences
  • calculation of the necessary stoichiometric coefficients for the final virus biomass functions
  • target detection using two approaches: reaction knock-outs and the host-derived enforcement
  • visualizations that give insight into the dataset and aid interpretation of the results

The tool can be applied to one or more nucleotide sequences and to any RNA virus. This makes it particularly useful and time-saving when studying multiple variants of a single virus. One virus biomass objective function (VBOF) is calculated per genomic input sequence.
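The one-sequence-one-VBOF relationship can be illustrated with a minimal FASTA reader that yields one record per input sequence, together with the nucleotide counts that a biomass calculation would start from. This is a sketch under the assumption of FASTA-formatted input. The record names and sequences are invented, read_fasta is a hypothetical helper, and no stoichiometric coefficients are computed.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]  # ID is the first token after '>'
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


fasta = """>variant_A
AUGGCU
AAC
>variant_B
AUGGGU
"""

sequences = read_fasta(fasta)
counts = {name: Counter(seq) for name, seq in sequences.items()}

# One entry per input sequence, hence one VBOF per entry.
print(len(sequences))             # 2
print(dict(counts["variant_A"]))  # {'A': 3, 'U': 2, 'G': 2, 'C': 2}
```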

To run the tool, set the constant variables to the file paths where the required input files are stored.