
MINDS is a framework designed to integrate multimodal oncology data. It queries and integrates data from multiple sources, including clinical data, genomic data, and imaging data from the NIH NCI CRDC and TCIA portals.

Note

We are currently updating MINDS to include more data sources and improve the user experience. If you have any suggestions or would like to contribute, please feel free to reach out to us. Here is a list of the projects to be included in MINDS (115,974 total patients).

Projects in MINDS
Project Name Cases Clinical Radiology Histopathology Molecular
Foundation Medicine (FM) 18,004
The Cancer Genome Atlas (TCGA) 11,428
Therapeutically Applicable Research to Generate Effective Treatments (TARGET) 6,543
Clinical Proteomic Tumor Analysis Consortium (CPTAC) 1,656
The Molecular Profiling to Predict Response to Treatment (MP2PRT) 1,562
Multiple Myeloma Research Foundation (MMRF) 995
BEATAML1.0 882
Cancer Genome Characterization Initiatives (CGCI) 645
NCI Center for Cancer Research (NCICCR) 489
REBC 449
MATCH 448
Ukrainian National Research Center for Radiation Medicine Trio Study (TRIO) 339
Count Me In (CMI) 299
Human Cancer Model Initiative (HCMI) 278
West Coast Prostate Cancer Dream Team (WCDT) 101
Oregon Health and Science University (OHSU) 176
Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) 87
EXCEPTIONAL RESPONDERS 84
Environment And Genetics in Lung Cancer Etiology (EAGLE) 50
ORGANOID 70
Clinical Trials Sequencing Project (CTSP) 45
VA Research Precision Oncology Program (VAREPOP) 7
4D-Lung 20
A091105 83
AAPM-RT-MAC 55
ACNS0332 85
ACRIN-6698 385
ACRIN-Contralateral-Breast-MR 984
ACRIN-DSC-MR-Brain 123
ACRIN-FLT-Breast 83
ACRIN-FMISO-Brain 45
ACRIN-HNSCC-FDG-PET-CT 260
ACRIN-NSCLC-FDG-PET 242
Adrenal-ACC-Ki67-Seg 53
Advanced-MRI-Breast-Lesions 632
AHEP0731 80
AHOD0831 165
AML-Cytomorphology_LMU 200
AML-Cytomorphology_MLL_Helmholtz 189
Anti-PD-1_Lung 46
Anti-PD-1_MELANOMA 47
APOLLO-5 414
ARAR0331 108
AREN0532 544
AREN0533 294
AREN0534 239
B-mode-and-CEUS-Liver 120
Bone-Marrow-Cytomorphology_MLL_Helmholtz_Fraunhofer 945
Brain-TR-GammaKnife 47
Brain-Tumor-Progression 20
Breast-Cancer-Screening-DBT 5,060
BREAST-DIAGNOSIS 88
Breast-Lesions-USG 256
Breast-MRI-NACT-Pilot 64
Burdenko-GBM-Progression 180
C-NMC 2019 118
C4KC-KiTS 210
CALGB50303 155
CBIS-DDSM 1,566
CC-Radiomics-Phantom 17
CC-Radiomics-Phantom-2 251
CC-Tumor-Heterogeneity 23
CDD-CESM 326
CMB-AML 8
CMB-CRC 49
CMB-GEC 7
CMB-LCA 61
CMB-MEL 44
CMB-MML 64
CMB-PCA 12
CMMD 1,775
CODEX imaging of HCC 15
Colorectal-Liver-Metastases 197
COVID-19-AR 105
COVID-19-NY-SBU 1,384
CRC_FFPE-CODEX_CellNeighs 35
CT COLONOGRAPHY 825
CT Images in COVID-19 661
CT Lymph Nodes 176
CT-ORG 140
CT-Phantom4Radiomics 1
CT-vs-PET-Ventilation-Imaging 20
CTpred-Sunitinib-panNET 38
DFCI-BCH-BWH-PEDs-HGG 61
DLBCL-Morphology 209
DRO-Toolkit 32
Duke-Breast-Cancer-MRI 922
EA1141 500
ExACT 30
FDG-PET-CT-Lesions 900
GammaKnife-Hippocampal 390
GBM-DSC-MRI-DRO 3
GLIS-RT 230
HCC-TACE-Seg 105
HE-vs-MPM 12
Head-Neck Cetuximab 111
Head-Neck-PET-CT 298
HEAD-NECK-RADIOMICS-HN1 137
Healthy-Total-Body-CTs 30
HER2 tumor ROIs 273
HistologyHSI-GB 13
HNC-IMRT-70-33 211
HNSCC 627
HNSCC-3DCT-RT 31
HNSCC-mIF-mIHC-comparison 8
Hungarian-Colorectal-Screening 200
ISPY1 222
ISPY2 719
IvyGAP 39
LCTSC 60
LDCT-and-Projection-data 299
LGG-1p19qDeletion 159
LIDC-IDRI 1,010
Lung Phantom 1
Lung-Fused-CT-Pathology 6
Lung-PET-CT-Dx 355
LungCT-Diagnosis 61
Meningioma-SEG-CLASS 96
MIDRC-RICORD-1A 110
MIDRC-RICORD-1B 117
MIDRC-RICORD-1C 361
MiMM_SBILab 5
NADT-Prostate 37
NaF PROSTATE 9
NLST 26,254
NRG-1308 12
NSCLC Radiogenomics 211
NSCLC-Cetuximab 490
NSCLC-Radiomics 422
NSCLC-Radiomics-Genomics 89
NSCLC-Radiomics-Interobserver1 22
OPC-Radiomics 606
Osteosarcoma-Tumor-Assessment 4
Ovarian Bevacizumab Response 78
Pancreas-CT 82
Pancreatic-CT-CBCT-SEG 40
PCa_Bx_3Dpathology 50
Pediatric-CT-SEG 359
Pelvic-Reference-Data 58
Phantom FDA 7
Post-NAT-BRCA 64
Pretreat-MetsToBrain-Masks 200
Prostate Fused-MRI-Pathology 28
Prostate-3T 64
Prostate-Anatomical-Edge-Cases 131
PROSTATE-DIAGNOSIS 92
PROSTATE-MRI 26
Prostate-MRI-US-Biopsy 1,151
PROSTATEx 346
Pseudo-PHI-DICOM-Data 21
PTRC-HGSOC 174
QIBA CT-1C 1
QIBA-CT-Liver-Phantom 3
QIN Breast DCE-MRI 10
QIN GBM Treatment Response 54
QIN LUNG CT 47
QIN PET Phantom 2
QIN PROSTATE 22
QIN-BRAIN-DSC-MRI 49
QIN-BREAST 67
QIN-BREAST-02 13
QIN-HEADNECK 279
QIN-PROSTATE-Repeatability 15
QIN-SARCOMA 15
RADCURE 3,346
REMBRANDT 130
ReMIND 114
RHUH-GBM 40
RIDER Breast MRI 5
RIDER Lung CT 32
RIDER Lung PET-CT 244
RIDER NEURO MRI 19
RIDER PHANTOM MRI 10
RIDER PHANTOM PET-CT 20
RIDER Pilot 8
S0819 1,299
SLN-Breast 78
SN-AM 60
Soft-tissue-Sarcoma 51
SPIE-AAPM Lung CT Challenge 70
StageII-Colorectal-CT 230
UCSF-PDGM 495
UPENN-GBM 630
Vestibular-Schwannoma-MC-RC 124
Vestibular-Schwannoma-SEG 242
VICTRE 2,994

Installation

The cloud version of MINDS is currently in closed beta, but you can still recreate the MINDS database locally. To get the local version running, you will need to set up a MySQL database and populate it with the MINDS schema. The easiest way to do this is with a Docker container. First, install Docker; installation instructions for your operating system are available in the Docker documentation. Next, pull the MySQL Docker image and run a container with the following command.

Note

Please replace my-secret-pw with your desired password and port with the port you want to use to access the database. The default port for MySQL is 3306. The following command will not work until you replace port with a valid port number.

docker run -d --name minds -e MYSQL_ROOT_PASSWORD=my-secret-pw -e MYSQL_DATABASE=minds -p port:3306 mysql
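Once the container has started, it can be useful to confirm that the published port is reachable before continuing. The snippet below is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper and not part of the MINDS package, and 3306 is only an example value for the host port you chose.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the container was started with -p 3306:3306.
    print("MySQL port reachable:", port_is_open("127.0.0.1", 3306))
```

A reachable port only shows that something is listening on it. MySQL may still be initializing for a short time after the container starts, so a failed first connection attempt from MINDS is worth retrying.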

Finally, install the MINDS Python package with the following pip command:

pip install git+https://github.com/lab-rasool/MINDS.git

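After installing the package, create a .env file in the root directory of the project with the following variables. PORT should be the host port you mapped to the container in the docker run command above, and PASSWORD should match the value you set for MYSQL_ROOT_PASSWORD: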

HOST=127.0.0.1
PORT=3306
DB_USER=root
PASSWORD=my-secret-pw
DATABASE=minds
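Before running MINDS, you may want to sanity-check the .env file. The following is a rough sketch that parses simple KEY=VALUE lines and reports missing or malformed entries. `parse_env` and `check_env` are hypothetical helpers written for illustration; they are not how the MINDS package itself loads its configuration.

```python
# The five variable names listed in the .env example above.
REQUIRED_KEYS = ("HOST", "PORT", "DB_USER", "PASSWORD", "DATABASE")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip stray whitespace around keys and values.
        values[key.strip()] = value.strip()
    return values


def check_env(values: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    port = values.get("PORT", "")
    if port and not port.isdigit():
        problems.append(f"PORT is not a number: {port!r}")
    return problems


if __name__ == "__main__":
    sample = "HOST=127.0.0.1\nPORT=3306\nDB_USER=root\nPASSWORD=my-secret-pw\nDATABASE=minds\n"
    print(check_env(parse_env(sample)) or "configuration looks complete")
```

To check a real file, read it first, for example with `pathlib.Path(".env").read_text()`, and pass the text to `parse_env`.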

Usage

Initial setup and automated updates

If you have set up the MINDS database locally, you will need to populate it with data. To populate it, or to update it with the latest data, run the following:

# Import the minds package
import minds

# Update the database with the latest data
minds.update()

Querying the MINDS database

The MINDS Python package provides a Python interface to the MINDS database. You can use this interface to query the database and return the results as a pandas DataFrame.

import minds

# get a list of all the tables in the database
tables = minds.get_tables()

# get a list of all the columns in a table
columns = minds.get_columns("clinical")

# Query the database directly
query = "SELECT * FROM minds.clinical WHERE project_id = 'TCGA-LUAD' LIMIT 10"
df = minds.query(query)
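Because `minds.query` takes a raw SQL string, a query that covers several projects has to be assembled by hand. The sketch below shows one cautious way to do that. `build_project_query` is a hypothetical helper, not part of the MINDS API, and the allowed character set for project IDs is an assumption based on IDs such as TCGA-LUAD.

```python
import re

# Assumption: project IDs contain only letters, digits, '.', '_' and '-'.
_PROJECT_ID = re.compile(r"[A-Za-z0-9_.-]+")


def build_project_query(project_ids, limit=None):
    """Build a SELECT over minds.clinical restricted to the given project IDs."""
    if not project_ids:
        raise ValueError("at least one project ID is required")
    for project_id in project_ids:
        # Reject anything that could break out of the quoted SQL literal.
        if not _PROJECT_ID.fullmatch(project_id):
            raise ValueError(f"unexpected characters in project ID: {project_id!r}")
    quoted = ", ".join(f"'{project_id}'" for project_id in project_ids)
    sql = f"SELECT * FROM minds.clinical WHERE project_id IN ({quoted})"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


if __name__ == "__main__":
    print(build_project_query(["TCGA-LUAD", "TCGA-LUSC"], limit=10))
```

The resulting string can then be passed to `minds.query` in the same way as the hand-written query above.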

Building the cohort and downloading the data

# Generate a cohort to download from query
query_cohort = minds.build_cohort(query=query, output_dir="./data")

# or you can now directly supply a cohort from GDC
gdc_cohort = minds.build_cohort(gdc_cohort="cohort_Unsaved_Cohort.2024-02-12.tsv", output_dir="./data")

# to get the cohort details
gdc_cohort.stats()

# to download the data from the cohort to the output directory specified
# you can also specify the number of threads to use and the modalities to exclude or include
gdc_cohort.download(threads=12, exclude=["Slide Image"])

Please cite our work

@Article{s24051634,
    AUTHOR = {Tripathi, Aakash and Waqas, Asim and Venkatesan, Kavya and Yilmaz, Yasin and Rasool, Ghulam},
    TITLE = {Building Flexible, Scalable, and Machine Learning-Ready Multimodal Oncology Datasets},
    JOURNAL = {Sensors},
    VOLUME = {24},
    YEAR = {2024},
    NUMBER = {5},
    ARTICLE-NUMBER = {1634},
    URL = {https://www.mdpi.com/1424-8220/24/5/1634},
    ISSN = {1424-8220},
    DOI = {10.3390/s24051634}
}

Contributing

We welcome contributions from the community. If you would like to contribute to the MINDS project, please read our contributing guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.