Run SIMS in the browser using h5wasm to read local AnnData (.h5ad) files and ONNX to run the model.
Opens an h5ad file in the browser, runs a selected SIMS model on it, and displays the predictions.
You can view the default ONNX model via netron.
NOTE: This application has not yet been fully verified as concordant with the Python SIMS implementation. Currently the predictions are ~90% concordant with SIMS.
The front end is a single-page React web app using Material UI and Vite with no back end; only file storage and an HTTP server are required. The Python pieces all relate to converting PyTorch models into ONNX and then editing the ONNX graph to move as much of the prediction processing into the graph as possible (i.e. LpNorm and SoftMax of probabilities), as well as to expose internal nodes such as the encoder output for clustering and the attention masks for explainability.
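To make concrete what kind of post-processing is folded into the graph, here is a minimal pure-Python sketch of an Lp normalization followed by a softmax over one cell's raw outputs. The order of the two steps, the choice of p=2, and the function names are assumptions for illustration only; the exported graph may differ.

```python
import math

def lp_normalize(values, p=2):
    """Scale a vector so its Lp norm is 1 (left unchanged if the norm is 0)."""
    norm = sum(abs(v) ** p for v in values) ** (1.0 / p)
    return [v / norm for v in values] if norm > 0 else list(values)

def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def postprocess(logits, p=2):
    """Return (predicted class index, probabilities) for one cell's logits.

    Assumption: normalization is applied before softmax; this is a sketch,
    not the graph the export script actually produces.
    """
    probs = softmax(lp_normalize(logits, p))
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

if __name__ == "__main__":
    index, probs = postprocess([2.0, 0.5, -1.0])
    print(index, [round(x, 3) for x in probs])
```

When these steps live inside the ONNX graph, the web app receives probabilities directly and does not need to reimplement them in JavaScript.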
Install dependencies for the python model exporting and webapp:
pip install -r requirements.txt
npm install
Export a SIMS checkpoint to an ONNX file and list of genes:
python scripts/sims-to-onnx.py checkpoints/default.ckpt public/models/
Check a model for compatibility with ONNX:
python -m onnxruntime.tools.check_onnx_model_mobile_usability public/models/default.onnx
Compare the output of SIMS to ONNX using the python runtime:
python scripts/validate.py checkpoints/default.ckpt public/models/default.onnx public/sample.h5ad --decimals 2
Serve the web app and exported models locally with auto-reload courtesy of Vite:
npm run dev
Display the compute graph using netron:
netron public/models/default.onnx
worker.js uses h5wasm slice() to read data from the cell-by-gene matrix (i.e. X). Because these data are typically stored row-major on disk (i.e. all data for a cell is contiguous), we can process the sample incrementally, keeping memory requirements to a minimum. Reading cell by cell from a 5.3 GB h5ad file consumed just under 30 MB of browser memory. YMMV.
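The incremental strategy can be sketched in Python as follows. This is not the code in worker.js: read_rows is a hypothetical stand-in for a slice read such as h5wasm's slice(), handle_batch is a hypothetical stand-in for inference, and the batch size is arbitrary.

```python
def iter_row_batches(n_rows, batch_size):
    """Yield (start, stop) row ranges that cover 0..n_rows without overlap."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)

def process_incrementally(read_rows, n_rows, batch_size, handle_batch):
    """Read and handle one batch at a time so only one batch is held at once.

    read_rows(start, stop) is a hypothetical stand-in for a slice read;
    handle_batch(rows) is a hypothetical stand-in for running inference.
    """
    processed = 0
    for start, stop in iter_row_batches(n_rows, batch_size):
        rows = read_rows(start, stop)  # only this batch is materialized
        handle_batch(rows)
        processed += len(rows)
    return processed

if __name__ == "__main__":
    # Stand-in for X: 5 cells by 3 genes held in a plain list of lists.
    matrix = [[cell * 10 + gene for gene in range(3)] for cell in range(5)]
    sums = []
    count = process_incrementally(
        lambda a, b: matrix[a:b],
        len(matrix),
        2,
        lambda rows: sums.extend(sum(r) for r in rows),
    )
    print(count, sums)
```

Because each read covers whole rows, a row-major layout on disk means every batch maps to one contiguous region of the file.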
ONNX supports multithreaded inference. We allocate total cores - 2 for inference. This leaves 1 thread for the main loop so the UI can remain responsive and 1 thread for ONNX to coordinate via its 'proxy' setting (see worker.js for details).
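The thread budget is simple arithmetic, sketched here in Python for illustration. The function name is hypothetical, and the clamp to a minimum of one thread is an assumption added for machines with two or fewer cores; worker.js may handle that case differently.

```python
import os

def inference_threads(total_cores, reserved=2):
    """Threads left for inference after reserving some; never below 1.

    reserved=2 mirrors the text: one thread for the UI main loop and one
    for the proxy. The lower bound of 1 is an assumption of this sketch.
    """
    return max(1, total_cores - reserved)

if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(cores, inference_threads(cores))
```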
Predicting 8796 cells on a MacBook M3 Pro took 1.17 minutes, which extrapolates to roughly 13 minutes for ~100k cells.
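The extrapolation from that single measurement can be reproduced with a few lines of Python. It assumes throughput scales linearly with the number of cells, which is an assumption and not something measured here.

```python
def cells_per_minute(measured_cells=8796, measured_minutes=1.17):
    """Throughput implied by the single timing quoted above."""
    return measured_cells / measured_minutes

def minutes_for(cells, measured_cells=8796, measured_minutes=1.17):
    """Estimated minutes for `cells`, assuming linear scaling (an assumption)."""
    return cells / cells_per_minute(measured_cells, measured_minutes)

if __name__ == "__main__":
    print(round(cells_per_minute()))        # about 7518 cells per minute
    print(round(minutes_for(100_000), 1))   # about 13.3 minutes
```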
ONNX Runtime Web does support GPUs, but unfortunately it does not support all operators yet. Specifically, TopK, LpNormalization and GatherElements are not supported. See sclblonnx.check(graph) for details.
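One way to see which operators would block a GPU backend is to intersect a graph's operator types with the unsupported set named above. The sketch below is pure Python with a hypothetical operator list; the function name and the example list are illustrative assumptions, and the unsupported set is taken from this text, not from a current ONNX Runtime support matrix.

```python
# Operators named in the text as unsupported on the GPU backend.
GPU_UNSUPPORTED = {"TopK", "LpNormalization", "GatherElements"}

def blocking_ops(op_types, unsupported=GPU_UNSUPPORTED):
    """Return the sorted operator types that also appear in `unsupported`."""
    return sorted(set(op_types) & set(unsupported))

if __name__ == "__main__":
    # Hypothetical operator list. With the onnx package installed it could be
    # collected as [node.op_type for node in onnx.load(path).graph.node].
    example_ops = ["Gemm", "Relu", "TopK", "Softmax", "GatherElements", "TopK"]
    print(blocking_ops(example_ops))
```

An empty result would suggest the graph uses none of the operators listed here, though it would not by itself guarantee GPU compatibility.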
Open Neural Network Exchange (ONNX)
ONNX Runtime Web (WASM Backend)
ONNX Runtime Web Platform Functionality Details
ONNX Runtime Javascript Examples
Alternative Web ONNX Runtime in Rust
Netron ONNX Graph Display Website
Classify images in a web application with ONNX Runtime Web
anndata/h5ad file structure and on disk format
TabNet Model for attentive tabular learning
Semi supervised pre training with TabNet
Classification of Alzheimer's disease using robust TabNet neural networks on genetic data
Designing interpretable deep learning applications for functional genomics: a quantitative analysis
Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis