Overview
This tutorial walks through taking a trained ensemble from the IBD experiment and deploying it as a REST API. The starting point is the results/ibd_franzosa/ directory produced by the single-modality microbiome tutorial.
Prerequisites
- A completed experiment with an `ensemble/ensemble_summary.json` file (produced by `sweep_ibd.py`).
- The original training data (`FRANZOSA_IBD_2019_profiles_hierarchical.tsv` and `metadata.tsv`).
- The app extras installed: `uv pip install -e ".[app]"`
What happens during deployment
During cross-validation, each model is trained on a subset of the data. Deploying an ensemble means retraining every member on the full training set, saving the resulting artefacts, and placing them where the backend can discover them.
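To make the difference between cross-validation training and full-data retraining concrete, the following minimal sketch counts how many samples a model sees in each case. The 220-sample total comes from this tutorial; the fold count of 5 and the helper `kfold_indices` are assumptions for illustration and are not taken from the project code.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs for contiguous k-fold splits."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


N_SAMPLES = 220  # sample count stated in this tutorial
N_FOLDS = 5      # assumption: the fold count is not given on this page

# During cross-validation each model is fitted on only part of the data.
cv_train_sizes = [len(train) for train, _ in kfold_indices(N_SAMPLES, N_FOLDS)]

# For deployment every ensemble member is refitted on all samples.
full_train_size = N_SAMPLES

print(cv_train_sizes)   # [176, 176, 176, 176, 176]
print(full_train_size)  # 220
```

Under these assumptions each cross-validated model is fitted on 176 samples, while the deployed model uses all 220.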
The pipeline:
- Load the ensemble summary. `ProductionPipeline` reads `ensemble_summary.json` to identify the selected models and aggregation strategy.
- Retrain on full data. Each model is retrained with its original preprocessing (taxonomic filtering and compositional transformation) but using all 220 samples instead of a single CV fold.
- Save artefacts. Serialised models, ensemble configuration, and per-model pipeline settings are written to a directory.
- Serve via the REST API. The backend loads the artefacts and exposes `/api/inference/predict`, `/api/inference/predict/batch`, and `/api/inference/explain` endpoints.
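As a rough illustration of the final step, the sketch below builds a request for the single-sample prediction endpoint using only the Python standard library. Only the path `/api/inference/predict` comes from this tutorial. The base URL, the use of POST, the JSON payload shape, and the taxon names are all assumptions made for illustration; the Serving predictions page is the place to check the actual request format.

```python
import json
import urllib.request

# Assumption: the host and port of the backend are not specified on this page.
BASE_URL = "http://localhost:8000"


def build_predict_request(features, base_url=BASE_URL):
    """Build (but do not send) a request for the single-sample endpoint."""
    # Hypothetical payload shape: one mapping of feature name to abundance.
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/inference/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated here
    )


# Made-up taxon names and values, for illustration only.
request = build_predict_request(
    {"g__Bacteroides": 0.31, "g__Faecalibacterium": 0.12}
)
print(request.get_method(), request.full_url)

# Sending the request needs a running backend, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the URL and payload easy to inspect before a backend is available.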
Ensemble used in this tutorial
The IBD experiment produced a 5-model Copeland ensemble with an inner-validation HALO score of 0.685:
| Model | Taxonomy | Transform |
|---|---|---|
| XGBoost | genus (excl. Chloroplast) | none |
| Random forest | genus (excl. Chloroplast) | relative + arcsin-sqrt |
| XGBoost | genus | binary |
| XGBoost | order-family (no aggregation) | none |
| XGBoost | genus | relative + arcsin-sqrt |
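The Transform column names three preprocessing options besides `none`. The sketch below shows the standard textbook definitions of relative abundance, the arcsine square-root transform, and presence/absence binarisation. It is an assumption that the project follows these definitions exactly; details such as pseudocounts or detection thresholds are not stated on this page, and the function names and sample values are hypothetical.

```python
import math


def relative_abundance(counts):
    """Scale a sample so that its values sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]  # avoid dividing by zero for empty samples
    return [c / total for c in counts]


def arcsin_sqrt(proportions):
    """Apply arcsin(sqrt(p)) to each proportion p in [0, 1]."""
    return [math.asin(math.sqrt(p)) for p in proportions]


def binary(counts):
    """Encode presence (1) or absence (0) of each taxon."""
    return [1 if c > 0 else 0 for c in counts]


# Made-up abundances for four taxa in one sample.
sample = [30.0, 0.0, 10.0, 60.0]

rel = relative_abundance(sample)   # "relative" step
rel_asin = arcsin_sqrt(rel)        # "relative + arcsin-sqrt"
presence = binary(sample)          # "binary"

print(rel)
print([round(v, 3) for v in rel_asin])
print(presence)
```

The arcsine square-root step is applied after conversion to proportions because it is only defined for values between 0 and 1.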
Tutorial structure
| Page | Content |
|---|---|
| Packaging models | Retrain the ensemble on full data and save the artefacts. |
| Serving predictions | Start the backend and query the inference API. |
Complete script
The full deployment script is at example/IBD/deploy_ibd.py.