Running the experiment
Run the example from an activated virtual environment in which mllabiome is installed. See the Installation guide if you have not set one up.
Run the IBD example from the repository root:
python example/IBD/ibd_franzosa.py
The script’s entry point calls run_experiment() directly:
if __name__ == "__main__":
    evaluator, summary = run_experiment()
Each (taxonomy, transformation) pair is preprocessed once and reused across all learners and projections that build on it. Adding more learners or projections to a sweep does not increase preprocessing time. With save_processed_data=True, each cached dataset is also persisted to disk and loaded on a resumed run, so preprocessing is skipped entirely on restart.
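The reuse described above can be pictured as a cache keyed by the (taxonomy, transformation) pair. The following is a minimal sketch of that idea only, not the library's implementation: `preprocess`, `PreprocessCache`, and the example taxonomy, transformation, and learner names are all hypothetical.

```python
# Hypothetical stand-in for an expensive preprocessing step (not an mllabiome function).
def preprocess(taxonomy, transform):
    return f"{taxonomy}:{transform}"


class PreprocessCache:
    """Memoise preprocessing results by (taxonomy, transformation) key."""

    def __init__(self, fn):
        self._fn = fn
        self._store = {}
        self.calls = 0  # number of times the expensive function actually ran

    def get(self, taxonomy, transform):
        key = (taxonomy, transform)
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._fn(taxonomy, transform)
        return self._store[key]


cache = PreprocessCache(preprocess)
taxonomies = ["genus", "species"]   # illustrative values
transforms = ["clr", "log"]         # illustrative values
learners = ["rf", "svm", "logreg"]  # illustrative values

for taxonomy in taxonomies:
    for transform in transforms:
        for learner in learners:
            # Every learner receives the same cached dataset for this pair.
            data = cache.get(taxonomy, transform)

# Preprocessing ran once per pair (2 x 2), independent of the learner count.
print(cache.calls)
```

In this sketch, adding a fourth learner leaves `cache.calls` unchanged, which mirrors why a larger sweep does not increase preprocessing time.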
Assembling the configuration
run_experiment() packs all experiment constants into a single ExperimentConfiguration and passes it to run_evaluation():
def run_experiment():
    config = mll.ExperimentConfiguration(
        microbiome_file=MICROBIOME_FILE_PATH,
        metadata_file=METADATA_FILE_PATH,
        experiment_dir=EXPERIMENT_DIR,
        sample_id_column=SAMPLE_ID_COLUMN_NAME,
        taxonomic_configs=TAXONOMIC_RESOLUTIONS_CONFIGS,
        transform_configs=TRANSFORMS_CONFIGS,
        target_configs=[
            mll.TargetConfig(
                column=TARGET_COLUMN_NAME,
                task_type=TASK_TYPE,
            )
        ],
        primary_modality_models=MODEL_CONFIGS,
        multimodal_config=None,
        nested_cv_config=NESTED_CV_CONFIG,
        hyperopt_config=mll.HyperoptConfig(...),
        evaluation_thresholds=EVALUATION_THRESHOLDS,
        execution_config=mll.ExperimentExecutionConfig(...),
    )
    return run_evaluation(config)
For a parameter-by-parameter breakdown of each constant, see Experiment configuration.
Execution settings
ExperimentExecutionConfig controls parallelism, memory management, checkpointing, and disk output. The IBD example uses:
execution_config=mll.ExperimentExecutionConfig(
    n_jobs=1,
    use_threading=False,
    enable_early_termination=True,
    progress_backend="sqlite",
    progress_batch_size=10,
    consolidate_predictions=True,
    model_compression_level=3,
    consolidate_hyperparameters=True,
    deduplicate_nested_cv=False,
    incremental_results=True,
    stream_predictions=True,
    save_model_weights=True,
    save_predictions=True,
    save_processed_data=True,
    store_data_in_results=False,
    gc_between_configs=True,
    keep_original_data=False,
)
| Parameter | Default | Purpose |
|---|---|---|
| n_jobs | 1 | Number of parallel workers. 1 runs sequentially. Values above 1 enable a thread or process pool. |
| use_threading | True | When n_jobs > 1, use ThreadPoolExecutor rather than ProcessPoolExecutor. |
| enable_early_termination | True | Skip outer evaluation for configurations that fail inner validation thresholds. |
| progress_backend | "sqlite" | Persistence backend for run state. "sqlite" writes a checkpoint database that allows an interrupted run to resume. |
| incremental_results | True | Flush each completed result to disk immediately rather than waiting for the full sweep to finish. |
| save_model_weights | True | Serialise fitted models to weights/*.joblib. Required for ensemble search. |
| save_predictions | True | Write per-fold predictions to Parquet. Required for ensemble search. |
| gc_between_configs | True | Run the garbage collector after each configuration. Limits memory growth in long sweeps. |
| model_compression_level | 3 | zlib level for serialised model files. Range 1–9: higher values compress more but are slower to write and read. |
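To make the interaction between n_jobs and use_threading concrete, here is a minimal sketch of how the two settings could select an executor from Python's standard `concurrent.futures` module. The helpers `make_executor` and `run_all` are hypothetical names for illustration; the library's internal dispatch may differ.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def make_executor(n_jobs, use_threading):
    """Hypothetical helper: choose a pool type from the two settings."""
    if n_jobs <= 1:
        return None  # sequential execution, no pool needed
    if use_threading:
        return ThreadPoolExecutor(max_workers=n_jobs)
    return ProcessPoolExecutor(max_workers=n_jobs)


def run_all(tasks, n_jobs=1, use_threading=True):
    """Run zero-argument callables and return their results in input order."""
    executor = make_executor(n_jobs, use_threading)
    if executor is None:
        return [task() for task in tasks]
    with executor:  # the pool is shut down when the block exits
        futures = [executor.submit(task) for task in tasks]
        return [future.result() for future in futures]


if __name__ == "__main__":
    squares = [lambda i=i: i * i for i in range(4)]
    print(run_all(squares, n_jobs=1))
    print(run_all(squares, n_jobs=2, use_threading=True))
```

One practical difference worth keeping in mind: a process pool requires tasks and their arguments to be picklable, so the lambdas above would only work with the sequential or threaded paths.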
Running the evaluation
run_evaluation() constructs the evaluator, runs the sweep, and writes the results:
def run_evaluation(config):
    evaluator = mll.Evaluator(config)
    evaluator.run_systematic_evaluation()
    summary = evaluator.save_results()
    return evaluator, summary
run_systematic_evaluation() iterates over every combination of taxonomic resolution, transformation, projection, and learner. The inner cross-validation loop identifies configurations that meet the evaluation thresholds; the outer loop estimates their generalisation performance. When incremental_results=True, each result is written to disk as it completes.
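The shape of that sweep can be sketched with `itertools.product` and a threshold gate between the inner and outer loops. This is an illustration under stated assumptions, not the evaluator's code: the option lists, `INNER_THRESHOLD`, and the two scoring functions are hypothetical placeholders that return fixed toy scores instead of running cross-validation.

```python
from itertools import product

# Illustrative option lists; real values come from the experiment configuration.
taxonomies = ["genus", "species"]
transforms = ["clr", "log"]
projections = ["none", "pca"]
learners = ["rf", "logreg"]

INNER_THRESHOLD = 0.7  # hypothetical evaluation threshold


def inner_cv_score(config):
    """Placeholder for the inner cross-validation loop (fixed toy score)."""
    _taxonomy, transform, _projection, _learner = config
    return 0.8 if transform == "clr" else 0.6


def outer_cv_score(config):
    """Placeholder for the outer loop's generalisation estimate."""
    return inner_cv_score(config) - 0.05


results = []
for config in product(taxonomies, transforms, projections, learners):
    inner = inner_cv_score(config)
    # Early termination: only configurations that pass the inner
    # threshold proceed to the outer evaluation.
    outer = outer_cv_score(config) if inner >= INNER_THRESHOLD else None
    results.append({"config": config, "inner": inner, "outer": outer})

qualifying = [r for r in results if r["outer"] is not None]
print(len(results), len(qualifying))
```

With these toy scores, half of the 16 combinations skip the outer loop entirely, which is the saving that enable_early_termination is meant to provide.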
save_results() consolidates all completed results into a summary DataFrame, writes evaluation_results.csv to the experiment directory, and returns:
{
    "total_configs": <int>, # configurations evaluated
    "qualifying_configs": <int>, # configurations that passed thresholds
    "qualifying_targets": [str], # targets with at least one qualifying result
}
For the full directory layout and a file-by-file description of every output, see Experiment output structure.
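As a small usage sketch, the returned summary can be turned into a one-line report. The helper `describe_summary` is a hypothetical name, and the example dictionary contains made-up values; only the three keys come from the description above.

```python
def describe_summary(summary):
    """Format the summary dictionary as a short, human-readable line."""
    total = summary["total_configs"]
    qualifying = summary["qualifying_configs"]
    targets = summary["qualifying_targets"]
    # Guard against an empty sweep to avoid dividing by zero.
    rate = qualifying / total if total else 0.0
    target_list = ", ".join(targets) if targets else "none"
    return (
        f"{qualifying}/{total} configurations qualified "
        f"({rate:.0%}); targets: {target_list}"
    )


# Made-up values for illustration only.
example_summary = {
    "total_configs": 16,
    "qualifying_configs": 4,
    "qualifying_targets": ["diagnosis"],
}
print(describe_summary(example_summary))
```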