Running the experiment
Activate a virtual environment in which mllabiome is installed before running the example. See the Installation guide if needed.
Run the metabolomics example from the repository root:
    python example/IBD/ibd_franzosa_mtb.py

The script’s entry point calls run_experiment() directly:

    if __name__ == "__main__":
        evaluator, summary = run_experiment()

Assembling the configuration
run_experiment() packs all experiment constants into a single ExperimentConfiguration and passes it to run_evaluation():

    def run_experiment():
        config = mll.ExperimentConfiguration(
            primary_data_file=METABOLOMICS_FILE_PATH,
            metadata_file=METADATA_FILE_PATH,
            experiment_dir=EXPERIMENT_DIR,
            sample_id_column=SAMPLE_ID_COLUMN_NAME,
            features_are_rows=False,
            taxonomic_configs=[],
            transform_configs=TRANSFORMS_CONFIGS,
            target_configs=[
                mll.TargetConfig(
                    column=TARGET_COLUMN_NAME,
                    task_type=TASK_TYPE,
                )
            ],
            primary_modality_models=MODEL_CONFIGS,
            multimodal_config=None,
            nested_cv_config=NESTED_CV_CONFIG,
            hyperopt_config=mll.HyperoptConfig(enabled=False, ...),
            evaluation_thresholds=EVALUATION_THRESHOLDS,
            execution_config=mll.ExperimentExecutionConfig(...),
        )
        return run_evaluation(config)

For a parameter-by-parameter breakdown, see Experiment configuration.
Execution settings
ExperimentExecutionConfig controls parallelism, memory management, checkpointing, and disk output. The metabolomics example uses:

    execution_config=mll.ExperimentExecutionConfig(
        n_jobs=1,
        use_threading=False,
        enable_early_termination=True,
        progress_backend="sqlite",
        progress_batch_size=10,
        consolidate_predictions=True,
        model_compression_level=3,
        consolidate_hyperparameters=True,
        deduplicate_nested_cv=False,
        incremental_results=True,
        stream_predictions=True,
        save_model_weights=True,
        save_predictions=True,
        save_processed_data=False,
        store_data_in_results=False,
        gc_between_configs=True,
        keep_original_data=False,
    )

| Parameter | Default | Purpose |
|---|---|---|
| n_jobs | 1 | Number of parallel workers. 1 runs sequentially. |
| enable_early_termination | True | Skip outer evaluation for configurations that fail inner validation thresholds. |
| progress_backend | "sqlite" | Persistence backend for run state. Allows an interrupted run to resume. |
| incremental_results | True | Flush each completed result to disk immediately. |
| save_model_weights | True | Serialise fitted models. Required for ensemble search. |
| save_predictions | True | Write per-fold predictions to Parquet. Required for ensemble search. |
| gc_between_configs | True | Run the garbage collector after each configuration to limit memory growth. |
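
To make the resume behaviour of a persistent progress backend concrete, here is a minimal sketch using only Python's standard sqlite3 module. It is not mllabiome's implementation: the table layout, the helper functions, and the configuration keys are all hypothetical and chosen purely for illustration.

```python
import sqlite3


def open_progress_store(path=":memory:"):
    """Open (or create) a small table that records finished configurations."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS completed (config_key TEXT PRIMARY KEY)"
    )
    return conn


def mark_completed(conn, config_key):
    """Record a finished configuration and commit so it survives an interruption."""
    conn.execute(
        "INSERT OR IGNORE INTO completed (config_key) VALUES (?)", (config_key,)
    )
    conn.commit()


def pending_configs(conn, all_keys):
    """Return the keys that still need to run, preserving their order."""
    done = {row[0] for row in conn.execute("SELECT config_key FROM completed")}
    return [key for key in all_keys if key not in done]


# Hypothetical configuration keys, for illustration only.
ALL_KEYS = ["log/rf", "log/svm", "clr/rf", "clr/svm"]

conn = open_progress_store()      # use a file path instead of ":memory:" to persist
mark_completed(conn, "log/rf")
mark_completed(conn, "log/svm")
mark_completed(conn, "log/svm")   # recording the same key twice is harmless

# After a restart, only the unfinished configurations would be scheduled.
remaining = pending_configs(conn, ALL_KEYS)
print(remaining)
```

Committing after each completed configuration trades a little write overhead for the guarantee that finished work is not repeated after a restart.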
Running the evaluation
run_evaluation() constructs the evaluator, runs the sweep, and writes the results:

    def run_evaluation(config):
        evaluator = mll.Evaluator(config)
        evaluator.run_systematic_evaluation()
        summary = evaluator.save_results()
        return evaluator, summary

run_systematic_evaluation() iterates over every combination of transformation and learner. With taxonomic_configs=[], there is no taxonomic resolution axis, so the sweep covers only (transform, model) pairs. The inner cross-validation loop identifies configurations that meet the evaluation thresholds. The outer loop estimates generalisation performance.
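
The shape of such a sweep can be sketched in plain Python. The example below is an illustration under stated assumptions, not the library's code: run_sweep, the scoring callables, the single scalar threshold, and the transform and model names are all hypothetical, and the scores are made up. It shows the Cartesian product over (transform, model) pairs and how early termination skips the outer estimate for pairs that miss the inner threshold.

```python
from itertools import product


def run_sweep(transforms, models, inner_score, outer_score, threshold,
              early_termination=True):
    """Evaluate every (transform, model) pair.

    inner_score and outer_score are caller-supplied callables standing in
    for the inner and outer cross-validation loops.
    """
    results = []
    for transform, model in product(transforms, models):
        inner = inner_score(transform, model)
        qualifies = inner >= threshold
        # With early termination, pairs that miss the threshold skip the outer loop.
        if qualifies or not early_termination:
            outer = outer_score(transform, model)
        else:
            outer = None
        results.append({
            "transform": transform,
            "model": model,
            "inner": inner,
            "outer": outer,
            "qualifies": qualifies,
        })
    return results


# Made-up inner scores, purely for demonstration.
INNER = {
    ("log", "rf"): 0.82, ("log", "svm"): 0.61,
    ("clr", "rf"): 0.78, ("clr", "svm"): 0.74,
}

results = run_sweep(
    transforms=["log", "clr"],
    models=["rf", "svm"],
    inner_score=lambda t, m: INNER[(t, m)],
    outer_score=lambda t, m: INNER[(t, m)] - 0.05,  # stand-in for a real outer estimate
    threshold=0.75,
)
qualifying = [(r["transform"], r["model"]) for r in results if r["qualifies"]]
print(qualifying)
```

Because the sweep is a product of its axes, adding one transform or one model multiplies the number of configurations, which is why skipping the outer loop for non-qualifying pairs can save substantial time.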
save_results() consolidates all completed results into a summary DataFrame, writes evaluation_results.csv to the experiment directory, and returns:

    {
        "total_configs": <int>,        # configurations evaluated
        "qualifying_configs": <int>,   # configurations that passed thresholds
        "qualifying_targets": [str],   # targets with at least one qualifying result
    }
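
As a usage illustration, the helper below turns a summary dictionary with the three documented keys into a one-line report. The function name describe_summary, the example values, and the target name are hypothetical. Only the key names come from the description above.

```python
def describe_summary(summary):
    """Format a summary dict with the documented keys as a one-line report."""
    total = summary["total_configs"]
    qualifying = summary["qualifying_configs"]
    targets = summary["qualifying_targets"]
    # Guard against division by zero when nothing was evaluated.
    rate = qualifying / total if total else 0.0
    target_text = ", ".join(targets) if targets else "none"
    return (f"{qualifying}/{total} configurations qualified "
            f"({rate:.0%}); targets: {target_text}")


# Made-up values that follow the documented structure.
example = {
    "total_configs": 12,
    "qualifying_configs": 3,
    "qualifying_targets": ["diagnosis"],
}
line = describe_summary(example)
print(line)  # 3/12 configurations qualified (25%); targets: diagnosis
```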