Running the experiment

Run the example from an activated virtual environment in which mllabiome is installed. See the Installation guide if needed.

Run the IBD example from the repository root:


python example/IBD/ibd_franzosa.py

The script’s entry point calls run_experiment() directly:


if __name__ == "__main__":
    evaluator, summary = run_experiment()

Each (taxonomy, transformation) pair is preprocessed once and reused across all learners and projections that build on it. Adding more learners or projections to a sweep does not increase preprocessing time. With save_processed_data=True, each cached dataset is also persisted to disk and loaded on a resumed run, so preprocessing is skipped entirely on restart.
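The reuse described above can be pictured as memoisation keyed on the (taxonomy, transformation) pair. The following is a minimal sketch of that idea only, not mllabiome's internal cache: `PreprocessCache`, `fake_preprocess`, and the learner and transformation names are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical names throughout: this is not mllabiome's internal cache.
PreprocessKey = Tuple[str, str]  # (taxonomy, transformation)


class PreprocessCache:
    """Memoise preprocessing results per (taxonomy, transformation) pair."""

    def __init__(self, preprocess: Callable[[str, str], list]) -> None:
        self._preprocess = preprocess
        self._store: Dict[PreprocessKey, list] = {}
        self.calls = 0  # number of times preprocessing actually ran

    def get(self, taxonomy: str, transformation: str) -> list:
        key = (taxonomy, transformation)
        if key not in self._store:
            # Cache miss: run the expensive step once and remember the result.
            self.calls += 1
            self._store[key] = self._preprocess(taxonomy, transformation)
        return self._store[key]


def fake_preprocess(taxonomy: str, transformation: str) -> list:
    # Stand-in for the expensive aggregation and transformation step.
    return [f"{taxonomy}:{transformation}"]


cache = PreprocessCache(fake_preprocess)
for learner in ["logistic", "random_forest", "svm"]:
    for taxonomy, transformation in [("genus", "clr"), ("species", "clr")]:
        data = cache.get(taxonomy, transformation)

print(cache.calls)  # 2: one per pair, regardless of the three learners
```

In this sketch the preprocessing function runs twice for six learner/pair combinations, which is why widening the learner list leaves preprocessing cost unchanged.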

Assembling the configuration

run_experiment() packs all experiment constants into a single ExperimentConfiguration and passes it to run_evaluation():


def run_experiment():
    config = mll.ExperimentConfiguration(
        microbiome_file=MICROBIOME_FILE_PATH,
        metadata_file=METADATA_FILE_PATH,
        experiment_dir=EXPERIMENT_DIR,
        sample_id_column=SAMPLE_ID_COLUMN_NAME,
        taxonomic_configs=TAXONOMIC_RESOLUTIONS_CONFIGS,
        transform_configs=TRANSFORMS_CONFIGS,
        target_configs=[
            mll.TargetConfig(
                column=TARGET_COLUMN_NAME,
                task_type=TASK_TYPE,
            )
        ],
        primary_modality_models=MODEL_CONFIGS,
        multimodal_config=None,
        nested_cv_config=NESTED_CV_CONFIG,
        hyperopt_config=mll.HyperoptConfig(...),
        evaluation_thresholds=EVALUATION_THRESHOLDS,
        execution_config=mll.ExperimentExecutionConfig(...),
    )
    return run_evaluation(config)

For a parameter-by-parameter breakdown of each constant, see Experiment configuration.

Execution settings

ExperimentExecutionConfig controls parallelism, memory management, checkpointing, and disk output. The IBD example uses:


execution_config=mll.ExperimentExecutionConfig(
    n_jobs=1,
    use_threading=False,
    enable_early_termination=True,
    progress_backend="sqlite",
    progress_batch_size=10,
    consolidate_predictions=True,
    model_compression_level=3,
    consolidate_hyperparameters=True,
    deduplicate_nested_cv=False,
    incremental_results=True,
    stream_predictions=True,
    save_model_weights=True,
    save_predictions=True,
    save_processed_data=True,
    store_data_in_results=False,
    gc_between_configs=True,
    keep_original_data=False,
)

Parameter	Default	Purpose
`n_jobs`	`1`	Number of parallel workers. `1` runs sequentially. Values above `1` enable a thread or process pool.
`use_threading`	`True`	When `n_jobs > 1`, use `ThreadPoolExecutor` rather than `ProcessPoolExecutor`.
`enable_early_termination`	`True`	Skip outer evaluation for configurations that fail inner validation thresholds.
`progress_backend`	`"sqlite"`	Persistence backend for run state. `"sqlite"` writes a checkpoint database that allows an interrupted run to resume.
`incremental_results`	`True`	Flush each completed result to disk immediately rather than waiting for the full sweep to finish.
`save_model_weights`	`True`	Serialise fitted models to `weights/*.joblib`. Required for ensemble search.
`save_predictions`	`True`	Write per-fold predictions to Parquet. Required for ensemble search.
`gc_between_configs`	`True`	Run the garbage collector after each configuration. Limits memory growth in long sweeps.
`model_compression_level`	`3`	zlib level for serialised model files. Range 1–9: higher values compress more but are slower to write and read.
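To illustrate how `n_jobs` and `use_threading` interact, here is a minimal sketch built on the standard library's `concurrent.futures`. It mirrors the behaviour described in the table, but `choose_executor`, `run_all`, and `square` are hypothetical helpers and are not part of mllabiome.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from functools import partial
from typing import Callable, List, Optional, Type


def choose_executor(n_jobs: int, use_threading: bool) -> Optional[Type[Executor]]:
    """Return the pool class implied by the two settings, or None for sequential."""
    if n_jobs <= 1:
        return None
    return ThreadPoolExecutor if use_threading else ProcessPoolExecutor


def run_all(tasks: List[Callable[[], int]], n_jobs: int = 1,
            use_threading: bool = True) -> List[int]:
    """Run zero-argument tasks and return their results in submission order."""
    executor_cls = choose_executor(n_jobs, use_threading)
    if executor_cls is None:
        return [task() for task in tasks]
    with executor_cls(max_workers=n_jobs) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def square(x: int) -> int:
    return x * x


tasks = [partial(square, i) for i in range(4)]
print(run_all(tasks, n_jobs=1))                      # [0, 1, 4, 9]
print(run_all(tasks, n_jobs=2, use_threading=True))  # [0, 1, 4, 9]
```

A process pool needs picklable tasks and, on some platforms, an `if __name__ == "__main__":` guard, which is one reason a thread pool can be the simpler choice when the work releases the GIL.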

Running the evaluation

run_evaluation() constructs the evaluator, runs the sweep, and writes the results:


def run_evaluation(config):
    evaluator = mll.Evaluator(config)
    evaluator.run_systematic_evaluation()
    summary = evaluator.save_results()
    return evaluator, summary

run_systematic_evaluation() iterates over every combination of taxonomic resolution, transformation, projection, and learner. The inner cross-validation loop identifies configurations that meet the evaluation thresholds; the outer loop estimates their generalisation performance. When incremental_results=True, each result is written to disk as it completes.
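The shape of the sweep can be sketched as a Cartesian product with a threshold gate between the inner and outer stages. This is an illustration under simplifying assumptions, not the evaluator's implementation: `sweep`, the toy scoring functions, and the single scalar `threshold` are all hypothetical, and the real thresholds may cover several metrics.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

# (taxonomy, transformation, projection, learner)
Config = Tuple[str, str, str, str]


def sweep(
    taxonomies: Sequence[str],
    transformations: Sequence[str],
    projections: Sequence[str],
    learners: Sequence[str],
    inner_score: Callable[[Config], float],
    outer_score: Callable[[Config], float],
    threshold: float,
) -> List[Dict]:
    """Evaluate every combination; run the outer stage only for qualifiers."""
    results = []
    for config in product(taxonomies, transformations, projections, learners):
        inner = inner_score(config)
        qualifies = inner >= threshold
        # Early termination: configurations below the threshold get no outer estimate.
        outer = outer_score(config) if qualifies else None
        results.append(
            {"config": config, "inner": inner, "outer": outer, "qualifies": qualifies}
        )
    return results


def toy_inner(config: Config) -> float:
    # Made-up scores: pretend one learner does better in inner validation.
    return 0.8 if config[3] == "random_forest" else 0.6


def toy_outer(config: Config) -> float:
    # Made-up outer estimate, slightly below the inner score.
    return toy_inner(config) - 0.05


results = sweep(
    ["genus", "species"], ["clr"], ["none"], ["logistic", "random_forest"],
    toy_inner, toy_outer, threshold=0.7,
)
print(len(results))                         # 4 combinations
print(sum(r["qualifies"] for r in results))  # 2 pass the threshold
```

Skipping the outer stage for non-qualifying configurations is what `enable_early_termination=True` is described as doing; the sketch makes that gate explicit.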

save_results() consolidates all completed results into a summary DataFrame, writes evaluation_results.csv to the experiment directory, and returns:


{
    "total_configs": <int>,       # configurations evaluated
    "qualifying_configs": <int>,  # configurations that passed thresholds
    "qualifying_targets": [str],  # targets with at least one qualifying result
}
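As a usage sketch, the returned summary can be turned into a one-line report. `describe_summary` is a hypothetical helper, and the values in `example_summary` are made up purely to show the shape; only the three keys come from the structure above.

```python
def describe_summary(summary: dict) -> str:
    """Format the three summary fields as a short human-readable line."""
    total = summary["total_configs"]
    qualifying = summary["qualifying_configs"]
    targets = summary["qualifying_targets"]
    # Guard against an empty sweep to avoid dividing by zero.
    rate = qualifying / total if total else 0.0
    target_text = ", ".join(targets) if targets else "none"
    return (
        f"{qualifying}/{total} configurations qualified ({rate:.0%}); "
        f"targets: {target_text}"
    )


# Illustrative values only, not results from the IBD example.
example_summary = {
    "total_configs": 48,
    "qualifying_configs": 12,
    "qualifying_targets": ["diagnosis"],
}
print(describe_summary(example_summary))
# 12/48 configurations qualified (25%); targets: diagnosis
```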

For the full directory layout and a file-by-file description of every output, see Experiment output structure.