Running the experiment

Run the example from an activated virtual environment in which mllabiome is installed. See the Installation guide if needed.

Run the IBD example from the repository root:

python example/IBD/ibd_franzosa.py

The script’s entry point calls run_experiment() directly:

if __name__ == "__main__":
    evaluator, summary = run_experiment()

Each (taxonomy, transformation) pair is preprocessed once and reused across all learners and projections that build on it. Adding more learners or projections to a sweep does not increase preprocessing time. With save_processed_data=True, each cached dataset is also persisted to disk and loaded on a resumed run, so preprocessing is skipped entirely on restart.
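The reuse described above can be pictured as a cache keyed by the (taxonomy, transformation) pair. The following is only a minimal sketch of that idea, not mllabiome's implementation: preprocess, run_sweep, and the taxonomy, transformation, and learner names are all hypothetical stand-ins.

```python
from itertools import product

# Hypothetical stand-ins: none of these names come from mllabiome.
preprocess_calls = []


def preprocess(taxonomy, transform):
    """Pretend to aggregate to a taxonomic level and apply a transformation."""
    preprocess_calls.append((taxonomy, transform))
    return f"data[{taxonomy}/{transform}]"


def run_sweep(taxonomies, transforms, learners):
    """Pair every learner with a dataset that is preprocessed at most once."""
    cache = {}  # (taxonomy, transform) -> preprocessed dataset
    results = []
    for taxonomy, transform, learner in product(taxonomies, transforms, learners):
        key = (taxonomy, transform)
        if key not in cache:
            # Only the first learner to need this pair pays the preprocessing cost.
            cache[key] = preprocess(taxonomy, transform)
        results.append((learner, cache[key]))
    return results


results = run_sweep(["genus", "species"], ["clr", "log"], ["rf", "svm", "lasso"])
print(len(results), "fits,", len(preprocess_calls), "preprocessing calls")
# prints: 12 fits, 4 preprocessing calls
```

Adding a fourth learner to the list would raise the number of fits to 16 while the number of preprocessing calls stays at 4.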

Assembling the configuration

run_experiment() packs all experiment constants into a single ExperimentConfiguration and passes it to run_evaluation():

def run_experiment():
    config = mll.ExperimentConfiguration(
        microbiome_file=MICROBIOME_FILE_PATH,
        metadata_file=METADATA_FILE_PATH,
        experiment_dir=EXPERIMENT_DIR,
        sample_id_column=SAMPLE_ID_COLUMN_NAME,
        taxonomic_configs=TAXONOMIC_RESOLUTIONS_CONFIGS,
        transform_configs=TRANSFORMS_CONFIGS,
        target_configs=[
            mll.TargetConfig(
                column=TARGET_COLUMN_NAME,
                task_type=TASK_TYPE,
            )
        ],
        primary_modality_models=MODEL_CONFIGS,
        multimodal_config=None,
        nested_cv_config=NESTED_CV_CONFIG,
        hyperopt_config=mll.HyperoptConfig(...),
        evaluation_thresholds=EVALUATION_THRESHOLDS,
        execution_config=mll.ExperimentExecutionConfig(...),
    )
    return run_evaluation(config)

For a parameter-by-parameter breakdown of each constant, see Experiment configuration.

Execution settings

ExperimentExecutionConfig controls parallelism, memory management, checkpointing, and disk output. The IBD example uses:

execution_config=mll.ExperimentExecutionConfig(
    n_jobs=1,
    use_threading=False,
    enable_early_termination=True,
    progress_backend="sqlite",
    progress_batch_size=10,
    consolidate_predictions=True,
    model_compression_level=3,
    consolidate_hyperparameters=True,
    deduplicate_nested_cv=False,
    incremental_results=True,
    stream_predictions=True,
    save_model_weights=True,
    save_predictions=True,
    save_processed_data=True,
    store_data_in_results=False,
    gc_between_configs=True,
    keep_original_data=False,
)

| Parameter | Default | Purpose |
| --- | --- | --- |
| n_jobs | 1 | Number of parallel workers. 1 runs sequentially. Values above 1 enable a thread or process pool. |
| use_threading | True | When n_jobs > 1, use ThreadPoolExecutor rather than ProcessPoolExecutor. |
| enable_early_termination | True | Skip outer evaluation for configurations that fail inner validation thresholds. |
| progress_backend | "sqlite" | Persistence backend for run state. "sqlite" writes a checkpoint database that allows an interrupted run to resume. |
| incremental_results | True | Flush each completed result to disk immediately rather than waiting for the full sweep to finish. |
| save_model_weights | True | Serialise fitted models to weights/*.joblib. Required for ensemble search. |
| save_predictions | True | Write per-fold predictions to Parquet. Required for ensemble search. |
| gc_between_configs | True | Run the garbage collector after each configuration. Limits memory growth in long sweeps. |
| model_compression_level | 3 | zlib level for serialised model files. Range 1–9: higher values compress more but are slower to write and read. |
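
To illustrate how a SQLite checkpoint lets an interrupted run resume, here is a minimal sketch using Python's standard sqlite3 module. It is an assumption about the general technique, not mllabiome's actual schema or code: the done table, open_checkpoint, and run_with_resume are hypothetical, and it commits after every configuration rather than in batches.

```python
import sqlite3


def open_checkpoint(path):
    """Open (or create) a checkpoint database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS done (config_id TEXT PRIMARY KEY)")
    return conn


def run_with_resume(conn, config_ids, evaluate):
    """Evaluate only the configurations not yet recorded as done."""
    finished = {row[0] for row in conn.execute("SELECT config_id FROM done")}
    ran = []
    for config_id in config_ids:
        if config_id in finished:
            continue  # completed in an earlier run, so skip it
        evaluate(config_id)
        conn.execute("INSERT INTO done (config_id) VALUES (?)", (config_id,))
        conn.commit()  # per-configuration commit: an interruption loses at most one
        ran.append(config_id)
    return ran


# An in-memory database stands in for the on-disk checkpoint file.
conn = open_checkpoint(":memory:")
first = run_with_resume(conn, ["a", "b"], evaluate=lambda c: None)
second = run_with_resume(conn, ["a", "b", "c"], evaluate=lambda c: None)
print(first, second)
# prints: ['a', 'b'] ['c']
```

The second call models a restart: configurations already recorded in the database are skipped and only the new one is evaluated.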

Running the evaluation

run_evaluation() constructs the evaluator, runs the sweep, and writes the results:

def run_evaluation(config):
    evaluator = mll.Evaluator(config)
    evaluator.run_systematic_evaluation()
    summary = evaluator.save_results()
    return evaluator, summary

run_systematic_evaluation() iterates every combination of taxonomic resolution, transformation, projection, and learner. The inner cross-validation loop identifies configurations that meet the evaluation thresholds; the outer loop estimates their generalisation performance. When incremental_results=True, each result is written to disk as it completes.
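
The control flow described above can be sketched as a Cartesian product over the four axes, with the inner score gating the outer evaluation. This is a simplified illustration, not the library's implementation: sweep, the toy scoring functions, the threshold value, and the axis values are all hypothetical.

```python
from itertools import product


def sweep(taxonomies, transforms, projections, learners,
          inner_score, outer_score, threshold):
    """Visit every combination; run the outer loop only for qualifying ones."""
    results = []
    for combo in product(taxonomies, transforms, projections, learners):
        inner = inner_score(combo)
        # Early termination: a configuration below the threshold skips the outer loop.
        outer = outer_score(combo) if inner >= threshold else None
        results.append({"config": combo, "inner": inner, "outer": outer})
    return results


# Toy scoring functions: only the "rf" learner clears the threshold.
results = sweep(
    taxonomies=["genus", "species"],
    transforms=["clr"],
    projections=["none", "pca"],
    learners=["rf", "svm"],
    inner_score=lambda combo: 0.8 if combo[3] == "rf" else 0.5,
    outer_score=lambda combo: 0.75,
    threshold=0.7,
)
qualifying = [r for r in results if r["outer"] is not None]
print(len(results), "configurations,", len(qualifying), "qualifying")
# prints: 8 configurations, 4 qualifying
```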

save_results() consolidates all completed results into a summary DataFrame, writes evaluation_results.csv to the experiment directory, and returns a dictionary of the form:

{
    "total_configs": <int>,         # configurations evaluated
    "qualifying_configs": <int>,    # configurations that passed thresholds
    "qualifying_targets": [str],    # targets with at least one qualifying result
}
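
As a small usage sketch, the returned summary can be turned into a one-line report. The describe_summary helper is hypothetical, and the numbers and target name in the example dictionary are made up for illustration; only the three keys come from the structure shown above.

```python
def describe_summary(summary):
    """Format the summary dictionary as a single human-readable line."""
    total = summary["total_configs"]
    qualifying = summary["qualifying_configs"]
    rate = qualifying / total if total else 0.0  # guard against an empty sweep
    targets = ", ".join(summary["qualifying_targets"]) or "none"
    return f"{qualifying}/{total} configurations qualified ({rate:.0%}); targets: {targets}"


# Illustrative values only, not results from the IBD example.
example = {
    "total_configs": 48,
    "qualifying_configs": 12,
    "qualifying_targets": ["diagnosis"],
}
report = describe_summary(example)
print(report)
# prints: 12/48 configurations qualified (25%); targets: diagnosis
```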

For the full directory layout and a file-by-file description of every output, see Experiment output structure.
