Experiment output structure
All outputs are written inside EXPERIMENT_DIR. The directory is created automatically if it does not exist. After a completed run the layout looks like this:
results/ibd_franzosa/
├── evaluation_results.csv
└── siso/
    └── target-Study.Group/
        ├── cv_config.json
        ├── experiment_progress.db
        ├── fold_diversity_stats.json
        └── taxonomy-{abbrev}/
            └── transform-{type}/
                ├── transform_metadata.json
                └── project-{type}/
                    ├── config.json
                    ├── processed_data.tsv
                    ├── targets.parquet
                    └── models/
                        └── {ModelName}/
                            ├── mpdr.yaml
                            ├── mpma.yaml
                            ├── learner_config.json
                            ├── inner_fold_performance_summary.parquet
                            ├── outer_test_folds_performance_summary.parquet
                            ├── outer_train_folds_performance_summary.parquet
                            ├── predictions/
                            │   ├── outer_test_predictions.parquet
                            │   ├── outer_train_predictions.parquet
                            │   ├── inner_val_predictions.parquet
                            │   └── inner_train_predictions.parquet
                            └── weights/
                                ├── model_R0_OF0.joblib
                                └── ...

The siso/ directory holds single-input/single-output results. Results are grouped first by target, then by taxonomic abbreviation, transformation, and projection. Each {ModelName}/ directory contains the full record for one evaluated combination.
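As a minimal sketch of how this nesting maps to file system paths, the helper below assembles the directory for one evaluated combination and lists the model directories found under an experiment root. The function names (`model_dir`, `iter_model_dirs`) and the example values (`genus`, `clr`, `pca`, `RandomForest`) are hypothetical and chosen for illustration; only the directory naming pattern is taken from the layout shown here.

```python
from pathlib import Path


def model_dir(experiment_dir, target, taxonomy, transform, projection, model_name):
    """Build the path of one {ModelName}/ directory (hypothetical helper)."""
    return (
        Path(experiment_dir)
        / "siso"
        / f"target-{target}"
        / f"taxonomy-{taxonomy}"
        / f"transform-{transform}"
        / f"project-{projection}"
        / "models"
        / model_name
    )


def iter_model_dirs(experiment_dir):
    """Yield every {ModelName}/ directory below siso/ that exists on disk."""
    pattern = "siso/target-*/taxonomy-*/transform-*/project-*/models/*"
    for path in sorted(Path(experiment_dir).glob(pattern)):
        if path.is_dir():
            yield path


# Example values are placeholders, not names confirmed by the tool.
example = model_dir(
    "results/ibd_franzosa", "Study.Group", "genus", "clr", "pca", "RandomForest"
)
print(example.as_posix())
```

Calling `list(iter_model_dirs("results/ibd_franzosa"))` after a run would then return one path per evaluated combination, assuming the layout matches the tree.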
| File | Contents |
|---|---|
| evaluation_results.csv | All evaluated configurations with test performance scores, standard deviations, per-fold MCC values, and whether each configuration met the evaluation thresholds. Sorted by test performance score descending. |
| cv_config.json | Cross-validation fold indices used throughout the experiment, saved once per target. |
| experiment_progress.db | SQLite database tracking which configurations have completed, used to resume interrupted runs. |
| fold_diversity_stats.json | Label distribution statistics for each fold. |
| config.json | Configuration snapshot for this taxonomic, transformation, and projection combination. |
| processed_data.tsv | Feature matrix after taxonomic filtering and transformation. |
| mpdr.yaml | MPDR configuration (taxonomic resolution, transformation type, projection) for this model. |
| mpma.yaml | Full MPMA configuration including the base learner name and hyperparameters. |
| inner_fold_performance_summary.parquet | Per-fold inner validation scores used for model selection. |
| outer_test_folds_performance_summary.parquet | Per-fold outer test scores representing the generalisation estimate. |
| outer_test_predictions.parquet | Outer test fold predictions. Required for post-experiment ensemble search. |
| inner_val_predictions.parquet | Inner validation predictions. Required for post-experiment ensemble search. |
| model_R{r}_OF{f}.joblib | Serialised model fitted on repeat r, outer fold f. Written when save_model_weights=True. |
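
The naming conventions in the table can be checked programmatically. The following is a rough sketch, not part of the tool itself: `parse_weight_name` and `missing_ensemble_inputs` are hypothetical helpers that recover the repeat and outer-fold indices from a weight file name and report which of the two prediction files needed for ensemble search are absent from a {ModelName}/ directory. It assumes the prediction files sit in the predictions/ subdirectory as in the layout.

```python
import re
from pathlib import Path

# Pattern taken from the table: model_R{r}_OF{f}.joblib
WEIGHT_NAME = re.compile(r"^model_R(\d+)_OF(\d+)\.joblib$")

# Files the table marks as required for post-experiment ensemble search.
ENSEMBLE_INPUTS = (
    "predictions/outer_test_predictions.parquet",
    "predictions/inner_val_predictions.parquet",
)


def parse_weight_name(filename):
    """Return (repeat, outer_fold) for a weight file name, or None if it does not match."""
    match = WEIGHT_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def missing_ensemble_inputs(model_dir):
    """List the required prediction files that are absent from one {ModelName}/ directory."""
    model_dir = Path(model_dir)
    return [rel for rel in ENSEMBLE_INPUTS if not (model_dir / rel).is_file()]


print(parse_weight_name("model_R0_OF0.joblib"))  # (0, 0)
```

An empty list from `missing_ensemble_inputs` indicates that both prediction files are present for that model directory.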