Resuming and extending an experiment

Resuming after an interruption

When progress_backend="sqlite" and incremental_results=True are set (as both are in the IBD example), the framework updates a progress database, experiment_progress.db, inside the target directory after every completed configuration. If the run is interrupted for any reason (an out-of-memory kill, a crash, or Ctrl+C), re-running the same script automatically picks up after the last completed configuration:


python example/IBD/ibd_franzosa.py

On startup the framework reports how many configurations are already completed and how many remain pending. Completed configurations are not re-evaluated.

Configurations marked as failed (because an unhandled exception occurred during evaluation) are also skipped on a plain re-run. To re-run one, see "Re-running a failed configuration" below.
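To see what a progress database contains before re-running, it can be opened with Python's standard sqlite3 module. The sketch below makes no assumption about the framework's schema: it only lists the tables it finds and their row counts. The helper name summarize_progress_db is illustrative and not part of the framework.

```python
import sqlite3
from pathlib import Path


def summarize_progress_db(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    Hypothetical helper, not part of the framework. It does not assume
    any particular schema; it only reports what it finds.
    """
    path = Path(db_path)
    if not path.exists():
        raise FileNotFoundError(path)
    # Open read-only so inspection can never modify the progress state.
    uri = path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in tables:
            # Quote the table name in case it contains unusual characters.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling summarize_progress_db on the experiment_progress.db file in the target directory returns a dictionary that can be printed or compared between runs.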

Adding new configurations

New entries can be appended to any of the four search-space lists at any time. On the next run, only the new entries are evaluated; all previously completed configurations are recognised by their identifier and skipped.

For example, to add a new taxonomic resolution and a new learner to the IBD script:


TAXONOMIC_RESOLUTIONS_CONFIGS = [
    # ... existing entries ...
    mll.TaxonomicProcessingConfig.filter_exact(
        level=mll.TaxonomicLevel.SPECIES,
    ),
]

MODEL_CONFIGS = [
    # ... existing entries ...
    mll.LogisticRegression(C=1.0),
]

Re-running the script evaluates all combinations involving the new entries while leaving existing results untouched.
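One way to picture this is as a Cartesian product over the search-space lists, filtered by the set of identifiers already completed. The sketch below uses plain strings in place of the real config objects; the make_id format, the assumption that the four lists are resolutions, transformations, projections, and learners, and the placeholder values ("clr", "none", and so on) are all illustrative, not the framework's actual scheduling code.

```python
from itertools import product


def make_id(resolution, transformation, projection, learner):
    # Hypothetical identifier format, for illustration only.
    return "|".join((resolution, transformation, projection, learner))


def pending_configurations(resolutions, transformations, projections,
                           learners, completed_ids):
    """Return identifiers in the search space that are not yet completed."""
    all_ids = [
        make_id(*combo)
        for combo in product(resolutions, transformations, projections, learners)
    ]
    return [cid for cid in all_ids if cid not in completed_ids]


# First run: a single combination, which then counts as completed.
completed = set(
    pending_configurations(["genus"], ["clr"], ["none"], ["xgboost"], set())
)

# Second run: one resolution and one learner have been appended.
todo = pending_configurations(
    ["genus", "species"], ["clr"], ["none"], ["xgboost", "logreg"], completed
)
print(len(todo))  # 3: the 4 combinations minus the 1 already completed
```

Note that the new learner is also paired with the old resolution, and the new resolution with the old learner, so appending a single entry to a list can add several configurations.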

Re-running a failed configuration

Each configuration is identified by a string derived from the target name, taxonomic resolution, transformation, projection, and the learner name together with its explicitly set constructor arguments. Changing any of those arguments produces a different identifier, which the framework treats as a new, unevaluated configuration.
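The following sketch illustrates why an explicitly set argument changes the identifier. The learner_id function and its output format are assumptions made for illustration; the framework's real identifier format may differ.

```python
def learner_id(name, **explicit_kwargs):
    """Build an identifier from a learner name and only the arguments the caller set.

    Hypothetical stand-in for the framework's identifier logic.
    """
    # Sort keys so that argument order does not change the identifier.
    parts = [f"{key}={explicit_kwargs[key]!r}" for key in sorted(explicit_kwargs)]
    return f"{name}({', '.join(parts)})"


a = learner_id("LogisticRegression", C=1.0)
b = learner_id("LogisticRegression", C=0.5)
c = learner_id("LogisticRegression")

print(a)  # LogisticRegression(C=1.0)
print(b)  # LogisticRegression(C=0.5)
print(c)  # LogisticRegression()
```

Under this scheme an argument left at its default is not part of the identifier, so setting it explicitly yields a new identifier even when the value equals the default.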

The most direct way to force a re-run of a configuration that previously failed is to add or change a parameter that becomes part of the identifier. Adding random_state to a learner that did not have it set is a common choice:


# Original: has been marked failed, will be skipped
mll.XGBoost(n_estimators=1000),

# New entry: different identifier, will be evaluated
mll.XGBoost(n_estimators=1000, random_state=42),

Both entries remain in the list. The original is skipped; the new one runs as a fresh configuration with its own output directory.

Any other explicitly set argument works the same way. The argument does not need to affect the model meaningfully; its only role here is to distinguish the identifier. That said, random_state has the practical benefit of making the new run reproducible and easier to compare with other seeded runs.

Starting fresh

To discard all progress and re-run the entire experiment from the beginning, delete the progress database:


rm results/ibd_franzosa/siso/target-Study.Group/experiment_progress.db

The database is recreated automatically on the next run.