Post-explain: global

`run_post_explanation` computes feature-importance explanations for a trained model using SHAP, LIME, permutation importance, and ALE (accumulated local effects), aggregates them across all CV folds, and writes a report.

Figure: Global feature importance across XAI methods (IBD example)

The chart above shows the top features ranked by each XAI method for a Random Forest trained on the IBD Franzosa 2019 cohort. Stars mark features with cross-method consensus: an orange star indicates strong consensus across 3 or more methods, and a green star indicates agreement across exactly 2 methods.
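The star rule can be illustrated with a minimal sketch. It assumes that "consensus" means a feature appears in the top list of several methods; the package's actual criterion is not documented here, and the function name and feature names below are hypothetical.

```python
from collections import Counter


def consensus_levels(top_features_by_method):
    """Label each feature by how many methods place it in their top list."""
    counts = Counter()
    for features in top_features_by_method.values():
        counts.update(set(features))  # each method counts at most once per feature

    levels = {}
    for feature, n_methods in counts.items():
        if n_methods >= 3:
            levels[feature] = "strong"     # orange star in the chart
        elif n_methods == 2:
            levels[feature] = "agreement"  # green star in the chart
    return levels


# Hypothetical top lists; the feature names are illustrative only.
example = {
    "shap": ["taxon_a", "taxon_b", "taxon_c"],
    "lime": ["taxon_a", "taxon_b", "taxon_d"],
    "permutation": ["taxon_a", "taxon_e", "taxon_c"],
    "ale": ["taxon_a", "taxon_f", "taxon_g"],
}
levels = consensus_levels(example)
```

Features that only one method ranks highly receive no label, which matches the unstarred bars in the chart.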

XAI dependencies are not installed with the base package. Install them first:


uv pip install -e ".[xai]"


from mllabiome.xai_space.postexplain import run_post_explanation
 
explainer = run_post_explanation(
    "results/ibd_franzosa/siso/target-Study.Group/taxonomy-fel_genus_excl_chlo/transform-none/project-none/models/RandomForestClassifier_min_samples_leaf-5_n_estimators-1000_random_state-91",
    features_file="example/IBD/data/FRANZOSA_IBD_2019_profiles_hierarchical.tsv",
    targets_file="example/IBD/data/metadata.tsv",
    target_column="Study.Group",
    quick_run=False,
    max_samples=None,
)

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `directory` | required | Path to a trained model directory or an ensemble results directory. |
| `features_file` | required | Path to the microbiome profiles TSV. |
| `targets_file` | required | Path to the metadata TSV. |
| `target_column` | required | Column in `targets_file` to use as the prediction target. |
| `methods` | `["shap", "lime", "permutation", "ale"]` | XAI methods to run. Pass a subset to skip specific methods. |
| `top_n` | `20` | Number of top features to include in the summary report and visualisations. |
| `visualize` | `True` | Generate PNG visualisations alongside the reports. |
| `quick_run` | `False` | Process only the first CV fold per model. Useful for a fast smoke test. |
| `max_samples` | `None` | Maximum number of test samples used for SHAP and LIME. `None` uses all test samples. Does not affect permutation importance or ALE. |
| `top_n_prescreen` | `None` | For permutation importance only: pre-screen to this many features using the model’s built-in `feature_importances_` before running the full permutation loop. This reduces runtime on large feature spaces; the model still receives the full feature matrix. |
| `verbose` | `True` | Print progress to stdout. |
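To make the `top_n_prescreen` behaviour concrete, here is a minimal, dependency-free sketch of pre-screened permutation importance. It is an illustration under stated assumptions, not the package's implementation: the function names, the toy model, and the `n_repeats` and `seed` arguments are all hypothetical, and `builtin_importances` stands in for a model's `feature_importances_`.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def prescreened_permutation_importance(
    predict, builtin_importances, X, y, top_n_prescreen, n_repeats=5, seed=0
):
    rng = random.Random(seed)
    # Pre-screen: keep the indices with the largest built-in importance.
    ranked = sorted(
        range(len(builtin_importances)),
        key=lambda j: builtin_importances[j],
        reverse=True,
    )
    candidates = ranked[:top_n_prescreen]

    baseline = accuracy(predict, X, y)
    importances = {}
    for j in candidates:
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # The model still sees every feature; only column j is shuffled.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances[j] = sum(drops) / n_repeats
    return importances


# Toy data: only feature 0 determines the label.
data_rng = random.Random(1)
X = [[float(i % 2), data_rng.random(), data_rng.random(), data_rng.random()]
     for i in range(20)]
y = [int(row[0] > 0.5) for row in X]


def predict(row):
    return int(row[0] > 0.5)


importances = prescreened_permutation_importance(
    predict, [0.7, 0.2, 0.05, 0.05], X, y, top_n_prescreen=2
)
```

Only the two pre-screened features are permuted, so the loop runs over 2 columns instead of 4, while every prediction is still made on full-width rows.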

What is written to disk

All output is written under `{model_dir}/explainability/`.

| File | Description |
| --- | --- |
| `ensemble_feature_importance_report.parquet` | Aggregated feature importance scores across all folds and all methods, one row per feature per method. |
| `ensemble_top_features.txt` | Top-N features per method, as plain text. |
| `feature_interactions.csv` | ALE second-order interaction scores for feature pairs. Only written when ALE runs successfully. |
| `global_feature_importance.png` | Bar chart grid showing the top features per XAI method (example above). |
| `feature_importance_heatmap.png` | Normalised importance heatmap across methods and top features. |
| `consensus_feature_importance.png` | Bar chart highlighting features with cross-method consensus. |
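The shape of the aggregated report and the normalised heatmap can be sketched as follows. This assumes per-fold scores are averaged and that each method is scaled by its own maximum; the statistic and scaling the package actually uses are not stated here, and all names and numbers below are hypothetical.

```python
from collections import defaultdict


def aggregate_folds(fold_scores):
    """Average scores over folds.

    fold_scores is a list with one {method: {feature: score}} dict per fold.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for fold in fold_scores:
        for method, scores in fold.items():
            for feature, score in scores.items():
                sums[method][feature] += score
                counts[method][feature] += 1
    return {m: {f: sums[m][f] / counts[m][f] for f in sums[m]} for m in sums}


def normalise_per_method(mean_scores):
    """Scale each method so its largest score is 1 (assumes non-negative scores)."""
    normalised = {}
    for method, scores in mean_scores.items():
        peak = max(scores.values()) or 1.0  # avoid dividing by zero
        normalised[method] = {f: s / peak for f, s in scores.items()}
    return normalised


# Two hypothetical folds with made-up scores.
folds = [
    {"shap": {"taxon_a": 0.4, "taxon_b": 0.1},
     "permutation": {"taxon_a": 0.02, "taxon_b": 0.01}},
    {"shap": {"taxon_a": 0.6, "taxon_b": 0.3},
     "permutation": {"taxon_a": 0.04, "taxon_b": 0.01}},
]
mean_scores = aggregate_folds(folds)
normalised = normalise_per_method(mean_scores)

# One row per feature per method, mirroring the report layout described above.
rows = [
    (feature, method, mean_scores[method][feature], normalised[method][feature])
    for method in mean_scores
    for feature in mean_scores[method]
]
```

Scaling per method matters because raw scores from different methods sit on different scales, so they cannot share a colour scale without it.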

Per-fold incremental results are cached under `explainability/incremental/` during the run. If a run is interrupted, the next run resumes after the last completed fold instead of starting over.
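The resume behaviour follows a common checkpointing pattern, sketched below. The cache file names, the JSON format, and the `run_with_resume` helper are assumptions for illustration; the package's on-disk cache layout may differ.

```python
import json
import tempfile
from pathlib import Path


def run_with_resume(cache_dir, n_folds, explain_fold):
    """Compute per-fold results, reusing any fold already cached on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    results, computed = [], []
    for fold in range(n_folds):
        cache_file = cache_dir / f"fold_{fold}.json"  # hypothetical file name
        if cache_file.exists():
            results.append(json.loads(cache_file.read_text()))
            continue
        result = explain_fold(fold)
        # Write to a temporary file, then rename, so an interrupted write
        # never leaves a partial cache file behind.
        tmp_file = cache_file.with_suffix(".tmp")
        tmp_file.write_text(json.dumps(result))
        tmp_file.replace(cache_file)
        results.append(result)
        computed.append(fold)
    return results, computed


def flaky(fold):
    """Stand-in explainer that fails on fold 2 to simulate an interruption."""
    if fold == 2:
        raise RuntimeError("simulated interruption")
    return {"fold": fold}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "explainability" / "incremental"
    try:
        run_with_resume(cache, 4, flaky)
    except RuntimeError:
        pass  # folds 0 and 1 are already cached at this point
    results, computed = run_with_resume(cache, 4, lambda fold: {"fold": fold})
```

On the second call only folds 2 and 3 are recomputed; folds 0 and 1 are read back from the cache.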