Federated learning

mllabiome includes a federated learning system that allows model training across multiple sites without centralising raw data. The coordinator runs on a central server. Each participating client keeps its data locally and contributes to the global model through parameter aggregation.

The federated layer is built on top of the Flower framework and supports XGBoost as well as scikit-learn-compatible models.

Architecture

Training workflow

1. Client registration

Each client site registers with the coordinator, providing a summary of its local data (number of samples, features, class distribution, sparsity):

Method	Path	Description
`POST`	`/api/clients/register`	Register a client with data summary
`GET`	`/api/clients`	List registered clients
`POST`	`/api/clients/{client_id}/heartbeat`	Signal that the client is alive

An optional whitelist (`expected_client_ids` in the server configuration) restricts which client IDs are accepted.

2. Local data analysis

Each client loads its profile and metadata files locally in the browser. The frontend computes quality metrics (sparsity, class distribution, feature coverage) from the parsed file content and sends only those summary statistics to the server during registration. Raw data never leaves the client machine.
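As a rough illustration of what such a summary might contain, here is a minimal Python sketch. The actual computation happens in the browser frontend, so this is only a stand-in: the helper name `summarise_dataset`, the output keys, and the definition of sparsity as the fraction of zero-valued cells are all assumptions, not the project's implementation.

```python
from collections import Counter


def summarise_dataset(rows, labels):
    """Compute summary statistics for a local dataset (hypothetical helper).

    rows:   list of equal-length lists of numeric feature values
    labels: list of class labels, one per row
    """
    n_samples = len(rows)
    n_features = len(rows[0]) if rows else 0
    total_cells = n_samples * n_features
    # Assumption: sparsity is the fraction of cells that are exactly zero.
    zero_cells = sum(1 for row in rows for value in row if value == 0)
    return {
        "n_samples": n_samples,
        "n_features": n_features,
        "sparsity": zero_cells / total_cells if total_cells else 0.0,
        "class_distribution": dict(Counter(labels)),
    }


# Only this dictionary would be sent to the coordinator, never `rows`.
summary = summarise_dataset([[0, 3, 0], [1, 0, 0]], ["case", "control"])
```

The point of the sketch is that the returned dictionary holds aggregate values only, so no individual sample can be read back from it.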

3. Feature alignment

Before training starts, the coordinator analyses the features available across all registered clients:

Method	Path	Description
`GET`	`/api/training/analyze`	Returns union and intersection of client features, common metadata columns

The training request specifies a `feature_strategy` (`"intersection"` or `"union"`) to determine the shared feature set.
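The two strategies can be sketched with plain set operations. The function name `align_features` and the shape of its input (a mapping from client ID to feature names) are assumptions made for illustration; only the strategy names come from the text above.

```python
def align_features(client_features, feature_strategy="intersection"):
    """Return the shared feature set as a sorted list (hypothetical helper).

    client_features: dict mapping client ID -> iterable of feature names
    """
    feature_sets = [set(features) for features in client_features.values()]
    if not feature_sets:
        return []
    if feature_strategy == "intersection":
        # Keep only features that every client can provide.
        shared = set.intersection(*feature_sets)
    elif feature_strategy == "union":
        # Keep every feature seen at any client.
        shared = set.union(*feature_sets)
    else:
        raise ValueError(f"unknown feature_strategy: {feature_strategy!r}")
    # Sorting gives every client the same column order.
    return sorted(shared)


clients = {
    "site_a": ["f1", "f2", "f3"],
    "site_b": ["f2", "f3", "f4"],
}
common = align_features(clients, "intersection")
everything = align_features(clients, "union")
```

Returning a sorted list rather than a set is a deliberate choice in this sketch: all clients need to agree on column order as well as on column membership.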

4. Consent

Training sessions follow a consent-based protocol. The coordinator creates a proposal containing the model type, target column, number of rounds, and a `code_preview` showing the exact code that each client will execute:

Method	Path	Description
`POST`	`/api/training/start`	Create a training proposal
`POST`	`/api/training/sessions/{id}/consent`	Client submits consent (accept or reject)
`GET`	`/api/training/sessions`	List all training sessions

Once the proposal has been sent, the session enters the `awaiting_consent` state. Each client reviews the code preview and submits a consent decision. When `auto_start` is `true` in the session configuration, training begins automatically once all expected clients have consented.
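The consent rule can be sketched as a small function that maps the collected decisions to a session state. The name `resolve_consent` and the input shapes are hypothetical. The sketch also assumes that a session with `auto_start` disabled stays in `awaiting_consent` until it is started manually; the source does not spell out that case.

```python
def resolve_consent(expected_client_ids, decisions, auto_start):
    """Derive the session state from consent decisions (hypothetical helper).

    expected_client_ids: iterable of client IDs that must respond
    decisions:           dict mapping client ID -> "accept" or "reject"
    auto_start:          whether training starts once everyone has accepted
    """
    expected = list(expected_client_ids)
    # A single rejection is enough to reject the whole proposal.
    if any(decisions.get(client_id) == "reject" for client_id in expected):
        return "rejected"
    all_accepted = all(decisions.get(client_id) == "accept" for client_id in expected)
    if all_accepted and auto_start:
        return "running"
    # Assumption: without auto_start, the session waits for a manual start.
    return "awaiting_consent"


state = resolve_consent(["a", "b"], {"a": "accept", "b": "accept"}, auto_start=True)
```

Clients that have not responded yet are treated as neither accepting nor rejecting, so the session simply keeps waiting for them.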

5. Training execution

When all consents are collected, the coordinator launches a Flower server. Clients connect and participate in the specified number of federated rounds:

Method	Path	Description
`POST`	`/api/training/sessions/{id}/execute`	Manually start training
`GET`	`/api/analytics/session/{id}`	Session metrics, convergence status

Each round:

The server distributes the current global model parameters.
Each client trains locally on its own data.
Clients send updated parameters back to the server.
The server aggregates updates (FedAvg) into a new global model.
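The aggregation step can be illustrated with a minimal FedAvg sketch in pure Python. It assumes each client update is a flat list of floats paired with the number of local training examples, and it weights each client by its share of the total examples, as in the standard FedAvg formulation. The function name `fed_avg` is hypothetical, and how the project represents XGBoost or scikit-learn parameters is not shown in the source.

```python
def fed_avg(client_updates):
    """Weighted average of client parameters (illustrative sketch).

    client_updates: list of (parameters, num_examples) tuples, where
                    parameters is a list of floats of equal length.
    """
    if not client_updates:
        raise ValueError("no client updates to aggregate")
    total_examples = sum(num_examples for _, num_examples in client_updates)
    if total_examples <= 0:
        raise ValueError("total number of examples must be positive")
    size = len(client_updates[0][0])
    aggregated = [0.0] * size
    for parameters, num_examples in client_updates:
        if len(parameters) != size:
            raise ValueError("all clients must send the same number of parameters")
        # Clients with more local data contribute proportionally more.
        weight = num_examples / total_examples
        for index, value in enumerate(parameters):
            aggregated[index] += weight * value
    return aggregated


# One client with 1 example, one with 3 examples.
global_params = fed_avg([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

In this example the second client holds three quarters of the data, so the result lies three quarters of the way from the first client's parameters to the second's.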

6. Analytics

The analytics engine tracks training progress:

Metric	Description
Feature importance	Aggregated across rounds
Training curves	Per-round loss and accuracy
Client performance	Per-client contribution metrics
Convergence status	Whether the global model has stabilised

Session configuration


```python
{
    "model_name": "xgboost",            # "xgboost" or "random_forest"
    "task_type": "binary",
    "num_rounds": 10,
    "target_column": "disease",
    "feature_strategy": "intersection", # "intersection" or "union"
    "sample_id_column": "Sample",
    "federated_strategy": "model_aggregation",
    "auto_start": True,
}
```

Parameter	Description
`model_name`	Model type to train across clients.
`num_rounds`	Number of federated aggregation rounds.
`target_column`	The target variable column name in each client’s metadata.
`feature_strategy`	`"intersection"` uses only features present at all sites. `"union"` includes all features (missing features are zero-filled).
`auto_start`	When `true`, training starts automatically once every client has consented.
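To make the zero-filling behaviour of `"union"` concrete, the following sketch projects one client sample onto a shared feature list. The helper name `project_sample` and the dictionary representation of a sample are assumptions for illustration only.

```python
def project_sample(sample, shared_features):
    """Order a sample's values by the shared feature list (hypothetical helper).

    sample:          dict mapping feature name -> numeric value
    shared_features: list of feature names agreed on by the coordinator
    """
    # Features the client lacks are filled with 0.0; features outside the
    # shared list are dropped.
    return [sample.get(name, 0.0) for name in shared_features]


# The client has no "f2", so that position is zero-filled.
row = project_sample({"f1": 2.0, "f3": 5.0}, ["f1", "f2", "f3"])
```

The same function covers the `"intersection"` case: any feature a client holds beyond the shared list is simply left out of the result.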

Session states

State	Meaning
`pending`	Session created, not yet proposed to clients
`awaiting_consent`	Proposal sent, waiting for all client consents
`running`	Flower server active, training in progress
`completed`	All rounds finished
`failed`	An error occurred during training
`rejected`	One or more clients declined the proposal

Persistent state

The coordinator persists its state to `federated_state.json` in the backend directory. This file contains the client registry, active sessions, and completed session history. The state is loaded on server startup, so restarting the backend does not lose registered clients or session history.
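A save-and-load cycle of this kind might look like the sketch below. The key names (`clients`, `active_sessions`, `completed_sessions`) and both function names are hypothetical, since the source does not document the file's schema. Writing to a temporary file and then renaming it is a design choice of this sketch, not a confirmed detail of the coordinator.

```python
import json
import os
from pathlib import Path


def empty_state():
    """Return a fresh state with hypothetical top-level keys."""
    return {"clients": {}, "active_sessions": {}, "completed_sessions": []}


def save_state(state, path):
    """Write the state as JSON, replacing the old file in one step."""
    path = Path(path)
    tmp_path = path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    # os.replace avoids leaving a half-written state file after a crash.
    os.replace(tmp_path, path)


def load_state(path):
    """Load the state on startup, or start empty if no file exists yet."""
    path = Path(path)
    if not path.exists():
        return empty_state()
    return json.loads(path.read_text(encoding="utf-8"))
```

Calling `load_state("federated_state.json")` at startup and `save_state(...)` after each registry or session change would give the restart behaviour described above.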