Model Calibration¶

Calibration is the process of adjusting model parameters to match observed data. EpiModel provides a powerful calibration framework that uses optimization algorithms to find the best parameter values that minimize the difference between model predictions and real-world observations.

Overview¶

The calibration process involves:

Defining observed data: Real-world measurements at specific time steps
Selecting parameters to calibrate: Which model parameters to optimize
Choosing a loss function: How to measure fit quality (SSE, RMSE, MAE, etc.)
Selecting an optimization algorithm: Nelder-Mead or Particle Swarm
Running the calibration: Finding optimal parameter values
Creating a new model: Rebuilding your model with the calibrated parameters

Calibrator Returns Parameter Values Only

The Calibrator.run() method returns a CalibrationResult object containing the optimized parameter values, but does not return or modify your model. To use the calibrated parameters, you must create a new model using ModelBuilder with the fitted values.

Basic Example¶

Here's a simple calibration of an SIR model's transmission and recovery rates:

from commol import (
    ModelBuilder,
    Simulation,
    Calibrator,
    CalibrationProblem,
    CalibrationParameter,
    ObservedDataPoint,
    LossConfig,
    LossFunction,
    OptimizationConfig,
    OptimizationAlgorithm,
    ParticleSwarmConfig,
)
from commol.constants import ModelTypes

# Build the model
model = (
    ModelBuilder(name="SIR Model", version="1.0")
    .add_bin(id="S", name="Susceptible")
    .add_bin(id="I", name="Infected")
    .add_bin(id="R", name="Recovered")
    .add_parameter(id="beta", value=0.3)   # Initial guess
    .add_parameter(id="gamma", value=0.1)  # Initial guess
    .add_transition(
        id="infection",
        source=["S"],
        target=["I"],
        rate="beta * S * I / N"
    )
    .add_transition(
        id="recovery",
        source=["I"],
        target=["R"],
        rate="gamma * I"
    )
    .set_initial_conditions(
        population_size=1000,
        bin_fractions=[
            {"bin": "S", "fraction": 0.99},
            {"bin": "I", "fraction": 0.01},
            {"bin": "R", "fraction": 0.0}
        ]
    )
    .build(typology=ModelTypes.DIFFERENCE_EQUATIONS)
)

# Define observed data (e.g., from real outbreak data)
observed_data = [
    ObservedDataPoint(step=0, compartment="I", value=10.0),
    ObservedDataPoint(step=10, compartment="I", value=45.2),
    ObservedDataPoint(step=20, compartment="I", value=78.5),
    ObservedDataPoint(step=30, compartment="I", value=62.3),
    ObservedDataPoint(step=40, compartment="I", value=38.1),
    ObservedDataPoint(step=50, compartment="I", value=18.7),
]

# Define parameters to calibrate with bounds
parameters = [
    CalibrationParameter(
        id="beta",
        min_bound=0.0,
        max_bound=1.0,
        initial_guess=0.3  # Optional starting point
    ),
    CalibrationParameter(
        id="gamma",
        min_bound=0.0,
        max_bound=1.0,
        initial_guess=0.1
    ),
]

# Configure the calibration problem
problem = CalibrationProblem(
    observed_data=observed_data,
    parameters=parameters,
    loss_config=LossConfig(function=LossFunction.SSE),
    optimization_config=OptimizationConfig(
        algorithm=OptimizationAlgorithm.PARTICLE_SWARM,
        config=ParticleSwarmConfig(
            num_particles=30,
            max_iterations=500,
            verbose=True  # Show optimization progress
        ),
    ),
)

# Run calibration
simulation = Simulation(model)
calibrator = Calibrator(simulation, problem)
result = calibrator.run()

# Display results
print(result)
print(f"Calibrated beta: {result.best_parameters['beta']:.4f}")
print(f"Calibrated gamma: {result.best_parameters['gamma']:.4f}")

Loss Functions¶

EpiModel supports multiple loss functions for measuring fit quality:

Sum of Squared Errors (SSE)¶

Default loss function. Minimizes the sum of squared differences:

loss_config = LossConfig(function=LossFunction.SSE)

$$\text{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Root Mean Squared Error (RMSE)¶

Provides error in the same units as the data:

loss_config = LossConfig(function=LossFunction.RMSE)

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Mean Absolute Error (MAE)¶

Less sensitive to outliers:

loss_config = LossConfig(function=LossFunction.MAE)

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$$

Weighted SSE¶

Allows different importance for different observations:

# Give more weight to early observations
observed_data = [
    ObservedDataPoint(step=0, compartment="I", value=10.0, weight=2.0),
    ObservedDataPoint(step=10, compartment="I", value=45.2, weight=1.5),
    ObservedDataPoint(step=20, compartment="I", value=78.5, weight=1.0),
]

loss_config = LossConfig(function=LossFunction.WEIGHTED_SSE)

$$\text{WSSE} = \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2$$

Optimization Algorithms¶

Particle Swarm Optimization (PSO)¶

Population-based algorithm inspired by social behavior. Good for global optimization:

from commol import ParticleSwarmConfig, OptimizationConfig, OptimizationAlgorithm

optimization_config = OptimizationConfig(
    algorithm=OptimizationAlgorithm.PARTICLE_SWARM,
    config=ParticleSwarmConfig(
        num_particles=40,        # Number of particles in swarm
        max_iterations=1000,     # Maximum iterations
        target_cost=0.01,        # Stop if loss below this (optional)
        inertia_factor=0.7,      # Velocity inertia (optional)
        cognitive_factor=1.5,    # Attraction to personal best (optional)
        social_factor=1.5,       # Attraction to global best (optional)
        verbose=True,            # Print progress
        header_interval=50       # Print header every N iterations
    ),
)

When to use:

Multiple local minima expected
Robust global search needed
Multiple parameters to calibrate

Nelder-Mead Simplex¶

Derivative-free simplex algorithm. Fast convergence for smooth problems:

from commol import NelderMeadConfig, OptimizationConfig, OptimizationAlgorithm

optimization_config = OptimizationConfig(
    algorithm=OptimizationAlgorithm.NELDER_MEAD,
    config=NelderMeadConfig(
        max_iterations=1000,     # Maximum iterations
        sd_tolerance=1e-6,       # Convergence tolerance
        alpha=1.0,               # Reflection coefficient (optional)
        gamma=2.0,               # Expansion coefficient (optional)
        rho=0.5,                 # Contraction coefficient (optional)
        sigma=0.5,               # Shrink coefficient (optional)
        verbose=True,            # Print progress
        header_interval=50       # Print header every N iterations
    ),
)

When to use:

Smooth, unimodal objective function
Few parameters (< 10)
Fast convergence desired

Advanced Examples¶

Multi-Compartment Calibration¶

Calibrate against observations from multiple compartments:

# Observed data from both infected and recovered compartments
observed_data = [
    # Infected observations
    ObservedDataPoint(step=10, compartment="I", value=45.2),
    ObservedDataPoint(step=20, compartment="I", value=78.5),
    ObservedDataPoint(step=30, compartment="I", value=62.3),
    # Recovered observations
    ObservedDataPoint(step=10, compartment="R", value=12.5),
    ObservedDataPoint(step=20, compartment="R", value=35.8),
    ObservedDataPoint(step=30, compartment="R", value=68.2),
]

problem = CalibrationProblem(
    observed_data=observed_data,
    parameters=[
        CalibrationParameter(id="beta", min_bound=0.0, max_bound=1.0),
        CalibrationParameter(id="gamma", min_bound=0.0, max_bound=1.0),
    ],
    loss_config=LossConfig(function=LossFunction.RMSE),
    optimization_config=OptimizationConfig(
        algorithm=OptimizationAlgorithm.PARTICLE_SWARM,
        config=ParticleSwarmConfig(max_iterations=500, verbose=False),
    ),
)

Calibrating Stratified Models¶

Calibrate parameters specific to age groups or other stratifications:

# Build age-stratified model
model = (
    ModelBuilder(name="Age-Stratified SIR", version="1.0")
    .add_bin(id="S", name="Susceptible")
    .add_bin(id="I", name="Infected")
    .add_bin(id="R", name="Recovered")
    .add_stratification(id="age", categories=["young", "old"])
    .add_parameter(id="beta", value=0.3)
    .add_parameter(id="gamma_young", value=0.12)
    .add_parameter(id="gamma_old", value=0.08)
    .add_transition(
        id="infection",
        source=["S"],
        target=["I"],
        rate="beta * S * I / N"
    )
    .add_transition(
        id="recovery",
        source=["I"],
        target=["R"],
        stratified_rates=[
            {
                "conditions": [{"stratification": "age", "category": "young"}],
                "rate": "gamma_young * I"
            },
            {
                "conditions": [{"stratification": "age", "category": "old"}],
                "rate": "gamma_old * I"
            },
        ]
    )
    .set_initial_conditions(
        population_size=1000,
        bin_fractions=[
            {"bin": "S", "fraction": 0.99},
            {"bin": "I", "fraction": 0.01},
            {"bin": "R", "fraction": 0.0}
        ],
        stratification_fractions=[
            {
                "stratification": "age",
                "fractions": [
                    {"category": "young", "fraction": 0.6},
                    {"category": "old", "fraction": 0.4}
                ]
            }
        ]
    )
    .build(typology=ModelTypes.DIFFERENCE_EQUATIONS)
)

# Calibrate stratification-specific parameters
parameters = [
    CalibrationParameter(id="beta", min_bound=0.0, max_bound=1.0),
    CalibrationParameter(id="gamma_young", min_bound=0.05, max_bound=0.2),
    CalibrationParameter(id="gamma_old", min_bound=0.05, max_bound=0.2),
]

# Note: Observed data uses stratified compartment names like "I_young", "I_old"
observed_data = [
    ObservedDataPoint(step=10, compartment="I_young", value=30.5),
    ObservedDataPoint(step=10, compartment="I_old", value=15.2),
    ObservedDataPoint(step=20, compartment="I_young", value=52.3),
    ObservedDataPoint(step=20, compartment="I_old", value=26.1),
]

Using Calibration Results¶

Once calibrated, you need to create a new model with the fitted parameters to run predictions:

# Run calibration
result = Calibrator(simulation, problem).run()

# Check if calibration converged
if result.converged:
    print(f"Calibration converged after {result.iterations} iterations")
    print(f"Final loss: {result.final_loss:.6f}")
    print(f"Calibrated parameters: {result.best_parameters}")

    # Create a new model with the calibrated parameters
    # You need to rebuild the model with the new parameter values
    calibrated_model = (
        ModelBuilder(name="Calibrated SIR", version="1.0")
        .add_bin(id="S", name="Susceptible")
        .add_bin(id="I", name="Infected")
        .add_bin(id="R", name="Recovered")
        # Use calibrated parameter values
        .add_parameter(id="beta", value=result.best_parameters["beta"])
        .add_parameter(id="gamma", value=result.best_parameters["gamma"])
        .add_transition(
            id="infection",
            source=["S"],
            target=["I"],
            rate="beta * S * I / N"
        )
        .add_transition(
            id="recovery",
            source=["I"],
            target=["R"],
            rate="gamma * I"
        )
        .set_initial_conditions(
            population_size=1000,
            bin_fractions=[
                {"bin": "S", "fraction": 0.99},
                {"bin": "I", "fraction": 0.01},
                {"bin": "R", "fraction": 0.0}
            ]
        )
        .build(typology=ModelTypes.DIFFERENCE_EQUATIONS)
    )

    # Run predictions with calibrated model
    calibrated_simulation = Simulation(calibrated_model)
    prediction_results = calibrated_simulation.run(num_steps=200)

    print(f"Predicted infections at day 200: {prediction_results['I'][-1]:.0f}")
else:
    print(f"Calibration did not converge: {result.termination_reason}")

Important: The Calibrator returns a CalibrationResult object containing the optimized parameter values, but does not automatically update your model. You must manually create a new model with the calibrated parameters to use them for predictions.

Best Practices¶

1. Choose Appropriate Bounds¶

Set realistic bounds based on biological knowledge:

# Too wide: allows unrealistic values
CalibrationParameter(id="beta", min_bound=0.0, max_bound=100.0)  # Bad

# Reasonable: based on disease characteristics
CalibrationParameter(id="beta", min_bound=0.1, max_bound=0.8)    # Good

2. Provide Good Initial Guesses¶

Initial guesses can speed up convergence:

# Let optimizer choose random
CalibrationParameter(id="beta", min_bound=0.1, max_bound=0.8)

# Provide informed starting point
CalibrationParameter(
    id="beta",
    min_bound=0.1,
    max_bound=0.8,
    initial_guess=0.4  # Based on literature
)

3. Use Appropriate Loss Functions¶

SSE/RMSE: When all observations equally important
MAE: When outliers present in data
Weighted SSE: When some observations more reliable

4. Choose the Right Algorithm¶

Particle Swarm: Complex landscapes, global search, multiple parameters
Nelder-Mead: Smooth problems, fewer parameters, faster convergence

5. Validate Results¶

Always check biological plausibility:

result = calibrator.run()

# Check parameter values make sense
if result.best_parameters['beta'] > 1.0:
    print("Warning: Unusually high transmission rate")

# Check final loss
if result.final_loss > 1000:
    print("Warning: Poor fit quality")

# Verify convergence
if not result.converged:
    print(f"Warning: Did not converge - {result.termination_reason}")

Model Calibration¶

Overview¶

Basic Example¶

Loss Functions¶

Sum of Squared Errors (SSE)¶

Root Mean Squared Error (RMSE)¶

Mean Absolute Error (MAE)¶

Weighted SSE¶

Optimization Algorithms¶

Particle Swarm Optimization (PSO)¶

Nelder-Mead Simplex¶

Advanced Examples¶

Multi-Compartment Calibration¶

Calibrating Stratified Models¶

Using Calibration Results¶

Best Practices¶

1. Choose Appropriate Bounds¶

2. Provide Good Initial Guesses¶

3. Use Appropriate Loss Functions¶

4. Choose the Right Algorithm¶

5. Validate Results¶

Troubleshooting¶

Calibration Not Converging¶

Poor Fit Quality¶

See Also¶