Concepts: inference and surrogates
This page provides a conceptual (scientific) overview of the main building blocks in gp_active_mcmc. It is intended to complement the API reference with the “why” and “how the pieces fit together”.
Notation
Let
- \(\theta \in \mathbb{R}^d\) be the parameter vector,
- \(y \in \mathbb{R}^{n}\) be the model output in observation space (e.g., a time series sampled on a grid),
- \(y_{\text{obs}} \in \mathbb{R}^{n}\) be observed data,
- \(C_{\text{obs}} \in \mathbb{R}^{n \times n}\) be the observation-noise covariance,
- \(f_{\text{HF}}(\theta)\) be the high-fidelity forward model,
- \(f_{\text{LF}}(\theta)\) be the low-fidelity surrogate (learned approximation).
The library targets workflows where \(f_{\text{HF}}\) is accurate but expensive, while \(f_{\text{LF}}\) is cheap but imperfect.
Inference: active learning inside MCMC
The core abstraction: the active model
ActiveMCMCModel is the heart of the inference layer. It couples a low-fidelity surrogate and a high-fidelity forward model and exposes two callables designed to be plugged into tinyDA posteriors:
- `coarse(theta)`: evaluate LF first; optionally trigger HF if LF is deemed unreliable,
- `fine(theta)`: always evaluate HF and update LF using the new HF information.
See: ActiveMCMCModel.
Coarse evaluation and HF triggering
In the default implementation, the coarse evaluation returns either:
- an LF prediction with uncertainty as a CoarseOutput, or
- a raw HF output \(y_{\text{HF}}\) when a trigger condition activates.
A typical trigger is based on the surrogate’s predictive variance:

$$ v(\theta) > \gamma, $$
where \(v(\theta)\) is the marginal predictive variance returned by the surrogate and \(\gamma\) is a user parameter (see gamma_threshold).
This is intentionally cheap: it compresses uncertainty into a single scalar that is easy to monitor during sampling.
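As a concrete illustration, the coarse path can be sketched in a few lines. Here `lf_predict`, `hf_model`, and the tuple return shape are hypothetical stand-ins; only `gamma_threshold` mirrors a name from this page:

```python
# Illustrative sketch of the variance trigger (not the library's exact code).
# `lf_predict` returns an LF mean and per-point marginal variances;
# `hf_model` is the expensive high-fidelity forward model.
def coarse(theta, lf_predict, hf_model, gamma_threshold):
    """Return an LF prediction, or fall back to HF when the LF is too uncertain."""
    mean, variance = lf_predict(theta)    # cheap LF mean + marginal variances
    if max(variance) > gamma_threshold:   # compress uncertainty to one scalar
        return hf_model(theta)            # trigger: pay for an HF evaluation
    return mean, variance                 # cheap path: LF prediction + variance
```

The `max(...)` reduction is one way to obtain the single scalar mentioned above; any cheap summary of the marginal variances would serve the same purpose.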
Fine evaluation and online surrogate updates
A fine evaluation computes \(y_{\text{HF}} = f_{\text{HF}}(\theta)\) and updates the surrogate with the new pair \((\theta, y_{\text{HF}})\). This “learn while sampling” mechanism is the library’s active-learning component.
Choosing the inference mode (posterior selection)
In gp_active_mcmc, the posterior argument is algorithmic: it determines how tinyDA orchestrates coarse vs fine evaluations.
Mode A — single posterior (MCMC-guided active learning)
You pass a single posterior that uses the coarse model:
- `posterior = Posterior(prior, loglike, model.coarse)`
- run: `sample_active_chain`
Interpretation:
- The chain is driven by LF evaluations.
- HF calls happen only when `ActiveMCMCModel.coarse` triggers them internally.
When to use:
- When you want the simplest workflow and do not need a formal delayed-acceptance mechanism.
Mode B — two posteriors (DA-MCMC guided active learning)
You pass a list of two posteriors [coarse, fine]:
- coarse posterior uses `model.coarse`,
- fine posterior uses `model.fine`,
- run: `sample_active_chain`
Interpretation:
- This corresponds to delayed-acceptance MCMC (DA-MCMC): proposals are first screened with the cheap level and then corrected by the expensive level at a frequency controlled by `subsampling_rate`.
When to use:
- When HF must participate in the acceptance/rejection mechanism in a principled delayed-acceptance scheme.
Mode C — adaptive DA-MCMC (recommended)
You use DA-MCMC (two posteriors) and attach an adaptive subchain policy to the active model:
- construct AdaptiveSubchain,
- pass it to `ActiveMCMCModel(..., adaptive=...)`,
- use two posteriors `[coarse, fine]`,
- run: `sample_adaptive_active_chain`
Key constraint:
- DA-MCMC is mandatory for adaptive subchains (two posteriors required).
Why chunking is needed:
`tinyDA` takes a fixed `subsampling_rate` per `sample(...)` call. If the subchain length changes online, we must re-enter `tinyDA` in chunks and update the subsampling rate between calls via ChunkedMCMCConfig.
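A minimal sketch of that chunked loop, assuming a `sample_fn` stand-in for tinyDA's `sample(...)` and a `policy` callback for the rate update (both hypothetical; this is not the ChunkedMCMCConfig API):

```python
# Hypothetical chunked driver: each sample_fn call uses one fixed
# subsampling rate, as tinyDA requires, and the rate is updated between calls.
def run_chunked(sample_fn, policy, n_chunks, chunk_len, rate):
    samples = []
    for _ in range(n_chunks):
        samples.extend(sample_fn(chunk_len, subsampling_rate=rate))
        rate = policy(rate)  # adaptive policy may change the subchain length
    return samples, rate
```

The key design point is that adaptivity lives *between* chunks: inside one chunk, the sampler sees a perfectly ordinary fixed-rate DA-MCMC run.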
Likelihood: accounting for surrogate uncertainty
When LF predictions come with predictive variance, it is often desirable to reflect that uncertainty in the likelihood.
ActiveGPLogLike is a Gaussian log-likelihood that supports variance inflation when the prediction carries a `.variance` attribute (typically CoarseOutput):

$$ \log L(\theta) = -\tfrac{1}{2}\,\big(y_{\text{obs}} - \mu_y(\theta)\big)^\top \big(C_{\text{obs}} + \mathrm{diag}(v_y(\theta))\big)^{-1} \big(y_{\text{obs}} - \mu_y(\theta)\big) - \tfrac{1}{2}\log\det\!\big(2\pi\,\big(C_{\text{obs}} + \mathrm{diag}(v_y(\theta))\big)\big). $$
This yields a likelihood that penalises uncertain surrogate predictions less strongly than confident predictions, reducing the risk of overconfident LF guidance.
See: ActiveGPLogLike.
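The inflation idea can be written down compactly for the diagonal case. This is a sketch assuming a diagonal \(C_{\text{obs}}\); the actual ActiveGPLogLike may accept a dense covariance:

```python
import math

# Diagonal-covariance sketch of variance inflation: the total per-point
# variance is observation noise plus the surrogate's predictive variance.
def gaussian_loglike(y_obs, mean, obs_var, pred_var=None):
    """Gaussian log-likelihood with optional per-point variance inflation."""
    if pred_var is None:
        pred_var = [0.0] * len(y_obs)
    total = [s + v for s, v in zip(obs_var, pred_var)]  # C_obs + diag(v(theta))
    return -0.5 * sum(
        math.log(2.0 * math.pi * t) + (y - m) ** 2 / t
        for y, m, t in zip(y_obs, mean, total)
    )
```

With a large residual, the inflated version returns a higher log-likelihood than the uninflated one, which is exactly the "penalise uncertain predictions less strongly" behaviour described above.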
Adaptive subchains: controlling HF correction frequency
The adaptive subchain policy monitors LF–HF discrepancy during fine calls and updates a “subchain length” (coarse steps between fine corrections).
In the default implementation, the discrepancy is an RMSE in observation space:

$$ \mathrm{err} = \sqrt{\tfrac{1}{n}\,\big\| f_{\text{LF}}(\theta) - f_{\text{HF}}(\theta) \big\|_2^2}\,, $$
and the policy adjusts the subchain length every update_every HF evaluations:
- if `err > target_error`: shorten subchain (more frequent HF),
- else: lengthen subchain (less frequent HF).
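A sketch of the discrepancy measure and the update rule, where `target_error` mirrors the parameter above and `factor`, `min_len`, `max_len` are hypothetical tuning knobs:

```python
import math

# RMSE in observation space between LF and HF outputs.
def rmse(y_lf, y_hf):
    n = len(y_hf)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_lf, y_hf)) / n)

# Multiplicative shorten/lengthen rule for the subchain length.
def update_subchain_length(length, err, target_error,
                           factor=2, min_len=1, max_len=64):
    if err > target_error:
        return max(min_len, length // factor)  # more frequent HF correction
    return min(max_len, length * factor)       # less frequent HF correction
```

A multiplicative rule like this reacts quickly when the surrogate degrades while keeping the HF budget bounded by `min_len`/`max_len`.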
See: AdaptiveSubchain.
Results and diagnostics (what you can measure)
Sampling entrypoints return a SamplingResult containing:
- an MCMCChain with the sample matrix,
- optional aligned extras ChainExtras (HF usage flags, acceptance flags, subchain history).
These extras enable diagnostics such as:
- HF call fraction (cost proxy),
- subchain length history (adaptive behaviour),
- chain visualisations (trace/pair plots).
See: gp_active_mcmc.diagnostics.
Surrogates: POD, GP, and POD–GP
The surrogate layer provides a standard reduced-order modelling pipeline:
- compress snapshots with POD,
- learn the map \(\theta \mapsto\) POD coefficients with a GP,
- reconstruct predictions in observation space.
The main user-facing surrogate is PODGPSurrogate.
POD: reduced-order representation
Given snapshot trajectories/fields assembled as a matrix

$$ Y \in \mathbb{R}^{m \times n}, $$

where rows correspond to parameter samples and columns correspond to observation components, POD computes an orthonormal basis \(\Phi \in \mathbb{R}^{n \times r}\) (with \(r \ll n\)) such that

$$ y(\theta) \approx \bar{y} + \Phi\, a(\theta), $$
where \(\bar{y}\) is the mean snapshot and \(a(\theta) \in \mathbb{R}^r\) are POD coefficients.
In this library:
- `POD.fit(Y)` estimates \(\bar{y}\) and \(\Phi\),
- `POD.transform(Y)` returns coefficients \(A\),
- `POD.inverse_transform(A)` reconstructs in observation space.
See: POD.
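A conceptual SVD-based sketch of these three operations (the library's POD internals may differ):

```python
import numpy as np

# Conceptual POD via SVD of the centred snapshot matrix.
def pod_fit(Y, r):
    y_bar = Y.mean(axis=0)                              # mean snapshot
    _, _, Vt = np.linalg.svd(Y - y_bar, full_matrices=False)
    return y_bar, Vt[:r].T                              # basis Phi, shape (n, r)

def pod_transform(Y, y_bar, Phi):
    return (Y - y_bar) @ Phi                            # coefficients A, shape (m, r)

def pod_inverse_transform(A, y_bar, Phi):
    return y_bar + A @ Phi.T                            # back to observation space
```

With `r` equal to the rank of the centred snapshots the round trip is exact; truncating `r` introduces the bias discussed in the practical guidance below.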
GP regression for POD coefficients
After POD, the learning problem becomes the regression

$$ \theta \;\mapsto\; a(\theta) \in \mathbb{R}^r. $$
The library implements:
- SingleOutputGP for scalar outputs,
- MultiOutputGP as independent GPs per output dimension (one GP per coefficient).
This design trades modelling simplicity for robustness:
- each coefficient is treated independently,
- predictive means and marginal variances are returned per coefficient.
POD–GP coupling and uncertainty propagation
PODGPSurrogate combines POD and MultiOutputGP:
- GP predicts coefficient mean and variance: $$ \mu_a(\theta),\; v_a(\theta) \in \mathbb{R}^r, $$
- mean reconstruction: $$ \mu_y(\theta) = \bar{y} + \Phi\, \mu_a(\theta), $$
- variance propagation (diagonal / marginal approximation): $$ v_y(\theta) \approx \sum_{j=1}^{r} \Phi_{\cdot j}^2\, v_{a,j}(\theta), $$ i.e., coefficient uncertainties are mapped to pointwise output variance assuming coefficients are independent.
This is exactly what the active inference layer needs:
- a predictive mean in observation space,
- an aligned marginal predictive variance for uncertainty-aware likelihoods.
See: PODGPSurrogate.
Practical guidance: choosing POD rank and budgets
- POD rank \(r\) controls the bias–variance trade-off:
- too small: high truncation error (biased surrogate),
- too large: coefficients become harder for the GP to learn reliably.
- A common workflow is:
- inspect POD energy curves (retained variance),
- validate on a held-out set (RMSE + coverage),
- use active inference to expand training data in regions visited by the posterior.
See diagnostics helpers: gp_active_mcmc.diagnostics.
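For the energy-curve step, a hypothetical helper (name and signature illustrative) that picks the smallest rank retaining a target fraction of snapshot energy:

```python
import numpy as np

# Retained energy after r modes is cumsum(s**2) / sum(s**2) for the
# singular values s of the centred snapshot matrix.
def pod_rank_for_energy(singular_values, energy=0.99):
    s2 = np.asarray(singular_values, dtype=float) ** 2
    retained = np.cumsum(s2) / s2.sum()                # energy curve
    return int(np.searchsorted(retained, energy) + 1)  # smallest sufficient r
```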