Model Dashboard
Phase 40 • Training
Model Story
A story-driven view of each run: trust (performance), drivers (what mattered), adaptation (how it shifts under stress), and blind spots (when it fails).
Select a target and a run to load the model story.
Reliability (Calibration)
Verification: does a predicted "60% probability" actually correspond to a 60% win rate? Points should hug the diagonal line.
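The check behind this chart can be sketched as follows, assuming predicted win probabilities and 0/1 outcomes (the names `probs`, `outcomes`, and the function itself are illustrative, not the dashboard's actual implementation):

```python
# Hedged sketch of a reliability (calibration) computation: bucket
# predictions into probability bins, then compare each bin's mean
# predicted probability with its observed win rate. A well-calibrated
# model yields points that hug the diagonal y = x.

def calibration_points(probs, outcomes, n_bins=10):
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[i].append((p, y))
    points = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)   # x: mean predicted probability
            win_rate = sum(y for _, y in b) / len(b)  # y: observed win rate
            points.append((mean_p, win_rate))
    return points  # (predicted, observed) pairs to plot against the diagonal
```

Each returned pair is one point on the reliability curve; empty bins are skipped rather than plotted at zero.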
Reality Check
Select a run to compute backtest performance…
Prediction Confidence
Select a run to compute prediction uncertainty…
Evolution of Trust
Track how allocation rotates across base models over time (stacked to 100%), with the target-specific benchmark overlaid to reveal which base model is driving decisions in each regime. Applies to the StackedEnsemble model.
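The stacked-to-100% view can be sketched like this, assuming a time series of raw base-model weights (the dict structure and names are assumptions for illustration):

```python
# Minimal sketch: rescale each timestep's raw base-model weights to
# percentages summing to 100, so the chart stacks to a full band and
# rotation between models is visible over time.

def stack_to_100(weight_history):
    stacked = []
    for weights in weight_history:  # one dict of model -> raw weight per timestep
        total = sum(weights.values()) or 1.0  # guard against an all-zero timestep
        stacked.append({m: 100.0 * w / total for m, w in weights.items()})
    return stacked
```

The benchmark series would then be overlaid on the resulting stacked-area chart rather than normalized with the weights.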
Signal Forensics
Compare the stacked ensemble (blue) against the actual target (black) and the underlying base models (dotted).
Model Stability (Rolling Log Loss)
Consistency check: Spikes indicate regime failures where the model stopped understanding the market.
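The rolling metric behind this panel can be sketched as below, assuming per-period predicted probabilities and binary outcomes (names and window size are illustrative):

```python
# Hedged sketch of rolling log loss: compute the per-sample binary
# log loss, then average it over a trailing window. Sustained spikes
# mark regimes where predictions broke down.
import math

def rolling_log_loss(probs, ys, window=20, eps=1e-15):
    losses = []
    for p, y in zip(probs, ys):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    # trailing mean over each full window
    return [
        sum(losses[i - window:i]) / window
        for i in range(window, len(losses) + 1)
    ]
```

A flat series near the loss of an uninformed 50/50 guess (ln 2 ≈ 0.693) signals no edge; spikes well above it signal regime failure.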
Blind Spots
Select a run to analyze error patterns (e.g., error vs volatility) and identify failure modes…
Comparison
Compare candidate models against the Conservative (Voting) ensemble. Blue marks the Champion; green marks the Voting ensemble.
The Brain
The final feature set used by this run, grouped by type (Volatility / Momentum / Macro / Other).
Numbers represent feature importance scores (relative contribution to prediction).
Drivers
Ranked importance across the base model layer (normalized). Shows what drove predictions most.
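Normalizing and ranking importances across the base-model layer can be sketched as follows, assuming a dict of raw per-model scores (the structure and names are assumptions; the dashboard's actual aggregation may differ):

```python
# Illustrative sketch: normalize each base model's importances to sum
# to 1 (so models on different scales are comparable), aggregate per
# feature, rescale the totals to sum to 1, and rank descending.

def ranked_importance(per_model_scores):
    totals = {}
    for scores in per_model_scores.values():
        s = sum(scores.values()) or 1.0
        for feat, v in scores.items():
            totals[feat] = totals.get(feat, 0.0) + v / s  # per-model normalization
    grand = sum(totals.values()) or 1.0
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(feat, v / grand) for feat, v in ranked]  # sums to 1, descending
```

The per-model normalization step is the design choice that matters here: without it, a base model with large raw scores would dominate the ranking.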