Model performance

Every forecast is frozen the moment a result is recorded, then scored out-of-sample. Brier score, log loss and RPS are all lower-is-better. The market rows benchmark against sportsbook odds (POST them per fixture to /api/odds). Sharp closing lines like Pinnacle's are the toughest public baseline: matching them is the realistic target, and the published blend is designed to be at least as sharp as either input alone.
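Before sportsbook odds can be scored against the model, they have to be turned into probabilities. The sketch below shows one common way to do that, proportional normalisation of decimal odds to remove the bookmaker margin. It is an illustration only: the function name and the example odds are made up, and the page does not say which de-margining method (or which /api/odds payload format) the site actually uses.

```python
def implied_probs(decimal_odds):
    """Convert decimal odds (home, draw, away) to probabilities summing to 1.

    Assumption: the margin is removed by simple proportional normalisation.
    """
    raw = [1.0 / o for o in decimal_odds]   # inverse odds include the margin
    overround = sum(raw)                    # > 1.0 for a real book
    return [r / overround for r in raw]

# Hypothetical 1X2 prices, not taken from any real fixture
probs = implied_probs([2.10, 3.40, 3.80])
print([round(p, 3) for p in probs])
```

Other methods (for example Shin's or a power method) treat favourites and longshots differently, so the choice can shift the market baseline slightly.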

All completed matches

3 matches with stored forecasts

Source	N	Brier	Log loss	RPS
Model (Elo + Poisson)	3	0.5675	0.9389	0.1716
Blend (published)	3	0.5675	0.9389	0.1716
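For readers unfamiliar with the three columns, here is a minimal sketch of how each score can be computed for a single home/draw/away forecast. The function names are illustrative, and the conventions are assumptions: Brier is summed over the three outcomes (range 0 to 2) and RPS is averaged over the two cumulative thresholds. The page does not state which normalisation it uses.

```python
import math

OUTCOMES = ("home", "draw", "away")  # order matters for RPS

def brier(probs, actual):
    """Multi-class Brier score: squared error summed over all outcomes."""
    return sum((p - (1.0 if o == actual else 0.0)) ** 2
               for o, p in zip(OUTCOMES, probs))

def log_loss(probs, actual):
    """Negative log of the probability given to the outcome that happened."""
    return -math.log(probs[OUTCOMES.index(actual)])

def rps(probs, actual):
    """Ranked probability score: squared error of cumulative probabilities."""
    cum_p = cum_o = total = 0.0
    for o, p in zip(OUTCOMES[:-1], probs[:-1]):
        cum_p += p
        cum_o += 1.0 if o == actual else 0.0
        total += (cum_p - cum_o) ** 2
    return total / (len(OUTCOMES) - 1)

# Made-up forecast for illustration: 50% home, 30% draw, 20% away; home wins
forecast = (0.5, 0.3, 0.2)
print(brier(forecast, "home"), log_loss(forecast, "home"), rps(forecast, "home"))
```

The table values would then be the mean of each score over all completed matches. RPS rewards forecasts that are "close" in the ordered sense, so missing a home win by predicting a draw costs less than predicting an away win.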

Matches with market odds

Pinnacle lines are stored for all upcoming fixtures; this table fills in once the first of them finishes.

No data yet.

Predicted vs actual scores

Bar = goals actually scored · tick = pre-match expected goals (frozen at kickoff). Deviations feed the auto-calibration below.

Matches

Outcome calls	Exact scores	Goals act / pred	MAE (goals)	Bias
1/3	2/3	7 / 8.4	0.49	-0.23

Group A · 🇲🇽 MEX 2–0 🇿🇦 RSA · predicted 2.3–0.8 · likely 2–0 · outcome ✓ · exact ✓
Goals vs expected: MEX -0.3 · RSA -0.8

Group A · 🇰🇷 KOR 2–1 🇨🇿 CZE · predicted 1.3–1.4 · likely 1–1 · outcome ✗ (away)
Goals vs expected: KOR +0.7 · CZE -0.4

Group B · 🇨🇦 CAN 1–1 🇧🇦 BIH · predicted 1.7–1.0 · likely 1–1 · outcome ✗ (home) · exact ✓
Goals vs expected: CAN -0.7 · BIH +0.0
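The MAE and bias figures can be approximately re-derived from the per-team numbers shown on the match cards. The sketch below assumes both are averaged over the six team-level deviations (actual goals minus pre-match expected goals). That averaging is an inference from the displayed values, not something the page states.

```python
# (team, actual goals, pre-match expected goals) as displayed on the cards
rows = [
    ("MEX", 2, 2.3), ("RSA", 0, 0.8),
    ("KOR", 2, 1.3), ("CZE", 1, 1.4),
    ("CAN", 1, 1.7), ("BIH", 1, 1.0),
]

deviations = [actual - expected for _, actual, expected in rows]

mae = sum(abs(d) for d in deviations) / len(deviations)   # average miss size
bias = sum(deviations) / len(deviations)                  # signed average miss

print(f"MAE {mae:.2f}, bias {bias:+.2f}")
```

Because the cards show expected goals rounded to one decimal, this gives roughly 0.48 and -0.25 rather than the displayed 0.49 and -0.23, which are presumably computed from unrounded values. A negative bias means teams have scored fewer goals than forecast so far.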

Auto-calibration — expected goals are currently scaled ×0.960, re-fit from these deviations after every recorded match (shrunk toward 1.000 by a 10-match prior, capped at ±25%), and applied to every forecast and tournament simulation.
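One plausible reading of that description is sketched below: take the raw ratio of actual to expected goals, pull it toward 1.000 as if ten extra matches at exactly 1.000 had been observed, then clamp to the ±25% band. The function name, parameter names and the exact shrinkage formula are assumptions for illustration; the site's real fitting procedure is not given.

```python
def calibration_scale(actual_goals, expected_goals, n_matches,
                      prior_matches=10, cap=0.25):
    """Shrunk, capped multiplier for expected goals (illustrative only)."""
    raw = actual_goals / expected_goals
    # Weighted average of the observed ratio and a prior of 1.0
    shrunk = (n_matches * raw + prior_matches * 1.0) / (n_matches + prior_matches)
    # Clamp to [1 - cap, 1 + cap]
    return min(1.0 + cap, max(1.0 - cap, shrunk))

# Totals from this page: 7 goals scored vs 8.4 expected over 3 matches
scale = calibration_scale(7, 8.4, 3)
print(round(scale, 3))
```

With the rounded totals shown on this page the sketch lands close to, but not exactly on, the displayed ×0.960, so the actual formula or its inputs likely differ in detail. The prior keeps a handful of early results from swinging every forecast.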

Match-by-match

Probability each source gave to the outcome that actually happened (higher = better call)

Match	Score	Model	Market	Blend
🇨🇦 CAN v 🇧🇦 BIH	1–1	25.3%	—	25.3%
🇰🇷 KOR v 🇨🇿 CZE	2–1	33.5%	—	33.5%
🇲🇽 MEX v 🇿🇦 RSA	2–0	70.4%	—	70.4%
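These per-match probabilities tie directly to the log loss reported in the summary table: log loss is the mean of the negative natural log of the probability given to the actual outcome. A short check using the rounded percentages above (the dictionary keys are just labels):

```python
import math

# Probability the model gave to the outcome that actually happened
p_actual = {
    "CAN v BIH": 0.253,
    "KOR v CZE": 0.335,
    "MEX v RSA": 0.704,
}

mean_log_loss = -sum(math.log(p) for p in p_actual.values()) / len(p_actual)
print(round(mean_log_loss, 4))
```

The result is about 0.94, in line with the 0.9389 shown for the model; the small gap comes from the percentages being rounded to one decimal here.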