Draws happen in roughly 1 in 4 professional soccer matches. If your prediction model doesn’t take them seriously, it’s missing a quarter of reality.
Most don’t take them seriously. This post explains why and introduces the rating system we built to fix it.
The Draw Problem in Soccer Prediction
Soccer is the world’s most popular sport and one of the hardest to predict. Low scores mean small margins determine outcomes. High variance means even strong teams lose unexpectedly. And unlike most North American sports, every game has three possible outcomes: home win, draw, or away win.
That third outcome, the draw, is where most prediction systems fail.
In our research dataset of 34,042 matches across multiple professional leagues, draws occurred at an actual rate of 25.4%, nearly as often as away wins (29.4%). Yet analysis of bookmaker-implied probabilities shows draws were predicted as the top outcome only 1.3% of the time.
This isn’t a measurement error. It’s a structural feature of how prediction markets work.
Why bookmakers under-predict draws
Bookmakers are not in the business of maximizing forecast accuracy. They’re in the business of managing risk margins across high volumes of bets. Draws are harder to handicap, create unbalanced action, and attract more variable bettor behavior, so bookmakers price them conservatively.
The result: if you take bookmaker-implied probabilities as your prediction baseline for draws, you’re starting with a signal that systematically underweights them by roughly a factor of 20.
For casual bettors, this gets absorbed into overall variance. For anyone building a serious prediction model, it’s a design problem.
What Is the Roiz Walss Index (RWI)?
The Roiz Walss Index (RWI) is a dynamic, multi-league soccer rating system designed to generate pre-match strength signals for use in outcome prediction.
It’s the foundation of every soccer matchup card on PredictApp. The expected goal differential you see on a card (the “+1.8” or “−0.6”) is what RWI outputs before kickoff.
Here’s how it works.
Four team states, not one
Most soccer rating systems assign a single number to each team. RWI tracks four separate states per team:
- AH — Home attacking strength
- DH — Home defensive weakness
- AA — Away attacking strength
- DA — Away defensive weakness
This matters because teams perform differently at home and away, not just in magnitude but in quality. A team that dominates possession at home might struggle away. A team that defends well at home might concede more on the road. Collapsing these into a single rating loses that signal.
Expected goals in log space
Before each match, RWI generates expected goal totals using a log-space equation:
log(μ_home) = league_baseline + home_advantage + AH_home − DA_away + ρ(lineup_home − lineup_away)
log(μ_away) = league_baseline + AA_away − DH_home + ρ(lineup_away − lineup_home)
Modeling in log space guarantees predicted goals are always positive, and it means the components (league baseline, home advantage, team strength, and lineup quality) add in log space, so after exponentiation they interact multiplicatively rather than additively.
RWI is then defined as the expected goal difference:
RWI = μ_home − μ_away
A positive RWI means the model favors the home team by that expected margin. A negative RWI favors the away team.
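As a minimal sketch, the two log-space equations and the RWI definition can be written directly in code. The rating values passed in below are hypothetical illustrative numbers; the real system fits them per team and per league:

```python
import math

def expected_goals(league_baseline, home_advantage,
                   ah_home, da_away, aa_away, dh_home,
                   lineup_home, lineup_away, rho):
    """Sketch of the RWI log-space goal model.

    Returns (mu_home, mu_away, RWI) where RWI = mu_home - mu_away.
    """
    log_mu_home = (league_baseline + home_advantage
                   + ah_home - da_away
                   + rho * (lineup_home - lineup_away))
    log_mu_away = (league_baseline
                   + aa_away - dh_home
                   + rho * (lineup_away - lineup_home))
    # Exponentiation guarantees both expected goal totals are positive.
    mu_home = math.exp(log_mu_home)
    mu_away = math.exp(log_mu_away)
    return mu_home, mu_away, mu_home - mu_away

# Illustrative call: a home side with a stronger attack rating and a
# slightly better lineup yields a positive RWI.
mu_h, mu_a, rwi = expected_goals(0.3, 0.2, 0.4, 0.1, 0.1, 0.3, 0.5, 0.4, 0.5)
```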
League-specific priors
Different leagues score at different rates. A 2-1 result means something different in the Bundesliga than in Ligue 1. RWI computes league-specific priors (average goals per team, goal-difference standard deviation, log-goal baseline, and home-advantage coefficient) from cumulative historical data, shifted to prevent data leakage.
This makes RWI directly comparable across leagues. The standardized version (RWI_z) divides by the league’s prior standard deviation of goal difference, so a +1.5 in a high-scoring league is calibrated against a +1.5 in a low-scoring one.
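The standardization step is a single division. The standard deviations below are hypothetical illustrations, not the published league priors:

```python
def standardize_rwi(rwi, league_gd_std):
    """RWI_z: divide raw RWI by the league's prior goal-difference
    standard deviation so signals are comparable across leagues."""
    return rwi / league_gd_std

# The same raw +1.5 edge is a stronger standardized signal in a
# tighter (lower-variance) league than in a high-variance one.
tight_league = standardize_rwi(1.5, 1.2)
loose_league = standardize_rwi(1.5, 1.9)
```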
Lineup adjustment
Before each match, RWI adjusts expected goals based on the announced lineups.
For each player, the model computes a stabilized value:
V_player = τ × P_player + (1 − τ) × P_team_position_mean
Where τ (trust) increases with accumulated minutes played, using a shrinkage constant of 270 minutes (roughly 3 full matches). A player with limited history gets pulled toward their team’s positional average. A veteran gets closer to their personal rating.
Each team’s lineup score is then the weighted average of player values, with starters receiving full weight (minutes/90) and substitutes receiving half weight (0.5 × minutes/90).
The difference in lineup scores becomes a multiplicative correction in the goal model. Rotation squads get lower expected output. Strong lineups get a boost.
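Putting the shrinkage and the weighting together gives a short sketch. The functional form τ = minutes / (minutes + 270) is my assumption for how trust grows with minutes; the source only states that τ increases with minutes using a 270-minute shrinkage constant:

```python
def player_value(personal, positional_mean, minutes, k=270.0):
    """Stabilized player value: tau * personal + (1 - tau) * positional mean.
    tau = minutes / (minutes + k) is an assumed shrinkage form using the
    stated 270-minute constant (~3 full matches)."""
    tau = minutes / (minutes + k)
    return tau * personal + (1.0 - tau) * positional_mean

def lineup_score(players):
    """Weighted average of player values.
    players: iterable of (value, minutes, is_starter); starters weigh
    minutes/90, substitutes 0.5 * minutes/90."""
    weighted = total_w = 0.0
    for value, minutes, is_starter in players:
        w = (minutes / 90.0) * (1.0 if is_starter else 0.5)
        weighted += w * value
        total_w += w
    return weighted / total_w if total_w else 0.0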
Validation Results: 34,042 Matches
We evaluated RWI on a dataset covering 34,042 matches across multiple leagues, tracking both continuous signal quality and probabilistic outcome accuracy.
Continuous signal performance
| Metric | Value |
| --- | --- |
| Goal-difference MAE | 1.299 |
| Goal-difference RMSE | 1.679 |
| Pearson correlation (vs. actual GD) | 0.330 |
| Spearman correlation | 0.323 |
| Bias | 0.102 |
After the model’s burn-in period (allowing team ratings to stabilize from initial states), correlations improved to 0.350 Pearson and 0.339 Spearman.
RWI as a feature, not the final probability
It is important to be precise about what RWI is and isn’t.
RWI is a strength signal: the expected goal differential based on team ratings, lineup, home advantage, and league baseline. It is designed to be a feature fed into downstream probability models, not to produce win/draw/loss probabilities on its own. The distinction matters for understanding the full system.
How RWI Updates After Each Match
Ratings update sequentially after every match using a bounded residual transform:
ψ(e) = sign(e) × log(1 + |e|)
This means large surprises (a 5-0 result the model didn’t anticipate) cause a meaningful but bounded rating update. It doesn’t create runaway swings. It mirrors the diminishing-returns philosophy of Pi-ratings (Constantinou & Fenton, 2013), extended with a cross-venue catch-up mechanism and mild shrinkage toward zero.
The update target itself is a blend of observed goals and a walk-forward xG* proxy built from gradient-boosted Poisson regression on match statistics. At a 0.9 blend weight toward actual goals, the model stays grounded in what happened while smoothing extreme flukes.
Player ratings update from the residual in goal difference, not from individual statistics. If the home side outperforms the model’s expectations, home players get a positive update; away players get the symmetric negative. Trust grows with minutes played.
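A sketch of one update step under these rules. The learning rate and shrinkage constants here are hypothetical placeholders, not the paper's values:

```python
import math

def psi(e):
    """Bounded residual transform: sign(e) * log(1 + |e|).
    A 5-goal surprise moves ratings meaningfully but sublinearly."""
    return math.copysign(math.log(1.0 + abs(e)), e)

def update_target(actual_gd, xg_proxy_gd, w=0.9):
    """Blend of observed goal difference and the walk-forward xG* proxy,
    weighted 0.9 toward what actually happened."""
    return w * actual_gd + (1.0 - w) * xg_proxy_gd

def rating_step(rating, predicted_gd, actual_gd, xg_proxy_gd,
                lr=0.1, shrink=0.02):
    """One team-side update: a bounded move on the blended residual plus
    mild shrinkage toward zero (lr and shrink are assumed constants)."""
    residual = update_target(actual_gd, xg_proxy_gd) - predicted_gd
    return (rating + lr * psi(residual)) * (1.0 - shrink)
```

Under this sketch, home-side players would receive the positive residual update and away-side players the symmetric negative one, each scaled by their trust.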
From Strength Signal to Match Probabilities
RWI generates the expected goal differential. What turns that into win/draw/loss probabilities is a separate layer: two machine learning models trained on RWI outputs and match context.
The first is an Ordered Logit model, which treats match outcomes as an ordered classification (away win → draw → home win) and estimates probabilities across all three classes. The second is a CatBoost gradient-boosted tree, which captures non-linear interactions between RWI signals, league context, and historical patterns.
We blend their outputs and evaluate the result against the bookmaker baseline across eight full seasons.
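The blend step itself is simple: a convex combination of the two models' probability triples. The 50/50 weight below is illustrative; the actual weight would be tuned on held-out seasons:

```python
def blend(p_ordered_logit, p_catboost, w=0.5):
    """Convex combination of two (away, draw, home) probability triples,
    renormalized to sum to 1. w is an assumed illustrative weight."""
    mixed = [w * a + (1.0 - w) * b
             for a, b in zip(p_ordered_logit, p_catboost)]
    total = sum(mixed)
    return [p / total for p in mixed]

# Illustrative inputs (away, draw, home) from the two models:
out = blend([0.3, 0.3, 0.4], [0.25, 0.25, 0.5])
```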
| Metric | PredictApp | Bookmaker baseline |
| --- | --- | --- |
| Log loss | 0.9797 | 0.9779 |
| Accuracy | 52.45% | 52.53% |
| Macro F1 | 0.3959 | 0.3900 |
| Ranked Probability Score | 0.1975 | 0.1974 |
| Draw recall | 1.97% | ~0% (structural) |
On log loss, accuracy, and RPS, our model is within statistical noise of bookmaker-implied odds across eight seasons. That’s the baseline. Matching it means the model is calibrated against the sharpest signal in the market.
The difference is in macro F1 and draw recall.
Macro F1 weights all three outcome classes equally. Bookmakers have no incentive to be right on draws specifically; they have every incentive to be right on aggregate handle. Our model does slightly better on F1 because it is actually trying to predict all three outcomes, including the one that happens 25% of the time.
Draw recall is where the structural gap is clearest. Bookmakers predict draws as the top outcome roughly 1% of the time. Our model recalled 1.97% of actual draws on average across eight seasons. That may sound small, but it is real signal where the market’s signal is near-zero by design.
Why This Matters for Fans
You don’t need to understand the math to benefit from what it produces.
Every soccer matchup card on PredictApp shows the RWI: the expected goal difference based on each team’s attacking and defensive strength, their lineup today, home advantage, and how this league scores on average. Behind that number is a model calibrated against eight seasons of real match outcomes.
When RWI is close to zero, the model sees a competitive game. Those are exactly the matches where draws are most likely, and where our probability estimates diverge from bookmaker pricing. Not because we’re guessing, but because we’re measuring something the market systematically ignores.
That’s not a guaranteed edge. No model produces those. But it’s a more complete picture of the match than you get from odds built to manage margin, not to predict every outcome.
The Methodology
The full RWI methodology is described in the working paper (currently awaiting publication):
> *Roiz Walss Index: A Dynamic Multi-League, Lineup-Aware Rating System for Soccer Match Prediction* — Jose Daniel Roiz Walss, PredictApp (2026)
It covers the complete mathematical specification, the xG* proxy model, player update mechanics, card adjustment design, and the full evaluation on 34,042 matches.
If you want a copy of the paper, please send an email to support@predictapp.io.
The free matchup stats — including RWI for upcoming fixtures — are live at predictapp.io.
—
References
– Constantinou, A. C., & Fenton, N. E. (2013). Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. *Journal of Quantitative Analysis in Sports*, 9(1), 37–50.
– Hvattum, L. M., & Arntzen, H. (2010). Using ELO ratings for match result prediction in association football. *International Journal of Forecasting*, 26(3), 460–470.
– Berrar, D., Lopes, P., & Dubitzky, W. (2018). Incorporating domain knowledge in machine learning for soccer outcome prediction. *Machine Learning*, 108(1), 97–126.
– Hubáček, O., Šourek, G., & Železný, F. (2018). Learning to predict soccer results from relational data with gradient boosted trees. *Machine Learning*, 108(1), 29–47.