Draws happen in roughly 1 in 4 professional soccer matches. If your prediction model doesn’t take them seriously, it’s missing a quarter of reality.

Most don’t take them seriously. This post explains why and introduces the rating system we built to fix it.

The Draw Problem in Soccer Prediction

Soccer is the world’s most popular sport and one of the hardest to predict. Low scores mean small margins determine outcomes. High variance means even strong teams lose unexpectedly. And unlike most North American sports, every game has three possible outcomes: home win, draw, or away win.

That third outcome, the draw, is where most prediction systems fail.

In our research dataset of 34,042 matches across multiple professional leagues, draws occurred in 25.4% of matches, nearly as often as away wins (29.4%). Yet analysis of bookmaker implied probabilities shows draws were predicted as the top outcome only 1.3% of the time.

This isn’t a measurement error. It’s a structural feature of how prediction markets work.

Why bookmakers under-predict draws

Bookmakers are not in the business of maximizing forecast accuracy. They’re in the business of managing risk margins across high volumes of bets. Draws are harder to handicap, create unbalanced action, and have higher variance in terms of bettor behavior. So bookmakers price them conservatively.

The result: if you take bookmaker-implied probabilities as your prediction baseline, you’re starting with a signal that ranks draws as the most likely outcome roughly 20 times less often than they actually occur (1.3% vs. 25.4%).
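To make that baseline concrete, here is a minimal sketch of how decimal odds are converted to implied probabilities, with the bookmaker’s margin (the overround) removed. The odds values are hypothetical, chosen only to illustrate a typical three-way line:

```python
def implied_probabilities(odds_home, odds_draw, odds_away):
    """Convert decimal odds to probabilities, removing the overround."""
    raw = [1 / odds_home, 1 / odds_draw, 1 / odds_away]
    overround = sum(raw)  # > 1.0: this excess is the bookmaker's margin
    return [p / overround for p in raw]

# A hypothetical line for a moderately even match:
probs = implied_probabilities(2.10, 3.40, 3.60)
# → roughly [0.454, 0.281, 0.265] for home / draw / away
```

Even in a fairly balanced matchup like this one, the draw sits in second place; the market almost never promotes it to the single most likely outcome.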

For casual bettors, this gets absorbed into overall variance. For anyone building a serious prediction model, it’s a design problem.

What Is the Roiz Walss Index (RWI)?

The Roiz Walss Index (RWI) is a dynamic, multi-league soccer rating system designed to generate pre-match strength signals for use in outcome prediction.

It’s the foundation of every soccer matchup card on PredictApp. The expected goal differential you see on a card, that “+1.8” or “-0.6”, is what RWI outputs before kickoff.

Here’s how it works.

Four team states, not one

Most soccer rating systems assign a single number to each team. RWI tracks four separate states per team:

– attacking strength at home (AH)
– defensive strength at home (DH)
– attacking strength away (AA)
– defensive strength away (DA)

This matters because teams perform differently at home and away, not just in magnitude but in quality. A team that dominates possession at home might struggle away. A team that defends well at home might concede more on the road. Collapsing these into a single rating loses that signal.

Expected goals in log space

Before each match, RWI generates expected goal totals using a log-space equation:


log(μ_home) = league_baseline + home_advantage + AH_home − DA_away + ρ(lineup_home − lineup_away)

log(μ_away) = league_baseline + AA_away − DH_home + ρ(lineup_away − lineup_home)

Modeling in log space ensures predicted goals are always positive and lets each component (league baseline, home advantage, team strength, and lineup quality) interact multiplicatively rather than additively after exponentiation.

RWI is then defined as the expected goal difference:

RWI = μ_home − μ_away

A positive RWI means the model favors the home team by that expected margin. A negative RWI favors the away team.
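As an illustration, the two log-space equations and the RWI definition can be sketched in Python. All numeric values below are made-up placeholders, not fitted ratings, and the function signature is an assumption about how the components fit together:

```python
import math

def expected_goals(league_baseline, home_advantage,
                   ah_home, dh_home, aa_away, da_away,
                   rho, lineup_home, lineup_away):
    """Return (mu_home, mu_away) from the log-space components."""
    log_mu_home = (league_baseline + home_advantage
                   + ah_home - da_away
                   + rho * (lineup_home - lineup_away))
    log_mu_away = (league_baseline
                   + aa_away - dh_home
                   + rho * (lineup_away - lineup_home))
    # Exponentiation guarantees positive goal expectations and turns the
    # additive log-space terms into multiplicative factors.
    return math.exp(log_mu_home), math.exp(log_mu_away)

mu_home, mu_away = expected_goals(
    league_baseline=0.30, home_advantage=0.20,
    ah_home=0.15, dh_home=0.10, aa_away=0.05, da_away=0.08,
    rho=0.10, lineup_home=0.5, lineup_away=-0.2,
)
rwi = mu_home - mu_away  # positive → model favors the home side
```

With these placeholder inputs the home side comes out ahead, so RWI is positive; flipping the strength terms would flip its sign.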

League-specific priors

Different leagues score at different rates. A 2-1 result means something different in the Bundesliga than in Ligue 1. RWI computes league-specific priors (average goals per team, goal-difference standard deviation, log-goal baseline, and home-advantage coefficient) from cumulative historical data, shifted to prevent data leakage.

This makes RWI directly comparable across leagues. The standardized version (RWI_z) divides RWI by the league’s prior standard deviation of goal difference, so the same value represents the same relative dominance in a high-scoring league as in a low-scoring one.

Lineup adjustment

Before each match, RWI adjusts expected goals based on the announced lineups.

For each player, the model computes a stabilized value:

V_player = τ × P_player + (1 − τ) × P_team_position_mean

Where τ (trust) increases with accumulated minutes played, using a shrinkage constant of 270 minutes (roughly 3 full matches). A player with limited history gets pulled toward their team’s positional average. A veteran gets closer to their personal rating.

Each team’s lineup score is then the weighted average of player values, with starters receiving full weight (minutes/90) and substitutes receiving half weight (0.5 × minutes/90).

The difference in lineup scores becomes a multiplicative correction in the goal model. Rotation squads get lower expected output. Strong lineups get a boost.
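The two steps above can be sketched together. The trust form τ = minutes / (minutes + 270) is an assumed implementation of “trust increases with accumulated minutes, with a shrinkage constant of 270”; the player numbers are likewise hypothetical:

```python
SHRINKAGE_MINUTES = 270  # roughly 3 full matches

def player_value(rating, minutes_played, position_mean):
    """Shrink a player's rating toward the team's positional mean."""
    # Assumed trust form: tau -> 0 for rookies, tau -> 1 for veterans.
    tau = minutes_played / (minutes_played + SHRINKAGE_MINUTES)
    return tau * rating + (1 - tau) * position_mean

def lineup_score(players):
    """Weighted average of player values.

    players: list of (rating, career_minutes, position_mean,
                      expected_minutes, is_starter) tuples.
    """
    total, weight_sum = 0.0, 0.0
    for rating, career_min, pos_mean, exp_min, is_starter in players:
        w = exp_min / 90.0
        if not is_starter:
            w *= 0.5  # substitutes receive half weight
        total += w * player_value(rating, career_min, pos_mean)
        weight_sum += w
    return total / weight_sum

# A rookie with 90 career minutes gets tau = 90 / 360 = 0.25,
# so a personal rating of 1.0 is pulled hard toward a 0.0 positional mean:
v = player_value(rating=1.0, minutes_played=90, position_mean=0.0)
# → 0.25
```

The shrinkage keeps one strong debut from swinging a lineup score; only sustained minutes let a player’s personal rating dominate.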

Validation Results: 34,042 Matches

We evaluated RWI on a dataset covering 34,042 matches across multiple leagues, tracking both continuous signal quality and probabilistic outcome accuracy.

Continuous signal performance

| Metric | Value |
| --- | --- |
| Goal-difference MAE | 1.299 |
| Goal-difference RMSE | 1.679 |
| Pearson correlation (vs. actual GD) | 0.330 |
| Spearman correlation | 0.323 |
| Bias | 0.102 |

After the model’s burn-in period (allowing team ratings to stabilize from initial states), correlations improved to 0.350 Pearson and 0.339 Spearman.

RWI as a feature, not the final probability

It is important to be precise about what RWI is and isn’t.

RWI is a strength signal: the expected goal differential based on team ratings, lineup, home advantage, and league baseline. It is designed to be a feature fed into downstream probability models, not to produce win/draw/loss probabilities on its own. The distinction matters for understanding the full system.

How RWI Updates After Each Match

Ratings update sequentially after every match using a bounded residual transform:

ψ(e) = sign(e) × log(1 + |e|)

This means large surprises (a 5-0 result the model didn’t anticipate) cause a meaningful but bounded rating update. It doesn’t create runaway swings. It mirrors the diminishing-returns philosophy of Pi-ratings (Constantinou & Fenton, 2013), extended with a cross-venue catch-up mechanism and mild shrinkage toward zero.

The update target itself is a blend of observed goals and a walk-forward xG* proxy built from gradient-boosted Poisson regression on match statistics. At a 0.9 blend weight toward actual goals, the model stays grounded in what happened while smoothing extreme flukes.
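Putting the bounded transform and the blended target together, a single post-match update might look like the sketch below. The learning rate `k` and the match numbers are assumptions for illustration; the ψ transform and the 0.9 blend weight follow the text:

```python
import math

def psi(e):
    """Bounded residual transform: sign(e) * log(1 + |e|)."""
    return math.copysign(math.log1p(abs(e)), e)

def update_step(predicted_gd, actual_gd, xg_gd, k=0.1, blend=0.9):
    """Rating delta from one match (k is an assumed learning rate)."""
    # Update target: mostly actual goals, lightly smoothed by the xG* proxy.
    target_gd = blend * actual_gd + (1 - blend) * xg_gd
    residual = target_gd - predicted_gd
    return k * psi(residual)

# A 5-0 blowout the model saw as roughly even, with xG* suggesting
# the performance gap was real but smaller:
delta = update_step(predicted_gd=0.2, actual_gd=5.0, xg_gd=2.5)
```

Because ψ grows logarithmically, the 5-0 surprise moves the rating meaningfully but nowhere near five goals’ worth, which is exactly the anti-runaway behavior described above.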

Player ratings update from the residual in goal difference, not from individual statistics. If the home side outperforms the model’s expectations, home players get a positive update; away players get the symmetric negative. Trust grows with minutes played.

From Strength Signal to Match Probabilities

RWI generates the expected goal differential. What turns that into win/draw/loss probabilities is a separate layer: two machine learning models trained on RWI outputs and match context.

The first is an Ordered Logit model, which treats match outcomes as an ordered classification (away win → draw → home win) and estimates probabilities across all three classes. The second is a CatBoost gradient-boosted tree, which captures non-linear interactions between RWI signals, league context, and historical patterns.
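The blending step can be sketched as a weighted average of the two models’ probability vectors. The 50/50 weight and the probability values are illustrative assumptions; the post does not state the actual blend weights:

```python
def blend_probs(p_logit, p_catboost, w=0.5):
    """Weighted average of two (away, draw, home) probability vectors."""
    blended = [w * a + (1 - w) * b for a, b in zip(p_logit, p_catboost)]
    total = sum(blended)  # renormalize to guard against rounding drift
    return [p / total for p in blended]

# Hypothetical outputs for a tight match (away, draw, home):
p = blend_probs([0.30, 0.28, 0.42], [0.26, 0.32, 0.42])
# → [0.28, 0.30, 0.42]
```

Averaging a calibrated linear model with a non-linear tree model is a standard way to hedge their individual weaknesses; here it also lets the draw probability stay competitive in close matchups.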

We blend their outputs and evaluate the result against the bookmaker baseline across eight full seasons.

| Metric | PredictApp | Bookmaker baseline |
| --- | --- | --- |
| Log loss | 0.9797 | 0.9779 |
| Accuracy | 52.45% | 52.53% |
| Macro F1 | 0.3959 | 0.3900 |
| Ranked Probability Score | 0.1975 | 0.1974 |
| Draw recall | 1.97% | ~0% (structural) |

On log loss, accuracy, and RPS, our model is within statistical noise of bookmaker-implied odds across eight seasons. That’s the baseline. Matching it means the model is calibrated against the sharpest signal in the market.

The difference is in macro F1 and draw recall.

Macro F1 weights all three outcome classes equally. Bookmakers have no incentive to be right on draws specifically; they have every incentive to be right on aggregate handle. Our model does slightly better on F1 because it is actually trying to predict all three outcomes, including the one that happens 25% of the time.

Draw recall is where the structural gap is clearest. Bookmakers predict draws as the top outcome less than 1% of the time. Our model correctly identified 1.97% of actual draws on average across eight seasons. That may sound small, but it is real signal where the market’s signal is near-zero by design.

Why This Matters for Fans

You don’t need to understand the math to benefit from what it produces.

Every soccer matchup card on PredictApp shows the RWI: the expected goal difference based on each team’s attacking and defensive strength, their lineup today, home advantage, and how this league scores on average. Behind that number is a model calibrated against eight seasons of real match outcomes.

When RWI is close to zero, the model sees a competitive game. Those are exactly the matches where draws are most likely, and where our probability estimates diverge from bookmaker pricing. Not because we’re guessing, but because we’re measuring something the market systematically ignores.

That’s not a guaranteed edge. No model produces those. But it’s a more complete picture of the match than you get from odds built to manage margin, not to predict every outcome.

The Methodology

The full RWI methodology is described in a working paper (pending publication):

> *Roiz Walss Index: A Dynamic Multi-League, Lineup-Aware Rating System for Soccer Match Prediction* — Jose Daniel Roiz Walss, PredictApp (2026)

It covers the complete mathematical specification, the xG* proxy model, player update mechanics, card adjustment design, and the full evaluation on 34,042 matches.

If you want a copy of the paper, please send an email to support@predictapp.io.

The free matchup stats — including RWI for upcoming fixtures — are live at predictapp.io.

References

– Constantinou, A. C., & Fenton, N. E. (2013). Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. *Journal of Quantitative Analysis in Sports*, 9(1), 37–50.

– Hvattum, L. M., & Arntzen, H. (2010). Using ELO ratings for match result prediction in association football. *International Journal of Forecasting*, 26(3), 460–470.

– Berrar, D., Lopes, P., & Dubitzky, W. (2018). Incorporating domain knowledge in machine learning for soccer outcome prediction. *Machine Learning*, 108(1), 97–126.

– Hubáček, O., Šourek, G., & Železný, F. (2018). Learning to predict soccer results from relational data with gradient boosted trees. *Machine Learning*, 108(1), 29–47.