xG AND THE DATA REVOLUTION

06/10/17

International breaks are fun, aren't they? With no domestic football to talk about, Andrew Lawn took a look at the emergence of "xG" and football's data revolution.

"Knowledge itself is power." The well-known phrase, widely attributed to Sir Francis Bacon, came to mind this week when, amid the delight at another hard-fought and thoroughly deserved away win, there were scattered references bemoaning the slow but steady creep of stats into the analysis of the beautiful game.

The main victim (or culprit, depending on your point of view) of this disdain was xG, or 'expected goals', although on a similar theme 'expected points' is also derided.

I can understand where this derision comes from. First, there is the basic rule that anything new is viewed with suspicion, particularly in English football. Second, we have recently witnessed the rise and fall of possession being king.

The domestic and international success of professional ball hoarders Barcelona and Spain from 2008 to 2014 led to possession percentages being given much more publicity and credence. The more Spain and Barca dominated, the more desirable winning the possession battle became.

This began to change when Jose Mourinho started to achieve success, first with Inter and then with Chelsea, by allowing teams to have the ball, sitting deep and then springing forward in rare counter-attacks. Suddenly winning games with as little as 18% possession became possible and the desirability of bossing the ball came to be seen as a myth. That, along with the rise of the infuriating saying "the only stat that matters is goals", exposed the flaw of relying (solely) on stats.

The issue, however, was not that stats are flawed, but that analysis of them can be, and it is here that xG comes in.

For those unfamiliar with the term, xG essentially translates as "the number of goals a team would be expected to score, based on the quality of the chances they create". Quality is the key word there, and it is why xG is more revealing than the traditional "shots" and "shots on target" which have graced match stats since time immemorial.

xG therefore effectively evaluates "chances", whereas "shots on goal" does not discriminate between a speculative Tettey 30-yarder (likelihood of scoring: 1 in 757) and Marco Stiepermann prodding wide from deep inside the Bristol City 6-yard box (likelihood of scoring: 95 in 100).
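To make that concrete, here is a rough sketch of the arithmetic, assuming (as most public xG models do) that a team's xG is simply the sum of the scoring likelihoods of its shots. The two probabilities are the ones quoted above; the two "teams" and their shot counts are invented purely for illustration.

```python
# Scoring likelihoods quoted in the article.
TETTEY_LONG_RANGE = 1 / 757
STIEPERMANN_CLOSE_RANGE = 95 / 100


def expected_goals(shot_probabilities):
    """Return xG as the sum of each shot's chance of being scored."""
    return sum(shot_probabilities)


# Hypothetical match: one side shoots often from distance,
# the other creates only two chances, both from close range.
team_a = [TETTEY_LONG_RANGE] * 10
team_b = [STIEPERMANN_CLOSE_RANGE] * 2

for name, shots in (("Team A", team_a), ("Team B", team_b)):
    print(f"{name}: {len(shots)} shots, xG {expected_goals(shots):.3f}")
```

Team A wins the shot count comfortably, yet Team B finishes with by far the higher xG, which is exactly the distinction a plain "shots" column cannot show.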

As such, xG gives a far more reliable picture of how games panned out, based on which team actually played better. For example, take Germany's 7-1 win against Brazil in the 2014 World Cup. Brazil actually had more shots and possession in that game, but were way down on xG compared to the triumphant Germans.

As this data builds up, it can be extrapolated to make predictions, in the same way that our own experiences as fans can.

Take Tettey again as the example and imagine him marauding forward into empty space midway through the opponent's half. Now we have all witnessed Tettey in this position spanking one into the top corner, but experience has taught us that this is less likely than the ball whistling its way toward the Upper Barclay. Compare that to, say, Bradley Johnson, also a central midfielder but one who is far more likely to hit the back of the net from the same position than take out a fan.

These scenarios and experiences feed into xG in two ways. First, assuming both happened in the same game, the Bradley Johnson effort would be given a higher xG post-game, regardless of whether he scored or not, because experience shows he was more likely to score.

Second, it would also work towards predicting what might happen in the future, i.e. having Bradley Johnson in the team would give you a higher xG prediction pre-game, for the same reason: if both he and Tettey were put in that position, Johnson would be more likely to score.
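The two uses described above can be sketched in a few lines, assuming the simplest possible approach: a player's likelihood of scoring from a given position is estimated as his past goals divided by his past attempts from there. The players are deliberately anonymous and every number is made up; nothing here is a real record for Johnson, Tettey or anyone else, and real xG models are considerably more involved.

```python
# Invented (goals, attempts) records for long-range efforts.
# These are NOT real figures for any player.
history = {
    "midfielder_a": (6, 60),  # the more reliable striker of a ball
    "midfielder_b": (1, 50),  # the one more likely to find the stand
}


def scoring_rate(goals, attempts):
    """Estimate the chance of scoring as past goals over past attempts."""
    if attempts == 0:
        raise ValueError("no attempts recorded")
    return goals / attempts


def shot_xg(player):
    """Post-game use: value a single shot by the shooter's past rate."""
    goals, attempts = history[player]
    return scoring_rate(goals, attempts)


def predicted_xg(expected_shots):
    """Pre-game use: sum each player's rate times the shots expected of him."""
    return sum(shot_xg(player) * shots for player, shots in expected_shots.items())


print(shot_xg("midfielder_a"), shot_xg("midfielder_b"))
print(predicted_xg({"midfielder_a": 3}))
print(predicted_xg({"midfielder_b": 3}))
```

With the same three shots expected from that position, the side fielding the more reliable midfielder gets the higher pre-game prediction, and his individual effort gets the higher post-game value whether or not it goes in.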

Is the system perfect? No, of course not. First, it doesn't account for individual moments of unexpected brilliance (Tettey - Sunderland), nor does it account for how having a Tettey anchor the midfield makes it more likely that Bradley Johnson will have the freedom to wander forward.

No stat is perfect, especially in isolation, but not being perfect doesn't make it useless. Combined with the other stats, xG adds up to a much more complete picture of a game, in turn allowing us as fans and Daniel Farke as manager to make more informed (if not better) decisions.

As Sir Francis said: "knowledge itself is power".


While you're here.

Two things; first, did you know we have a new podcast? It's available via iTunes and Soundcloud. Alternatively you can just listen to it in your browser via our podcast page.

Secondly, to keep AlongComeNorwich advert free and to help us fund additional initiatives aimed at improving the Carrow Road atmosphere, we occasionally produce exclusive t-shirts.

To celebrate the latest Barclay classic chant, Farkelife, we've done another one. As always, we take no profit from these and put all the revenue back into the site and things we can all enjoy.

