Friday 8 February 2013

A Quick Check on Cheik

Cheik Tiote was widely regarded as one of the Premier League's success stories last season. However, the general perception amongst many Newcastle fans and the media is that Tiote's performance levels have dropped off this season, with even Alan Pardew recently commenting that "this year he's struggled a bit". Given Newcastle's recent upturn in fortunes, brought about by the arrival of several January transfers, Tiote might even find it difficult to regain his place in the team's starting eleven upon his return from the Africa Cup of Nations. Do Tiote's performance statistics agree with popular opinion, or can he regard himself as being treated a bit unfairly?

In this post I will therefore attempt to answer two questions:

  1. is there evidence of a worsening in Cheik Tiote's performance levels, relative to last season?
  2. are changes in Cheik Tiote's performance levels simply a reflection of a wider trend in performance levels for the entire team? (In other words, how have Tiote's performance levels changed since last season, relative to those of his team mates?)
In order to answer these questions, I'll use several measures that most football fans would probably perceive as being important indicators of performance for a defensive central midfield player. I'll show how these measures have changed between the 2011/12 and 2012/13 Premier League seasons for both Cheik Tiote and a number of his team mates (the "comparison group"). Rather than comparing raw numbers of activities (e.g. total number of tackles made) or average numbers of activities per match (e.g. average number of tackles made per match), I'll divide each total by the player's total game time over a season to derive a "per minute" measure, so that changes in game time between seasons do not influence the results. The average measures I'll present are also game-time weighted. I'll restrict the comparison group to outfield players who have played at least 500 minutes of Premier League football for Newcastle in each of the two seasons, so that the results are not skewed by extreme values. The results of the analysis are summarised in tables 1-6 below.
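The per-minute and weighted-average calculations described above can be sketched as follows. The player names and totals below are made up purely for illustration; they are not the real figures behind tables 1-6.

```python
# Minimal sketch of the per-minute measures: totals divided by game time,
# a 500-minute eligibility filter, and a game-time-weighted group average.
# All player names and numbers here are hypothetical.

MIN_MINUTES = 500  # comparison-group threshold: 500+ minutes in each season

players = [
    # (name, minutes played, tackles made) for one season -- illustrative only
    ("Player A", 1800, 60),
    ("Player B", 2700, 54),
    ("Player C", 300, 20),   # excluded: under 500 minutes
]

# Keep only players with enough game time, so extreme values don't skew results
group = [(name, mins, tackles) for name, mins, tackles in players
         if mins >= MIN_MINUTES]

# Per-minute rate for each player: total activities divided by total minutes
rates = {name: tackles / mins for name, mins, tackles in group}

# Game-time-weighted group average: total activities over total minutes,
# which weights each player's rate by the minutes they played
weighted_avg = sum(t for _, _, t in group) / sum(m for _, m, _ in group)
```

Dividing by minutes rather than matches means a player who mostly appears as a substitute is measured on the same footing as one who plays full games.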

Table 1: Tackles per minute

Table 1 shows that Tiote has put in more tackles per minute this season than he did last season; his increase is the largest of the midfield players in the comparison group, and is bettered only by Danny Simpson and Papiss Cisse. However, table 2 shows that Tiote's increase in tackles per minute is accompanied by an increase in fouls per minute. His increase in foul rate is the fourth largest of the 13 players in the group, and is above the group average, but it is not as large as that of fellow central midfielder Yohan Cabaye.

Table 2: Fouls per minute

Table 3 illustrates that Tiote has increased the number of interceptions he's made per minute when compared to last season, and is one of only two players in the comparison group to do this. Meanwhile, table 4 shows that Tiote has been dispossessed less frequently than he was last season, and his decrease is the greatest of all 13 players in the group.

Table 3: Interceptions per minute

Table 4: Dispossessed per minute

According to table 5, Tiote has attempted more passes per minute than he did last season, and his increase is the greatest of all 13 players in the comparison group. On average, passes attempted per minute for the entire group has fallen since last season. However, table 6 shows that Tiote's pass success rate (the proportion of attempted passes that found a team mate) is down on last season, by 0.4 percentage points, whilst the average for the group is up, by 2.5 percentage points. Two other players in the group - central midfielder Yohan Cabaye and play-maker Hatem Ben Arfa - have also experienced falls in pass success rate, and in both cases the deterioration in performance is greater than that of Tiote.

Table 5: Passes attempted per minute

Table 6: Pass success rate

In summary, Cheik Tiote has made more tackles and interceptions per minute, and has been dispossessed less frequently, when compared to last season. However, he's made more fouls per minute than he did last season and, although he's attempted more passes per minute, his pass success rate has fallen. In areas where Tiote has improved, his improvement is typically greater than that of the majority of his team mates (he is in fact the biggest improver against some of the measures I've presented). Conversely, in areas where Tiote has deteriorated, his deterioration is also typically greater than that of the majority of his team mates.

Whilst an apparent decline in Tiote's performance - relative to that of his team mates - according to certain measures will probably align with the perceptions of many Newcastle fans, perhaps evidence that he has made relative improvements on last season according to other measures will be more of a surprise. It might also be worth noting that, for those measures against which Tiote has deteriorated, some of the players who have deteriorated more have largely been praised for their performances so far this season. This apparent differential in perceptions might be partly explained by measures of performance not considered in this post, such as chance creation and shooting accuracy. Furthermore, it's likely that Tiote is a victim of the defensive role he plays in the centre of Newcastle's midfield, where every misplaced pass and every mistimed tackle is likely to be more costly to the team, and therefore be subject to a greater level of scrutiny, than those committed by his more attack-minded peers.

Friday 25 January 2013

Estimating Overachievement in the Premier League

In this post, I'll show how we can use a statistical model to make predictions of Premier League finishing position from measures of performance. We might then interpret deviations from these predictions as an indication of a team's overachievement or underachievement. I'll present the results for all teams participating in the 2012/13 Premier League season, and then focus a little bit on Newcastle United. Newcastle's unlikely 5th place finish in the 2011/12 season has already been analysed by various authors, for example in this examination of "luck" by Mark Taylor, but it is hoped that this post will add further insight.

Defining "performance"

Our first task is to define performance in a way that is measurable. A single measure of performance of course does not exist, but it might be possible to approximate it using a variety of measures that are available. To start with, perhaps we can think of possession as being a good indicator of performance; in principle, if a team is able to hold on to the ball for 100% of a match then it will maximise its own chances of scoring and minimise those of its opponent. Chart 1 illustrates that possession is indeed correlated with Premier League finishing position. However, possession doesn't necessarily mean goals; keeping the ball for long periods of a game will be futile if the team isn't able to turn possession into goal scoring chances. Equally, if the team rarely relinquishes possession, but defends poorly when it does, then it is likely that the team will concede a relatively high number of goal scoring chances. Therefore, we might also want to include 'average number of shots taken per game' and 'average number of shots conceded per game' in our definition of performance. However, not all teams typically create - and concede - the same quality of shots on goal. We can define shot quality in terms of both distance from goal and the position from where the shot was taken. Teams who are typically able to create chances in the penalty area, and in the centre of the pitch, can expect to score a relatively high number of goals, whilst those who typically concede such chances can expect to concede a relatively high number of goals.

Chart 1: Possession vs league position (last four seasons) 

So, now we've got our definition of "performance":
  1. average percentage possession
  2. average number of shots on goal (both taken and conceded)
  3. proportion of shots in the penalty area (both taken and conceded)
  4. proportion of shots in the centre of the pitch (both taken and conceded)
This is obviously a simplification of a complex reality, but it is hoped that these factors capture some of the most important elements of a team's performance (at least to a typical football fan - like me - who would generally prefer their team to play "attractive football", rather than try to "win ugly").

A model to predict league position from performance level

The next step is to incorporate these factors into a statistical model in order to make predictions of a team's league position, given their typical level of performance over the course of a season. The cumulative logit model (otherwise known as the proportional odds or ordered logistic regression model) is appropriate for this job; it is designed for situations where the variable of interest (league position in this case) has a natural order. Such a model has been used previously to predict outcomes in football, for example see this post by Zach Slaton on predicting results for individual matches. Unfortunately, we are not able to make predictions of individual league positions; we only have four seasons' worth of data (2009/10 to 2012/13), so we only observe performance levels - and corresponding league positions - for four first placed teams, four second placed teams, four third placed teams, and so on, which is insufficient for producing reliable estimates. To get around this problem, we can group the positions into 1-4, 5-8, 9-12, 13-16 and 17-20.
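To give a feel for how the cumulative logit model turns a performance score into probabilities for the five position groups, here is a minimal sketch in pure Python. The cut-points and coefficient below are invented for illustration (a fitted model would estimate them from the four seasons of data), and I've collapsed the performance factors into a single score for simplicity.

```python
import math

def logistic(z):
    """The standard logistic function, mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical cut-points separating the five position groups
# (1-4, 5-8, 9-12, 13-16, 17-20) and a hypothetical coefficient on a
# single combined performance score -- illustrative values, not estimates.
CUTPOINTS = [-2.0, -0.7, 0.7, 2.0]
BETA = 1.5

def group_probabilities(performance_score):
    """Predicted probability of finishing in each position group.

    The cumulative logit model specifies P(group <= j) as a logistic
    function of the cut-point and the performance score; here a higher
    score shifts probability toward the top groups. Individual group
    probabilities are differences of adjacent cumulative probabilities.
    """
    eta = BETA * performance_score
    cumulative = [logistic(c + eta) for c in CUTPOINTS] + [1.0]
    return [cumulative[0]] + [cumulative[j] - cumulative[j - 1]
                              for j in range(1, 5)]
    # returns [P(1-4), P(5-8), P(9-12), P(13-16), P(17-20)]
```

In practice the model would be fitted with several performance variables at once (possession, shot numbers, shot quality), but the mechanics are the same: one set of cut-points shared across groups, and coefficients that slide the whole probability profile up or down the table.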

For any team in any Premier League season between 2009/10 and 2012/13, the cumulative logit model gives us the predicted probability of the team finishing in the 1-4 group, the 5-8 group, and so on. The predictions are based on typical performance levels - and corresponding league positions - of all teams over the four-year period. We can take the group with the highest probability as being the team's most likely finishing group, given their typical performance level for the season. If the team actually finishes in the group with the highest predicted probability, then they have finished "where they deserve" according to their typical level of performance, as we have defined it. However, if they actually finish above where the model predicts, they have overachieved. Conversely, if they actually finish below where the model predicts, they have underachieved. So, for example, a team who finished 10th will have overachieved by between 3 and 6 places if they were predicted to have finished in the 13-16 group. On the other hand, the same team will have underachieved by between 6 and 9 places if they were predicted to have finished in the 1-4 group. Any estimated overachievement or underachievement by a team will be due to a combination of two factors:
  1. elements of the team's performance that are not well captured by our model
  2. factors other than performance, including random events that tend to cancel out over the course of a season - this is typically described by the footballing community as "luck" (for example, benefiting from favourable refereeing decisions or staying injury-free for most of the season) and is largely not under the team's control
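The over/underachievement arithmetic described above is simple enough to sketch directly; the function below takes a team's actual position and the five predicted group probabilities, and returns the range of places by which the team has over- or underachieved (negative values indicating underachievement).

```python
# Sketch of the overachievement calculation: compare a team's actual finishing
# position with the position group the model assigns the highest probability.

GROUPS = [(1, 4), (5, 8), (9, 12), (13, 16), (17, 20)]

def overachievement_range(actual_position, predicted_probs):
    """Return (min, max) places of overachievement.

    predicted_probs is a list of five probabilities, one per group in
    GROUPS order. Positive values mean overachievement, negative values
    underachievement, and (0, 0) means the team finished in its
    most likely group -- "where they deserve".
    """
    best = max(range(len(GROUPS)), key=lambda j: predicted_probs[j])
    lo, hi = GROUPS[best]
    if lo <= actual_position <= hi:
        return (0, 0)
    # e.g. 10th place with a predicted group of 13-16 gives (3, 6):
    # overachievement by between 3 and 6 places
    return (lo - actual_position, hi - actual_position)
```

This reproduces the worked example in the text: a 10th-placed team predicted to finish 13-16 has overachieved by between 3 and 6 places, while the same team predicted to finish 1-4 has underachieved by between 6 and 9.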
Some results from the model

Table 1 shows overachievement levels for each of the 20 teams participating in the 2012/13 Premier League season (after 23 games). Note that negative overachievement indicates underachievement. Stoke City are overachieving the most, by between 7 and 10 places, as they are currently in 10th place but their predicted probability is maximised in the 17-20 group. Wigan Athletic are underachieving the most, by between 7 and 10 places, as they are currently in 19th place but their predicted probability is maximised in the 9-12 group. The shading in table 1 is determined by quintile, so that the smallest 20 probabilities are shaded lightest and the largest 20 probabilities are shaded darkest.

Table 1: Overachievement by team (2012/13)
 

Let's take a more detailed look at the results for Newcastle United, which are shown for the last three Premier League seasons (the team played in the Championship in the 2009/10 season) in chart 2 and table 2 - where the shading again reflects quintiles, but this time over the whole four-season dataset. Newcastle underachieved in 2010/11, by between 4 and 7 places; the team finished 12th but their predicted probability was maximised in the 5-8 group, closely followed by the 9-12 group. The profile of predicted probabilities in 2011/12 was very similar to that in 2012/13, with the predicted probability maximised in the 13-16 group for both seasons. Newcastle are therefore neither overachieving nor underachieving so far this season, as they currently lie in 16th place. However, the team overachieved by between 8 and 11 places last season, eventually finishing in 5th place. This is the joint largest overachievement by any Premier League team over the past four seasons, matched only by Birmingham City's 9th place finish in 2009/10, as illustrated by table 3. These results suggest that Newcastle's typical performance levels (defined by possession, and numbers and quality of shots taken/conceded) didn't change dramatically between the 2011/12 and 2012/13 seasons, but other factors contributed to the team's 5th place finish in 2011/12. These "other factors" include elements of performance that are not well captured by our model, and factors other than performance that are not under the team's control.

Table 2: NUFC's overachievement by season

Chart 2: NUFC's predicted probabilities by position

Table 3: Top three overachievers (last four seasons)

Friday 18 January 2013

A Results Driven Business


Newcastle's 2-0 defeat away to Brighton in the FA Cup two weeks ago was their eleventh loss in 14 games. The poor run of form started with the 1-0 home defeat to West Ham back in November of last year and, since then, Newcastle have won just twice, at home to both QPR and Wigan in December.

This made me wonder: how poor does a team's form have to be before the manager is fired?


Below I've shown the form for all Premier League managers fired in the last three seasons over their final 14 games prior to being fired (sorted by games lost), plus that of Alan Pardew. No one has been allowed to lose 11 out of their final 14 matches. Whilst expectations amongst Newcastle's fans are unlikely to be up there with those of Chelsea or Tottenham, fans of so-called "lesser clubs" like QPR and Wolves have had to endure shorter runs of poor form before they lost their managers than that served up by Alan Pardew's team.


Of course, this isn't to say that there aren't Premier League managers still in employment who have overseen even worse runs of form than 11 losses in 14 games - this would take quite a bit more work to investigate - but it does provide an indication of quite how serious Newcastle's poor run of results has been.


Form (last 14 games, all competitions) of all Premier League
managers fired in the last three seasons, plus Alan Pardew