Predicting the course of T20 matches – 1st innings

Unlike ODI’s, the format and basic rules of Twenty20 matches have stayed the same since the very first official match back in 2003.  The first 6 overs in each innings are Powerplay overs and the bowlers are limited to 4 overs each.  This, coupled with the condensed nature of the format, means that certain situations are repeated many times.  Naturally, this allows us to compute probabilities of events by looking back at how many times this situation arose and the resulting outcome.  For example, how often do teams defend 10 runs off the last over?  Or what do teams need to score in the Powerplay to stand a chance of chasing 200?
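To make the counting idea concrete, here is a minimal Python sketch.  The records and their field layout are invented for illustration and are not taken from the real dataset; the function simply divides the number of times an outcome happened by the number of times the situation arose.

```python
# Hypothetical records: (runs_to_defend, balls_remaining, bowling_side_won).
# These rows are made up purely for illustration.
situations = [
    (10, 6, True),
    (10, 6, False),
    (10, 6, True),
    (10, 6, True),
    (14, 6, True),
    (7, 6, False),
]


def empirical_probability(records, matches, outcome):
    """Share of records satisfying `matches` for which `outcome` is also true.

    Returns None when the situation never occurred in the records.
    """
    relevant = [r for r in records if matches(r)]
    if not relevant:
        return None
    return sum(1 for r in relevant if outcome(r)) / len(relevant)


# How often were 10 runs defended off the last over (6 balls remaining)?
p_defend = empirical_probability(
    situations,
    matches=lambda r: r[0] == 10 and r[1] == 6,
    outcome=lambda r: r[2],
)
print(p_defend)  # 3 of the 4 matching toy rows were defended
```

With real data, the number of matching rows matters as much as the ratio itself: a probability based on a handful of matches is far less trustworthy than one based on hundreds.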

In this article, I develop some simple prediction models using a few machine learning algorithms.  I use only a few factors to begin with – specifically the current score, number of runs required and wickets – to predict the final score in the 1st innings and the winner in the 2nd innings.  The dataset consists of 877,319 balls from 3,700 T20 matches where there was an outright winner and no reduction in overs in either innings.

Predicting 1st innings scores

Predicting the eventual score in the 1st innings is a regression problem, as the outcome is a continuous variable.  I consider the 455,301 1st innings balls in the dataset to be independent of one another.


The table above shows a random sample of the data.  The first three columns are the predictor variables and the last column is the target variable i.e. what we’re trying to predict.
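As a rough sketch of how rows like these could be built from ball-by-ball data, the snippet below turns one innings into predictor and target tuples.  The function name, the input layout and the choice to record the state after each ball are my assumptions for illustration; the column order follows the variables used in the regression equation later on.

```python
def innings_to_rows(ball_outcomes, total_balls=120):
    """Turn one innings into (current_score, balls_remaining, wickets, final_score) rows.

    `ball_outcomes` is a list of (runs, is_wicket) tuples, one per legal ball.
    Extras such as wides and no-balls are ignored to keep the sketch short.
    """
    final_score = sum(runs for runs, _ in ball_outcomes)
    rows = []
    score = 0
    wickets = 0
    for ball_number, (runs, is_wicket) in enumerate(ball_outcomes, start=1):
        score += runs
        wickets += int(is_wicket)
        # State after this ball, paired with the eventual innings total.
        rows.append((score, total_balls - ball_number, wickets, final_score))
    return rows


# A made-up four-ball "innings", purely for illustration.
toy_innings = [(1, False), (4, False), (0, True), (6, False)]
rows = innings_to_rows(toy_innings)
for row in rows:
    print(row)
```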

The first model we can look at is Linear Regression.  Here we are trying to draw a line through our multi-dimensional data such that the total squared distance between the points and the line is as small as possible.  In two dimensions this is simply the line of best fit that we are all familiar with.

Using scikit-learn’s machine learning package in Python, we can build our model very easily as below.


Once fitted, the module calculates the coefficients of the linear regression equation as follows:

final_score = 1.080*current_score + 1.16*balls_remaining – 4.04*wickets + 17.1

The R² value is 0.547, which means about 55% of the variability in the data is explained by this model.  The model tells us that the average score from the very beginning of the innings is about 156 (1.16*120 + 17.1).  As the innings progresses, we can feed in more information regarding the current score and wickets to get a more refined prediction.  The model also tells us that a wicket will save about 4 runs for the bowling team.  Being a linear model, it doesn’t tell us at what stage of the innings losing wickets is most costly.  It’s also not particularly self-consistent – it can give predictions that are lower than the current score in extreme cases.
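The fitted equation is easy to evaluate by hand or in code.  The short sketch below plugs in the coefficients quoted above; the function name and the two example game states are my own, chosen to reproduce the start-of-innings figure and to show one of the extreme cases where the prediction falls below the current score.

```python
def predict_final_score(current_score, balls_remaining, wickets):
    """Evaluate the fitted linear regression equation quoted in the article."""
    return 1.080 * current_score + 1.16 * balls_remaining - 4.04 * wickets + 17.1


# Before a ball is bowled: 0 runs, 120 balls remaining, 0 wickets down.
start_of_innings = predict_final_score(0, 120, 0)

# An extreme (hypothetical) state: 200 runs, innings over, 9 wickets down.
extreme_case = predict_final_score(200, 0, 9)

print(round(start_of_innings, 1))  # 156.3
print(round(extreme_case, 2))      # 196.74, lower than the 200 already scored
```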

KNeighborsRegressor is an algorithm that I used when developing the Expected Runs and Wickets models.  For any given ball, the algorithm searches for a specified number of the most similar balls and from those returns the average final score.  The Python implementation is as below.


I found that 26 neighbours was the optimal number, giving the lowest error.  Fewer than 26 and you’ll suffer from small sample sizes, whilst any more and the neighbours start to become a bit too dissimilar.  This method had an R² value of 0.580, which is slightly better than what we had for linear regression.  The disadvantage, however, is that we don’t get an interpretable equation or rule – we merely input the details of the ball and it outputs a prediction.
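To show what a nearest-neighbours prediction is doing under the hood, here is a hand-rolled sketch in plain Python.  It is not scikit-learn's implementation, the training rows are invented, and it uses unscaled Euclidean distance over the three features, which is an assumption on my part.

```python
import math


def knn_predict(train_rows, query, k):
    """Average the final score of the k training balls closest to `query`.

    Each training row is (current_score, balls_remaining, wickets, final_score);
    `query` is (current_score, balls_remaining, wickets).
    """
    if k < 1 or k > len(train_rows):
        raise ValueError("k must be between 1 and the number of training rows")
    # Sort by Euclidean distance between the row's features and the query.
    by_distance = sorted(train_rows, key=lambda row: math.dist(row[:3], query))
    nearest = by_distance[:k]
    return sum(row[3] for row in nearest) / k


# Made-up training rows, purely for illustration.
toy_rows = [
    (50, 72, 1, 165),
    (48, 73, 2, 150),
    (52, 71, 2, 158),
    (90, 40, 3, 170),
    (20, 100, 0, 140),
]
prediction = knn_predict(toy_rows, query=(49, 72, 2), k=3)
print(round(prediction, 1))  # mean of 165, 150 and 158
```

The trade-off described above shows up directly in `k`: a small value averages over very few innings, while a large one pulls in rows such as `(90, 40, 3)` that describe a quite different game state.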

A similar algorithm is RadiusNeighborsRegressor.  Instead of searching for a fixed number of nearest neighbours, this algorithm finds all neighbours which are within a certain distance.


To give some perspective, the most popular unique score after the 6th over is 49 for 2 from 7.1 overs, which has occurred 62 times in total.  If we take the average final score from those 62 innings we can be reasonably confident that this is a good prediction.  If we loosen our requirements to allow any of current score, wickets and overs to be off by a maximum value of 1 (the radius I chose), e.g. 50 for 1 after 7.2 overs, we get 402 similar instances.  In fact, of the 55,781 unique combinations of current score, wickets and balls remaining, 44,536 have at least 10 close neighbours.
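The "off by a maximum of 1" idea can be sketched in a few lines of plain Python.  Note the assumption: I treat the radius as a tolerance applied to each feature separately, which is my reading of the description above and differs from a Euclidean radius.  The rows are invented, and balls remaining stands in for overs.

```python
def radius_predict(train_rows, query, radius=1):
    """Average final score over rows whose every feature is within `radius` of `query`.

    Rows are (current_score, wickets, balls_remaining, final_score).
    Returns (prediction, neighbour_count); prediction is None if nothing is close.
    """
    close = [
        row for row in train_rows
        if all(abs(a - b) <= radius for a, b in zip(row[:3], query))
    ]
    if not close:
        return None, 0
    return sum(row[3] for row in close) / len(close), len(close)


# Made-up rows: 49 for 2 from 7.1 overs means 77 balls remaining.
toy_rows = [
    (49, 2, 77, 160),
    (50, 1, 76, 170),
    (49, 2, 77, 150),
    (60, 2, 77, 180),
    (187, 2, 22, 230),
]
prediction, count = radius_predict(toy_rows, query=(49, 2, 77))
print(prediction, count)  # 160.0 3
```

Returning the neighbour count alongside the prediction makes it easy to flag game states with too few close neighbours to trust.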

This method is ever so slightly better, with an R² value of 0.607.  We do, however, fail to predict the somewhat extreme cases with much accuracy.  A score like 187 for 2 with 22 balls to go has only 4 close neighbours.  RadiusNeighborsRegressor will take the average final score of those 4, while KNeighborsRegressor will find 26 quite dissimilar neighbours.  Both methods will likely have high margins of error in this case.

The graph above shows how confident our predictions become as we progress through the innings using the RadiusNeighborsRegressor method.  Early on, we have very little information to base our predictions on – the best we can do is give the historical average final score.  We become 80% confident at around the 13th over and 95% confident with about 2 overs to go.  It’s true that a lot can happen in the last couple of overs, but over a large number of games any differences tend to average out.

The final algorithm we’ll take a look at is RandomForestRegressor.  This is an example of an ensemble method, which takes a number of slightly different prediction models and combines them in such a way that the overall performance is better than any one model on its own.


The algorithm combines 1,000 models or estimators, specifically decision trees, and considers all three features when looking for the best split.  It had an R² value of 0.607 – very similar to the RadiusNeighborsRegressor algorithm.  However, this algorithm has the advantage that it can tell us about feature importance – which of the attributes is most significant in predicting the final score.


The table above shows that the current score is the best predictor of the final score.  It is about as informative as the wickets and balls remaining values combined.  This of course makes sense – good luck predicting the final score from just the number of wickets and balls remaining.  It also perhaps confirms the long-suspected notion that wickets in hand are overvalued.  You’ll often hear that a team that scores 180-3 from 20 overs has left a fair few runs on the table.

We can take a look at how the predictions from the random forests model perform on a real game – an IPL match earlier this season between Gujarat Lions and Kolkata Knight Riders.


The graph above shows the model’s predictions for the Lions’ first innings.  A rolling mean of 6 balls has been applied to smooth out the plot.  The model’s predictions never vary from the final score by more than 8 runs.  In contrast, the run rate projected score is a lot more erratic and only briefly predicts anything close to the final score.  The model barely dropped below 180 even after the Lions’ slowdown between the 10th and 16th overs.  It had faith in their ability to accelerate in the latter stage of the innings, something which the run rate projection does not take into account.
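For reference, the two simple calculations mentioned here, the run rate projection and the 6-ball rolling mean, can be sketched as follows.  The function names are mine, and the trailing window (which uses fewer than 6 values at the start of the innings) is an assumption about how the smoothing was applied.

```python
def run_rate_projection(current_score, balls_bowled, total_balls=120):
    """Project the final score by assuming the current run rate continues."""
    if balls_bowled <= 0:
        raise ValueError("need at least one ball bowled")
    return current_score * total_balls / balls_bowled


def rolling_mean(values, window=6):
    """Trailing mean over up to `window` of the most recent values."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


print(run_rate_projection(45, 30))            # 180.0
print(rolling_mean([1, 2, 3, 4, 5, 6, 7]))    # last value is the mean of 2..7
```

Because the projection divides by balls bowled, a single big over early in the innings moves it a long way, which is one reason it looks so erratic next to the model.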

To sum up, we’ve seen how we can take some simple facts about the current state of the game and look back in the vast array of ball-by-ball data to generate fairly accurate predictions of the final score of the 1st innings.  In the next article I will be using a similar suite of machine learning algorithms to predict the winner of the match during the 2nd innings.

Chris Gayle – a statistical analysis

An excellent article in The Independent recently described how, in last year’s World T20 final against the West Indies, England decided to open the bowling with Joe Root because Chris Gayle had ‘a poor record against off-spin’.  Although the idea worked, it made me think that surely these plans are justified more rigorously than by simply seeing which type of bowler dismisses him most often.  Root is a right-arm off-spin bowler just like, say, India’s Ravichandran Ashwin, but over a long period of time would probably come off worse against Gayle.  I’m sure England and other teams develop more detailed bowling plans that include what kinds of lines and lengths to bowl to particular batsmen, when to vary their pace and where exactly to position fielders, among other things.  All of this can be derived from data: what kind of deliveries do batsmen generally get out to, what areas of the ground do they target, etc.

Chris Gayle in T20’s

In this post, I want to use the metrics I’ve been developing and some visualisations to build up a statistical profile of a particular batsman in T20’s, in this case Chris Gayle himself.  It goes without saying that Gayle has a phenomenal record in this format: he is closing in on 10,000 runs from over 270 innings, with an average of over 40 and a strike rate of 150.  The plot below shows how Gayle’s average has fluctuated over the course of his T20 career.

Early on in his career, his average dipped to about 30 before there was a resurgence in his batting from his 50th match until his 100th.  He’s since been averaging in the mid 40’s.

The histogram below shows how his career scores are distributed, split up into 5 run bins.

It’s obvious that we’re not going to see a devastating Gayle innings in every match: he has made 78 50+ scores in 273 innings.  In fact, it’s more likely that we’ll see a failure from Gayle, as he has been dismissed for single figures 84 times in his career.  This is not entirely surprising for an opener in T20’s but it does show it’s not impossible to dismiss him cheaply.
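As a small sketch of how such a histogram and the two headline counts could be produced, the snippet below bins a list of scores into 5 run bins.  The scores are invented and the sketch ignores not-outs; only the final two ratios use figures quoted above.

```python
from collections import Counter


def score_histogram(scores, bin_width=5):
    """Count innings per bin, keyed by the lower edge of each bin."""
    return Counter((score // bin_width) * bin_width for score in scores)


# Made-up scores, purely for illustration.
toy_scores = [0, 3, 7, 12, 48, 51, 54, 100, 175]
histogram = score_histogram(toy_scores)
fifty_plus = sum(1 for s in toy_scores if s >= 50)
single_figures = sum(1 for s in toy_scores if s < 10)
print(sorted(histogram.items()))
print(fifty_plus, single_figures)  # 4 3

# The career figures quoted in the article, expressed as shares of innings.
print(round(78 / 273, 3))  # 0.286 of innings produced a 50+ score
print(round(84 / 273, 3))  # 0.308 of innings ended in single figures
```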

We can break this down further and see how he performs at different stages of the innings.  The plot below shows the total number of runs he’s scored in each over of a T20 innings.

It’s evident that the bulk of his runs come in the Powerplay overs before dropping off through the rest of the overs.  This is because it becomes more and more unlikely that he is actually still out in the middle in the later overs.  To account for this we can plot his average per over.

Apart from the 11th over and the last 2 overs, Gayle’s lowest average comes in the 3rd over.  In fact, this is the over where he is most frequently dismissed – a total of 28 times.  It’s peculiar to see that he averages nearly 100 in the 10th over which then suddenly drops to about 30 in the next over.  I’m not really sure why.
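A per-over average is just runs scored in that over divided by dismissals in that over, and the analogous strike rate is runs per 100 balls faced.  The sketch below computes both from ball records; the record layout and the toy data are assumptions for illustration.

```python
from collections import defaultdict


def per_over_stats(balls):
    """Aggregate (over, runs, dismissed) ball records into per-over figures.

    Returns {over: (runs, balls_faced, dismissals, average, strike_rate)};
    average is None for overs in which the batsman was never dismissed.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for over, runs, dismissed in balls:
        totals[over][0] += runs
        totals[over][1] += 1
        totals[over][2] += int(dismissed)
    stats = {}
    for over, (runs, faced, outs) in sorted(totals.items()):
        average = runs / outs if outs else None
        strike_rate = 100 * runs / faced
        stats[over] = (runs, faced, outs, average, strike_rate)
    return stats


# Made-up balls: (over number, runs off the bat, whether he was dismissed).
toy_balls = [
    (1, 0, False), (1, 1, False),
    (3, 4, False), (3, 0, True), (3, 6, False), (3, 0, True),
]
stats = per_over_stats(toy_balls)
print(stats[1])  # (1, 2, 0, None, 50.0)
print(stats[3])  # (10, 4, 2, 5.0, 250.0)
```

Overs with very few dismissals produce large, unstable averages, which may be part of the reason for the jump between the 10th and 11th overs.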

His strike rate generally increases through the course of the innings after the initial blitz in the Powerplay.  The 1st over is the only time when his strike rate is below 100, suggesting he tends to be quite circumspect at the start of his innings.

The graph above shows the average number of balls it takes Gayle to reach a particular score.  It illustrates his incredible ability to accelerate during an innings.  It takes him on average 10 balls to score his first 10 runs.  After that his 50 comes up in about 32 balls and if he gets to a century, it usually takes just over 50 balls.  Of course, sample sizes get quite small at that point so the graph becomes a little more erratic.

Gayle’s xR and xW

Now, we can look at what type of bowlers Gayle performs best against or otherwise.  This article describes exactly this quite well.  From a sample of 3 IPL seasons, it shows that Gayle thrives against left-arm slow bowling, striking at nearly 3 runs a ball.  However, his run-scoring is somewhat restricted by right-arm fast and off-spin bowling, going for just over a run a ball.

We can go further and see if there is any variation in Gayle’s expected runs and wickets against both spinners and seamers.  I use a dataset made up of the last 5 IPL seasons, consisting of 70,217 balls that produced 89,329 runs and 3,612 wickets.  Gayle’s IPL average and strike rate are not significantly off his career figures, so using just this data is sufficient for our analysis.  As before, I train a machine learning algorithm on the data to return an xR and xW figure for every ball.  The table below summarises the results.


So Gayle over-performs against both seam and spin bowlers compared to the average batsman.  Although he performs better against spinners than seamers, he is dismissed by them more often than expected.  This suggests that he takes a more hit and miss approach against spinners but balances risk and reward well against the seamers.
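The two ratios behind this comparison are simple to compute once every ball has an xR and xW figure.  The sketch below is a minimal illustration with invented balls; a runs/xR above 1 means scoring faster than expected and a wickets/xW above 1 means being dismissed more often than expected.

```python
def performance_ratios(balls):
    """Sum actual and expected figures, then return (runs/xR, wickets/xW).

    Each ball is (runs, xr, wicket, xw); a ratio is None if its denominator is zero.
    """
    runs = sum(b[0] for b in balls)
    xr = sum(b[1] for b in balls)
    wickets = sum(b[2] for b in balls)
    xw = sum(b[3] for b in balls)
    return (runs / xr if xr else None, wickets / xw if xw else None)


# Made-up balls against spin: 12 runs from an xR of 10, 1 wicket from an xW of 0.5.
toy_spin = [
    (1, 1.25, 0, 0.0625), (0, 1.25, 0, 0.0625),
    (4, 1.25, 0, 0.0625), (0, 1.25, 1, 0.0625),
    (6, 1.25, 0, 0.0625), (1, 1.25, 0, 0.0625),
    (0, 1.25, 0, 0.0625), (0, 1.25, 0, 0.0625),
]
runs_ratio, wickets_ratio = performance_ratios(toy_spin)
print(runs_ratio, wickets_ratio)  # 1.2 2.0
```

In this toy case the batsman scores 20% more than expected but is dismissed twice as often as expected, the same hit and miss pattern described above.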

Where to bowl (and not bowl) to Gayle


The beehive plot above shows Gayle’s dismissals in the IPL since 2012, split by seam and spin bowlers.  Spinners tend to dismiss him by bowling fairly straight while seamers tend to go wide of off stump or very short.  Of course this doesn’t tell you the full story.  We can also take a look at where to bowl to keep Gayle quiet.


The heat maps above show the distributions of 460 and 186 dot balls to Gayle from seamers and spinners respectively.  Seam bowlers give themselves the best chance by bowling back-of-a-length outside off, although the distribution is quite broad in terms of both line and length.  For spinners, keeping it very tight to the top of off stump is the way to go ensuring you’re not too full or too short.  Of course, these plans are fraught with risk as the next images show.


The heat map above shows the distribution of 227 balls that have been hit to the boundary by Gayle.  It’s clear that if you bowl fuller and outside off, you’re very likely to be hit to the boundary.  If you’re wondering what happens if you bowl straight to Gayle as a seamer, then this is where he is mostly restricted to 1’s, 2’s and 3’s.

For spinners, the margin for error between dot balls and boundaries is even smaller, as a comparison with the spinners’ dot ball heat map above shows.  The figure above illustrates the 92 boundaries Gayle has hit off spin bowling.  If you bowl fractionally too full and outside off stump then you are in trouble.  This confirms the hit and miss nature of bowling spinners to Gayle implied by his xR and xW figures.  If you want to bowl spin to him then you have to be prepared to go for runs before he is dismissed.

Gayle vs particular bowlers

We can now look at how specific bowlers who have been successful (or otherwise) have bowled to Gayle.  The table below shows the 10 bowlers he has performed worst against in terms of runs/xR and who have bowled at least 18 legal deliveries to him.


We get a mixture of both seam bowlers and spinners.  Gayle scores nearly 26 runs fewer than what we would expect the average batsman to score when facing Lasith Malinga.  We have to be careful with the interpretation here, however.  The average bowler would expect to concede 44 runs if they bowled the exact same deliveries as Malinga has.  But if Malinga (and other bowlers on the list) concede fewer runs than expected against most other batsmen, then there is something about these bowlers our model doesn’t quite capture.

The number of dismissals per bowler isn’t really large enough to form concrete conclusions from the dismissals/xR ratios, but it should be noted that he is dismissed by the spinners Sunil Narine and Harbhajan Singh more often than expected.

Let’s look at the heat maps from some of these bowlers with their wickets shown in red.

Firstly, Malinga has two distinct areas that he bowls to Gayle mixed in with the occasional short ball.  He favours a good length quite wide of off stump and also some yorker length deliveries aiming for the base of middle stump.


Steyn bowls more or less evenly between back of a length on off stump and a bouncer length towards Gayle’s helmet which has gotten his wicket once.


Narine’s heat map shows he bowls wide of off stump, unlike most spinners who target top of off.  His length is also incredibly consistent, as shown by the very narrow contours, suggesting he relies on variation in spin.


Ashwin, on the other hand, varies his line and length a lot more to Gayle.  He predominantly bowls quite full on off stump which, if you remember, is where Gayle hits a lot of boundaries against spinners.  Perhaps this suggests Gayle is relatively more cautious when facing Ashwin.

Finally, we can take a look at a bowler who hasn’t fared quite as well.

Bhuvneshwar Kumar has gone for 78 runs from 51 balls with a runs/xR figure of 1.22 against Gayle.  He mostly bowls in that area where seamers typically keep Gayle quiet.  However, when he misses his length he goes for a lot of boundaries, shown by the blue balls.  This again stresses how little margin for error you have when bowling to Gayle.

Setting a field to Gayle

Another component to restricting and ultimately dismissing Gayle is field placement.  The wagon wheel below shows 6,037 of his runs from 2,260 scoring shots.


It’s obvious from watching him that a lot of his boundaries come in the deep midwicket to long-off region.  He also doesn’t run many 2’s or 3’s.


His dot balls mainly come from deliveries played back to the bowler.  Having a fielder close in on the off-side is also a source of quite a few dot balls, as well as conventional point and cover fielders.  I assume all those dot balls on the boundary are when he’s turned down singles although I’m not entirely sure he’s done it that many times.

Gayle is caught 64% of the time he is dismissed.  The wagon wheel below shows 85 instances of him being caught in the outfield separated by seam and spin bowling.


He is caught behind and in the slips quite often to both seamers and spinners, suggesting it is worth having a slip in, especially early on in his innings, even in T20’s.  Given where he scores most of his runs, it’s only a matter of time before he mishits one and gets caught at long-on or deep midwicket.

Data needs context

Overall, we’ve seen how we can use data to identify the strengths and weaknesses of a batsman and thus formulate bowling plans.  However, it’s important to note that we haven’t found the silver bullet to dismissing Gayle cheaply every time we bowl to him.  Just because he gets out to a particular type of delivery really often, doesn’t mean teams should focus their entire pre-match training on hitting that one spot.  We know that variation is important in T20’s.  Looking deeper, we might find it’s a particular string of deliveries that set him up before dismissing him, or that he is dismissed this way after getting a big score.  As ever, more investigation is required.

Any comments/questions?  Tweet me here.

India v England T20I series analysis

In this blog I wanted to take a brief break from refining our Expected Runs and Expected Wickets models and instead use these metrics, in their current form, to analyse the recent India v England T20I series.  India came back from 1 nil down to win the series 2-1.  The table below shows a summary of the series in terms of the total xR and xW for each team in each match.  To give some context, batsmen outperformed in this series overall, scoring 875 runs compared to a total xR of 828.8.  In contrast, there were 35 wickets lost (excluding run outs) compared to a total xW of 34.81.


In the first match, England bowled reasonably well to restrict India to 147-7 in their innings.  This was 6 runs below their total xR suggesting their relatively low score was due to poor batting more than good bowling.  In reply, England outscored their total xR by nearly 26 runs.  This, along with losing nearly 2 wickets fewer than expected suggests a pretty good batting performance overall.  In the next match, both teams performed near enough as expected in terms of both runs and wickets.

In the final match, India significantly outperformed their xR to post a total of 202-6.  While England did too, it was nowhere near enough to challenge India’s score.  Slightly outperforming an already low xR total won’t win you many games.  The most damning aspect however, is England losing more than double the expected number of wickets.  In fact, their xW for that match was the lowest of the series highlighting just how inept that batting collapse was.


We can look at how individual batsmen performed throughout the series.  The table below shows some statistics for the top 10 series run scorers.


Although Joe Root top scored in the series, he did under-perform according to his xR.  However, he did have the lowest wickets/xW of any batsman, suggesting he had to dig in at times.  He was dismissed only twice even though India bowled well enough to him to dismiss an average batsman more than 5 times.

The beehive plot above shows how India bowled to Root in the series.  I think there was a definite plan to bowl very straight and wide of leg stump, with fielders out on the leg side boundary.  This seemed to work as he was mainly restricted to singles in this area and ended up with an overall strike rate of just over 100.

The heat map above shows the extent of India’s bowling plan along with Root’s 2 dismissals.  The darker the green the greater the number of balls bowled in that region.

MS Dhoni had a similar story to Root – scoring below xR, but batting well enough to survive periods of good bowling from England.


This heat map shows an even more pronounced plan from England to bowl at the top of leg stump and occasionally full and outside off stump to Dhoni.  The purple balls show all of Dhoni’s boundaries, which accounted for 44 of his 97 runs in the series.  His overall strike rate of 139 suggests this plan didn’t quite work for England, in contrast to India’s plan to Root.

Also from the table above, it seems Sam Billings had a pretty poor series, giving his wicket away nearly 3.5 times more often than expected.  The beehive plot below shows his dismissals and every other ball he faced in the series.


It shows how he was undone by the surprise bouncer from Ashish Nehra in the 2nd T20I, which had an xW of 0.136 – the highest he faced.  His other 2 dismissals had xW’s of 0.045 and 0.034.  He faced 8 balls which had higher xW’s, from which he scored 17 runs.  Although he is an opener, this perhaps shows his need to work on picking the right ball to hit.

England vs spinners

Another theme to come out of this series was, predictably, England’s poor batting performance against spinners.  We can see if this is borne out with our xR and xW metrics.  The table below compares England’s xR and xW for both seamers and spinners.


England produced an average performance against seamers, scoring and losing wickets more or less in line with expectation.  Against spinners, they over-performed in terms of scoring runs but lost nearly 3 more wickets than expected.  If we exclude Suresh Raina from India’s list of spinners, England’s runs/xR drops to 1.064 and wickets/xW increases to 1.63, i.e. nearly 4.3 wickets more than expected.  This can only confirm how dreadfully poor some of England’s shot selection was in this series against spinners.


The heat maps above show how India’s spinners bowled to right and left-handed batsmen with their wickets in red.  The dark green patches tend to be wide but quite flat.  This suggests that their spinners get quite consistent bounce and rely on movement off the pitch for variation.  We can compare this to England’s spinners, again bowling to right and left-handed batsmen.


Interestingly, the heat map is narrower and less flat.  This suggests the England spinners relied more on variation in length than in line.

Overall, we’ve shown how the xR and xW metrics, together with some visualisations, can be useful in a post-match analysis.  We can make use of this data to confirm or challenge any conclusions that commentators or we as viewers may make about team or player performances.  As I refine these models further, I’ll be sure to do some more match analyses in between to make sure the metrics remain valid.

If there’s anything you think I could add, let me know here.

Adding game state to xR and xW

In previous blogs I have described how I used the speed of the ball, and its line and length, to calculate the Expected Runs and Expected Wickets of that ball.  In this blog, I incorporate game state into these metrics, i.e. the over of the innings in which the ball was bowled.  The run rate, and indeed the probability of getting a wicket, is not constant throughout the course of a T20 innings.  Using data from nearly 4,000 T20 matches, we can calculate the average number of runs from each over of an innings, shown in the graph below.

Teams generally have a bit of a go in the powerplay overs then completely start over in the 7th over – something I’ve always found a bit peculiar.  Usually, a spinner comes on and they knock him about in that over.  If teams target this over when the fielders have just recovered from the powerplay, perhaps they can increase their average scores by 2 or 3 runs – enough to win maybe 5% more games?  Anyway, the average number of runs then increases in a pretty linear fashion.

The point is that the run rate fluctuates and xR can be adjusted to reflect that.  The historical run rate in T20 matches is about 6.98.  This is the benchmark that will be used to set the ‘value’ of overs.  For example, on average there are 9.07 runs scored in the 20th over, so the xR of balls in this over will be multiplied by 9.07/6.98 = 1.30.  In other words you would expect 30% more runs to be scored from the exact same balls than in the 13th over where the multiple is about 1.  Similarly, the multiple of the 1st over is 0.71 i.e. 4.95/6.98.

The average number of wickets in an over follows a similar shape.

Again we see that dip in the 7th over where both sides just drop in intensity a bit.  Toward the end of the innings the wickets tend to fall at an exponential rather than linear rate.  As before, we can adjust xW to account for this variation.  The historical average wickets per over is 0.318.  So the xW of balls in the last over, for example, is multiplied by 0.80/0.318 = 2.52.  However, this never results in the xW of a particular ball exceeding 1.  In fact the highest xW of a single ball is 0.704.
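Both adjustments are the same calculation: divide the over's historical average by the all-overs benchmark and scale the ball's raw figure by that ratio.  The sketch below uses the averages quoted above; the function names and the raw xW of 0.1 in the last line are hypothetical.

```python
# Benchmarks and per-over averages are the figures quoted in the article.
HISTORICAL_RUNS_PER_OVER = 6.98
HISTORICAL_WICKETS_PER_OVER = 0.318


def over_multiplier(average_in_over, historical_average):
    """Ratio of an over's historical average to the all-overs benchmark."""
    if historical_average <= 0:
        raise ValueError("historical_average must be positive")
    return average_in_over / historical_average


def adjust(raw_value, average_in_over, historical_average):
    """Scale a ball's raw xR or xW by the multiplier for its over."""
    return raw_value * over_multiplier(average_in_over, historical_average)


xr_multiplier_20th = over_multiplier(9.07, HISTORICAL_RUNS_PER_OVER)
xr_multiplier_1st = over_multiplier(4.95, HISTORICAL_RUNS_PER_OVER)
xw_multiplier_20th = over_multiplier(0.80, HISTORICAL_WICKETS_PER_OVER)

print(round(xr_multiplier_20th, 2))  # 1.3
print(round(xr_multiplier_1st, 2))   # 0.71
print(round(xw_multiplier_20th, 2))  # 2.52

# A hypothetical ball with a raw xW of 0.1 bowled in the 20th over.
print(round(adjust(0.1, 0.80, HISTORICAL_WICKETS_PER_OVER), 3))  # 0.252
```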

We can now see how this affects our batsmen and bowler ratings.  The correlation between the old xR and updated xR is very strong, with an R² of 0.996, so we would expect few changes.  The table below shows the best and worst performing batsmen by runs/xR with over 300 runs.


The best batsman according to the updated metric is once again Glenn Maxwell, followed by Aaron Finch.  However, Darren Sammy, who was previously in 3rd position, has dropped all the way to 33rd.  This is a reflection of his under-performance in scoring runs near the end of the innings.  This may also be true of MS Dhoni, whose runs/xR figure drops from 1.00 to 0.889 – the equivalent of about 76 runs.  Dhoni is a great finisher of an innings but perhaps picks the right ball to hit a little too conservatively.

An example of a batsman who sees an increase in their runs/xR multiple is David Warner, from 1.14 to 1.21 – the equivalent of 56 extra runs.  This perhaps illustrates the fact that he performs better in the first few overs of the innings than other openers and top-order batsmen.  If you look at the first graph above, you’ll see that of the first 12 overs of an innings, only 3 (overs 4, 5 and 6) are above the average run rate of about 7.  This means batting in this period will result in your xR being discounted, as it were.  Other examples include Aaron Finch and Chris Gayle, who this updated model rates more favourably.  On the other hand, Luke Wright’s runs/xR drops ever so slightly from 1.13 to 1.12.

For bowlers with at least 200 balls, the top and bottom 10, measured by xR/ball, are shown below.  It should be noted again that spinners historically have had a lower economy rate than pace bowlers in T20 cricket.  This, coupled with the fact that spinners tend to bowl in the middle overs, means that this updated model favours spinners slightly more than pace bowlers.


As expected, spinners make up the bulk of the top performers list.  The best performing fast bowler is South Africa’s Lonwabo Tsotsobe with an xR/ball of 1.16.  One important difference to the previous model is the fact that the xR/ball figures have all dropped for bowlers in the top 10 and increased for bowlers in the bottom 10.  Mohammad Hafeez’s xR/ball has gone from 1.08 to 1.00 while Dwayne Bravo’s has gone from 1.40 to 1.50.  This shows how the new model can be useful in identifying the best bowlers at certain stages of the innings.

Furthermore, we can see how the updated xW model affects both batsmen and bowlers.  You can see from the second graph above that the wicket rate for every over up to the 14th is below the average wicket rate, so the xW of any balls in that period will be discounted and vice versa.  The best and worst performing batsmen, with at least 15 dismissals, measured by dismissals/xW are shown below.


This time JP Duminy is top after adjusting xW for game state.  At the other end, surprisingly Shahid Afridi drops down to 4th worst with Michael Lumb taking his place.  Interestingly there are quite a few openers in the bottom 10 compared to the previous model even though this model favours batsmen who bat in the opening overs.  This possibly highlights the hit-and-miss nature for openers in T20’s.  Perhaps an extension to this model would include adjustment factors for different positions in the batting order.


Expected Strike Rate (xSR) is calculated from the number of balls divided by xW.  In general, pace bowlers have had a lower strike rate than spinners in T20 cricket so it is not surprising to see only fast bowlers in the top 10 and mostly spinners in the bottom 10.  Tsotsobe, however, who we saw previously to be the best performing pace bowler in terms of runs/xR, has the highest xSR of any bowler.  This is consistent with his career stats, which suggest he is a bowler who keeps it tight in the latter overs without getting a load of wickets.
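As a quick illustration of the xSR definition, the sketch below divides balls bowled by total xW and ranks bowlers who meet the 200-ball threshold used earlier.  The bowler names and figures are invented.

```python
def expected_strike_rate(balls_bowled, total_xw):
    """Expected balls per wicket; None if no wickets are expected."""
    if total_xw <= 0:
        return None
    return balls_bowled / total_xw


def rank_by_xsr(bowlers, min_balls=200):
    """Sort qualifying (name, balls, total_xw) records from lowest to highest xSR."""
    qualifying = [b for b in bowlers if b[1] >= min_balls and b[2] > 0]
    return sorted(qualifying, key=lambda b: expected_strike_rate(b[1], b[2]))


# Made-up bowlers, purely for illustration.
toy_bowlers = [
    ("Bowler B", 300, 10.0),  # xSR 30.0
    ("Bowler A", 240, 12.0),  # xSR 20.0
    ("Bowler C", 150, 10.0),  # fewer than 200 balls, so excluded
]
ranking = [name for name, _, _ in rank_by_xsr(toy_bowlers)]
print(ranking)  # ['Bowler A', 'Bowler B']
```

A lower xSR means the bowler's deliveries are expected to take wickets more frequently, regardless of how many wickets they actually took.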

Overall, the intuition behind adding game state to the model is simple – if a player scores more runs or takes more wickets than average at a particular stage of the innings, then they deserve some credit, and it tells us something about their game.  I realise this article consisted of a lot of tables of data, but I’ll be sure to include some more informative graphics in the future.  In the next post I hope to look into game state further, specifically which batsmen and bowlers perform the best in the powerplay and death overs and why they’re able to do so.

Questions/suggestions?  Tweet me here.

Measuring Expected Wickets

After refining my Expected Runs model in my last blog, I will now apply the same methods to investigate wicket taking deliveries.  I use the same factors, namely where the ball pitched, the line and length when it gets to the batsman and the speed of the ball.  For a particular delivery, the nearest neighbour algorithm finds the 50 most similar deliveries in terms of those attributes.  The number of wickets resulting from those deliveries divided by 50 is the Expected Wickets, or xW, of that ball.
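The xW calculation described here can be sketched in plain Python.  This is a hand-rolled illustration rather than the actual model: the feature layout (length, line, speed), the toy deliveries and the use of unscaled Euclidean distance are all assumptions, and `k` is set to 4 only because the toy history is tiny.

```python
import math


def expected_wickets(history, delivery, k=50):
    """Fraction of the k most similar past deliveries that took a wicket.

    `history` holds (features, took_wicket) pairs; `delivery` is a feature tuple.
    """
    if k < 1 or k > len(history):
        raise ValueError("k must be between 1 and the number of past deliveries")
    nearest = sorted(history, key=lambda item: math.dist(item[0], delivery))[:k]
    return sum(1 for _, took_wicket in nearest if took_wicket) / k


# Made-up deliveries: (length in metres, line in metres, speed in km/h).
toy_history = [
    ((6.0, 0.1, 135.0), True),
    ((6.2, 0.0, 136.0), False),
    ((5.9, 0.2, 134.0), False),
    ((6.1, 0.1, 135.5), True),
    ((2.0, -0.5, 90.0), False),
    ((10.0, 0.6, 145.0), False),
]
xw = expected_wickets(toy_history, delivery=(6.0, 0.1, 135.0), k=4)
print(xw)  # 0.5: two of the four most similar deliveries took a wicket
```

In practice the features would need to be put on comparable scales first, otherwise speed in km/h would dominate the distance.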

We use the same dataset as before – 41,104 balls from 208 T20I matches.  The algorithm predicts a total of 2,279 wickets compared to the actual 2,298 wickets.  Firstly, we can visualise which kinds of balls are most likely to take a wicket.  The image below shows the pitch map and beehive plot of the 100 balls with the highest xW figures, ranging from 0.20 to 0.28.  Red points indicate actual wickets and deliveries to left-handers have been flipped so they can be compared.

We can observe that these 100 balls cluster into 5 distinct bowling areas: yorker length deliveries, balls on middle and leg stump, good length top of off, back of a length outside off and bouncers outside off stump.  Interestingly, the few deliveries which pitched beyond the stumps are full tosses, highlighting their knack for getting wickets.

The performance of a batsman can be measured by comparing how many times they were dismissed to their total xW, or how many times we would expect the average batsman to be dismissed if they faced the exact same deliveries.  The graph below shows the xW of every batsman in the dataset against how many times they were actually dismissed.  The grey line separates over and under-performing batsmen.

xW.png

The over-performing batsmen are below the line and those tend to be the ones who have batted (and been dismissed) most often.  It seems the more established batsmen get out less often than the metric predicts, compared to the many tail-enders who hardly bat in T20 matches.
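As a rough illustration of the comparison above, the dismissal-to-xW ratio can be computed by summing the xW of every ball a batsman faced and dividing their dismissals by that total.  The record layout and the `min_dismissals` parameter are assumptions made for this sketch, and the data is made up.

```python
from collections import defaultdict

def dismissal_ratios(balls, min_dismissals=1):
    """Return {batsman: dismissals / total xW}.

    balls -- iterable of (batsman, xw_of_ball, was_dismissed) records
    A ratio below 1 means the batsman is out less often than the
    average batsman would be against the same deliveries.
    """
    total_xw = defaultdict(float)
    dismissals = defaultdict(int)
    for batsman, xw, was_dismissed in balls:
        total_xw[batsman] += xw
        dismissals[batsman] += int(was_dismissed)
    return {
        b: dismissals[b] / total_xw[b]
        for b in total_xw
        if dismissals[b] >= min_dismissals and total_xw[b] > 0
    }

# Batsman A faces 40 balls worth 2.0 xW and is out once (ratio 0.5);
# batsman B faces 10 balls worth 0.5 xW and is out once (ratio 2.0).
balls = [("A", 0.05, i == 39) for i in range(40)]
balls += [("B", 0.05, i == 9) for i in range(10)]
ratios = dismissal_ratios(balls)
```

Raising `min_dismissals` filters out batsmen with too few dismissals for the ratio to be meaningful.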

The table below shows the best performing batsmen calculated by the ratio of wickets to xW, for batsmen dismissed at least 15 times.


Virat Kohli comes out on top with a dismissal to xW ratio of just 45%.  This means he is dismissed less than half the number of times the average batsman would if they faced the same type of deliveries he has.  This is a true testament to his ability to negotiate any kind of bowling and still score runs at a good rate.  The other end of the scale looks like this:


Perhaps unsurprisingly, notorious wicket giver-away Shahid Afridi comes out way on top with a dismissal to xW ratio of 1.70.  In other words, he has given away 70% more wickets than the average batsman would have done facing the same balls.  Further down the list are some more notable hitters like Darren Sammy and Kieron Pollard.  Interestingly, AB de Villiers is under-performing with respect to this metric, which may suggest the need to incorporate game state into the model.  It is likely this ratio increases toward the end of the innings as batsmen become more willing to sacrifice their wickets.

It would be pertinent to view the relationship between wickets/xW and runs/xR as these are both metrics which independently rate how good a batsman is.

xW_xR.png

The graph above shows a slight negative trend as you would expect.  The more runs you score above expectation, the less likely you are to give your wicket away compared to the average batsman.  However, an R² of 0.185 suggests the relationship is weak.  This does mean the two metrics can be treated more or less independently and therefore be combined in some way to rate a batsman more reliably.
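Assuming the figure quoted above is the R² of a straight-line fit between the two ratios, it equals the squared Pearson correlation and can be computed directly.  The function below is a minimal sketch with made-up numbers, not the calculation behind the graph.

```python
def r_squared(xs, ys):
    """R² of a simple linear regression of ys on xs.

    For a straight-line fit this is the squared Pearson correlation,
    i.e. the share of the variance in ys explained by xs.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Toy values: a perfect line gives 1.0, a noisier relationship gives 0.64.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])
noisy = r_squared([1, 2, 3, 4], [1, 3, 2, 4])
```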

Out of interest, the leader of the runs/xR metric, Glenn Maxwell, has a wicket/xW ratio of 0.979 meaning he is over-performing ever so slightly.

We can also investigate how bowlers fare with this metric.  The table below shows the best performing bowlers (with at least 300 balls) ordered by Expected Strike Rate, or xSR.  This is the number of balls bowled divided by xW.


Mitchell Starc is first with 15.8 balls per Expected Wicket.  Next come 8 more pace bowlers before the first spinner, Ravindra Jadeja.  For reference, the strike rates in all T20I matches for pace bowlers and spinners are 19.0 and 19.9 respectively, so we would expect this split.  The bottom 10 bowlers are as below.


Here we see mostly spin bowlers with the only two seam bowlers in 9th and 10th place.  Interestingly we see Mohammad Hafeez and Sunil Narine in the list, both of whom the xR/ball metric rated highly.  In both lists there are bowlers who take more or fewer wickets than xW predicts.  This is not something xSR takes into account as the xW of a delivery is only a function of the attributes of that delivery and not the eventual outcome.  Bowlers who happen to take fewer wickets than expected have most likely bowled to more capable batsmen more often and vice-versa.
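The definition of xSR can be written out as a short sketch.  The function name and the numbers are illustrative only; the point is that the inputs are the xW values of the balls bowled and nothing about what actually happened to them.

```python
def expected_strike_rate(ball_xws):
    """Balls bowled per Expected Wicket for one bowler.

    ball_xws -- list with the xW of every delivery the bowler has bowled.
    Actual wickets never enter the calculation.
    """
    total_xw = sum(ball_xws)
    if total_xw == 0:
        raise ValueError("xSR is undefined when total xW is zero")
    return len(ball_xws) / total_xw

# 300 balls worth 0.05 xW each add up to 15 Expected Wickets: xSR of 20.
xsr = expected_strike_rate([0.05] * 300)
```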

I hope this article has demonstrated the usefulness and validity of the Expected Wickets metric in rating the performances of batsmen and bowlers.  In future blogs I will see how this complements the Expected Runs metric and hope to include more variables to refine the model further.

As ever, if you have any questions/suggestions please feel free to tweet me here.

xR with machine learning

In my first blog I described my first attempt at a metric called Expected Runs or xR.  This calculated the average number of runs a batsman would score off a delivery based on the ball’s line and length as it reaches the batsman.  In this blog, I extend this model by including other factors such as the speed of the ball and where it pitched.

I recently used a machine learning algorithm called k-nearest neighbours to develop an Expected Goals model.  This worked out pretty well so I thought I’d apply this to my cricket data.  Put simply, this algorithm finds the k (I used 50) most similar deliveries to a particular delivery and calculates the average number of runs accrued from those.  This means that it is not necessary to split bowlers into spinners and fast bowlers, as balls bowled at a typical spinner’s or seamer’s pace are likely to be grouped together.

The model is validated by randomly splitting the dataset into a training set (from which the patterns are identified) and a test set which is a fifth of the original dataset.  This is done 10 times to cross-validate the model and reduce the effect of any bias in the training set.  For anyone interested the Python code is below:
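As a rough, dependency-free sketch of that procedure (not the original listing), the repeated random split and the nearest-neighbour average might look like the following.  All names and the synthetic data are hypothetical.  With scikit-learn, the same idea would more likely be written with `KNeighborsRegressor(n_neighbors=50)` and `ShuffleSplit(n_splits=10, test_size=0.2)`.

```python
import math
import random

def knn_predict(train, point, k):
    """Average runs of the k training deliveries closest to `point`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], point))[:k]
    return sum(runs for _, runs in nearest) / len(nearest)

def repeated_holdout(data, k=50, n_repeats=10, test_fraction=0.2, seed=0):
    """Repeat a random train/test split and return the relative error
    between total predicted and total actual runs on each test set."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        predicted = sum(knn_predict(train, x, k) for x, _ in test)
        actual = sum(runs for _, runs in test)
        errors.append(abs(predicted - actual) / actual)
    return errors

# Synthetic stand-in data: (line, height, speed) -> runs off the ball.
gen = random.Random(1)
data = [
    ((gen.uniform(-1.5, 1.5), gen.uniform(0.0, 2.5), gen.uniform(50, 90)),
     gen.choice([0, 0, 1, 1, 2, 4, 6]))
    for _ in range(500)
]
errors = repeated_holdout(data, k=50)
```

Each entry in `errors` is the relative gap between total xR and total runs on one held-out fifth of the data, so averaging the ten values gives a more stable estimate than any single split.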


From the ball-by-ball T20I data I removed any wides and erroneous deliveries.  In contrast to my first xR model I decided to mirror flip any deliveries to left-handers so that they can be equivalently compared to right-handers.  This left 41,104 balls from 208 matches.  After applying the algorithm as above, the total xR was 50,978 runs compared to 50,923 actual runs – a 0.1% error.  This is not surprising at all given that the algorithm learns on itself.  Almost by definition, the total actual runs and xR must more or less match.

As before, we can calculate the xR of individual batsmen.  The plot below shows each batsman’s career runs against their xR.  The straight line is y=x and acts as the dividing line between over and under-performing their expectation.

xr

Most of the highest scoring batsmen are over-performing which perhaps means that there is an aspect of their game we are not capturing with this metric.  One possible factor is boundary hitting.  The runs/xR ratio for a particular ball can take values up to and above six.  As such, batsmen who hit a lot of boundaries can inflate their performance with this metric, not that boundary hitting is a bad thing.  The usefulness of xR comes from identifying batsmen who can score more runs from deliveries than on average and this includes rotating the strike and hitting into gaps for twos and threes.  The graph below shows the total number of boundaries for each player against their performance figure, measured by runs/xR.

blog.png

Although there is a slight positive correlation, the R² value is just 0.119.  This means that total boundaries only explains 11.9% of the variance in the performance measure.  The R² drops to 0.093 if we only include batsmen who have scored at least 20 boundaries, indicating that it is even less significant for the better batsmen.  The table below shows the top 20 performing batsmen with at least 300 runs (of which there are 52).

As before, Glenn Maxwell comes way out on top over-performing his xR by 42%, followed by his Australian teammate Aaron Finch.  Finch and Sammy have swapped places from the list in my first xR model.  Some other notable changes include Shahid Afridi dropping from 5th to 12th in the list, and Mahela Jayawardene breaking into the top 10.

At the other end of the scale, the table is as below:


This time Martin Guptill comes bottom of the pile instead of Mohammad Hafeez who is in second place.  It is reassuring that the runs/xR figures themselves have not changed drastically.  In this and the previous model, they are mostly between 0.9 and 1.2.

The tables below show the 10 best and worst performing bowlers who have bowled at least 200 balls (there are 60 such bowlers in the dataset).


In my previous blog about xR, I questioned Hafeez’s place in the Pakistan T20 side based on his batting stats.  But it appears his bowling justifies his inclusion as he has the lowest xR/ball in the list.  As explained in that post, bowlers who concede more runs than expected are most likely unlucky enough to bowl to quality batsmen quite often.  Sunil Narine slips down to 8th, although he still concedes over 100 runs fewer than expected.

Spinners dominate the top of the list with the best performing pace bowler, Bhuvneshwar Kumar, coming in at 18th place with an xR/ball of 1.205.  This is not entirely unexpected, as spinners have an overall economy rate of 6.89 in T20I’s compared to 7.70 for pace bowlers.  The full lists for both batsmen and bowlers can be obtained here.

In future blogs I will be extending this analysis to ODI’s and Test matches and investigating xR at a match level.  I’ll also be experimenting with other machine learning algorithms to predict both runs and wickets.

If you have any suggestions or ideas of your own, please feel free to tweet me.

Expected Goals using machine learning

In my last blog I built two simple xG models using the distance of a shot from the goal line and the angle a shot makes with the crossbar.  The second model attempted to account for the fact that, for the same distance, shots from wide areas are less likely to be scored than shots from right in front of goal.  However, this was not perfect as it still undervalued shots from directly in front of goal closer to the penalty spot.  Ideally, our xG visualisation should look like this (courtesy of David Sumpter) with high xG shots directly in front of goal and low xG shots in the wide areas but still close to the goal.

0-q0teyhg1hhrczcco

This can be achieved by combining both distance and angle to goal into one function.  But this is not a trivial problem as described in detail by Michael Caley.  Through trial-and-error, he manages to capture the above distribution pretty well using a combination of distances and angle and inverse distances and angles.

Another approach is to consider clusters of shots.  A set of shots that are geometrically close together will have similar distances and angles so should therefore have a similar xG.  This is the concept of binning as described in this model.  The number of goals divided by the number of shots from inside a particular bin gives us the xG for those shots.


My approach uses a k-nearest neighbour algorithm which is a simple machine learning technique.  For a particular shot, scikit-learn’s KNeighborsRegressor will search for the 500 closest shots and then count up the number of goals.  This number divided by 500 is the xG for that shot.  The image below is a visualisation of this model showing all 226,917 shots in the dataset.

xG4.png

This seems much more like the xG distribution from above.  However, the xG of shots in the 6 yard box right in front of goal is about 0.8 whereas my previous models predicted values closer to 1.  This is a result of the high value for k.  The 500 nearest neighbours to shots near the goal line will draw upon shots in the edges of the 6 yard box due to the relatively low density of shots.
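The dilution effect of a large k can be reproduced with a toy example.  This is a brute-force sketch with invented coordinates (distance from the goal line, lateral offset), not the model itself: five close-range shots that were all scored sit next to twenty longer-range shots that were all missed.

```python
import math

def knn_xg(shot, shots, k):
    """Goals among the k shots closest to `shot`, divided by the
    number of neighbours used."""
    nearest = sorted(shots, key=lambda s: math.dist(shot, s[0]))[:k]
    return sum(goal for _, goal in nearest) / len(nearest)

# Sparse region: 5 shots within a metre of goal, all scored.
close = [((0.2 * i, 0.0), 1) for i in range(5)]
# Dense region: 20 shots from 5 metres or more, none scored.
far = [((5.0 + 0.1 * i, 0.0), 0) for i in range(20)]
shots = close + far

small_k = knn_xg((0.0, 0.0), shots, k=5)    # only close shots are used
large_k = knn_xg((0.0, 0.0), shots, k=10)   # half the neighbours are far shots
```

With k=5 the estimate on the goal line is 1.0, but with k=10 it falls to 0.5 because there are not enough nearby shots to fill the neighbourhood, which is the same mechanism that pulls the 6 yard box values down when k is 500.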

There are 21,578 goals in the whole dataset.  This model predicts 21,409 goals which is an error of just 0.8%.  We can investigate how teams and players perform under this metric.  The graph below shows the total xG for each team in the dataset against the total goals they scored.  The line is y=x and shows over-performing teams above the line and under-performing teams below the line.

teams_xg

The really good teams over-perform meaning there are aspects of their play that aren’t captured by a shots-based xG model.  These teams ordered by their performance, measured by total goals divided by total xG, are shown below.


Similarly, we can assess players using xG.  The graph below shows xG against total goals per player.

players_xg

Again it seems the best players over-perform their xG numbers.  A breakdown of players with at least 60 goals is shown in the table below.


Luis Suárez is over-performing his xG by a huge 72%, closely followed by Messi and Griezmann at 70%.  It is interesting to note that Benzema has an xG per shot of 0.154, the highest of the players in the list.  This, coupled with his high over-performance figure, suggests he is finishing really well and getting into high value positions to shoot.  Meanwhile, his Real Madrid teammates, Ronaldo and Bale have the lowest xG/shot.  It seems they have their share of speculative efforts before Benzema comes in to clean up if they fail.

It is reassuring that this xG model identifies the best teams and players, validated by other models.  The machine learning algorithms can of course be extended to include many other factors.  In future blogs, I’ll look to build these into my cricket models.

A simple Expected Goals model

Whilst this blog will focus on cricket, a lot of concepts will be inspired by work from other sports.  For example, my Expected Runs model describes the average number of runs that would be scored from a delivery with particular attributes such as its line and length.  This is equivalent to the well established Expected Goals (xG) metric from football which measures the quality of a chance.  A simple search will glean many models that take into account a number of factors such as the distance and angle from goal, which body part was used, build-up play etc.  Using some data that I had lying around, I wanted to see if I could replicate the results of some of these models.

Almost all xG models are fundamentally based on the location of the shot.  In fact, this blog suggests that you can construct an xG model that is 95% reliable just by considering the location of shots.

The image below shows the locations of 234,018 shots from 9,133 matches I have in my database from various leagues and competitions around the world (attacking left to right).

all-shots

The cluster of shots in the defending penalty area are not opportunistic goalkeepers but own goals.  This is illustrated by the next image which shows all 24,347 goals in the dataset.

all-goals

Just from these images we can immediately start forming some hypotheses such as the closer and more central you are to the goal the more likely you are to score.  Compare the density of the cluster of shots outside of the penalty area to the goals from this area.  Also, consider that the cluster inside the penalty area extends to the edges of the box, while the number of goals in the wide areas is significantly lower.

For the purposes of my simple xG model, I excluded own goals and penalties and only considered shots in the boxed area shown below.

df-shots

This leaves us with 21,424 goals from 225,372 shots.  I calculated the distance of each shot from the middle of the goal line giving us shots that ranged from 0 to 40 metres.  I then placed all these shots into forty 1 metre bins.  Within each bin, the ratio of goals to shots is the expected goals figure or xG for each of those shots.  For example, a shot from 10.5 metres out has an xG of 0.129 because there has been a total of 9,267 shots between 10 and 11 metres out resulting in 1,194 goals.
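The binning step can be sketched as below.  The function and its input layout are hypothetical, but the example reuses the counts quoted above for the 10 to 11 metre bin.

```python
import math
from collections import defaultdict

def binned_xg(shots, bin_width=1.0):
    """Return {bin_index: goals / shots} for distance bins.

    shots -- iterable of (distance_in_metres, is_goal) pairs
    Bin i covers distances from i * bin_width up to (i + 1) * bin_width.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [goals, shots]
    for distance, is_goal in shots:
        index = math.floor(distance / bin_width)
        counts[index][0] += int(is_goal)
        counts[index][1] += 1
    return {index: goals / n for index, (goals, n) in counts.items()}

# 9,267 shots from between 10 and 11 metres, of which 1,194 were goals.
shots = [(10.5, True)] * 1194 + [(10.5, False)] * (9267 - 1194)
xg_by_bin = binned_xg(shots)
xg_at_10_5 = round(xg_by_bin[10], 3)
```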

If we do this for every shot, we get the following graph:

graph

We can see that a shot from less than 1 metre out is more or less certain to be a goal.  Beyond this, there is a rapid drop in xG up to 10 metres before slowly converging to zero at large distances.  This sort of graph can typically be fitted by an exponential decay curve.

graph2

By eye the curve seems like a decent fit for our data.  We can use the equation to estimate the xG of a shot from just its distance from the goal.  For example, d = 10.5 metres gives us an xG of 0.157 goals.  This is slightly higher than the empirical result from earlier.

The equation, however, breaks down at very small distances.  Specifically, any distance below 0.7 metres will return a value greater than 1 for xG.  Nevertheless, the r-squared value is 0.987, similar to what is obtained here.
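To illustrate why a curve of this kind exceeds 1 at short range, here is a sketch that assumes the form xG = A·exp(−k·d) and fits it by least squares on the logarithm of xG.  That is a simplification chosen to keep the example dependency-free (a direct non-linear fit weights the points differently), and the parameters A = 1.1 and k = 0.2 are purely illustrative rather than the fitted values behind the graph.

```python
import math

def fit_exponential(distances, xgs):
    """Fit ln(xG) = ln(A) - k * d by ordinary least squares.

    Returns (A, k). Requires every xG value to be greater than zero.
    """
    logs = [math.log(v) for v in xgs]
    n = len(distances)
    mean_d, mean_l = sum(distances) / n, sum(logs) / n
    slope = (
        sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
        / sum((d - mean_d) ** 2 for d in distances)
    )
    intercept = mean_l - slope * mean_d
    return math.exp(intercept), -slope

# Synthetic xG values at the centres of forty 1 metre bins.
true_a, true_k = 1.1, 0.2
distances = [d + 0.5 for d in range(40)]
xgs = [true_a * math.exp(-true_k * d) for d in distances]

a, k = fit_exponential(distances, xgs)
# Whenever A > 1 the curve is above 1 for any distance below ln(A) / k.
breakdown_distance = math.log(a) / k
```

Any fit with A greater than 1 has such a breakdown distance, so in practice the output would need to be capped at 1 for shots from very close range.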

Here is a visualisation of the xG model.

xG2.png

Another limitation of this simple distance model is that it rates two shots of the same distance with the same xG, even if one is from directly in front of the goal and the other from a very tight angle.  Our intuition tells us that xG should decrease as we get less and less of the goal to aim for.  However, an xG model based just on angle to goal will also suffer from a similar problem.  The xG of a shot from the penalty spot will be the same as that of a shot from the centre mark.  Both have the same angle from goal i.e. zero, but are vastly different distances from goal.

To counter this, we can consider the angle to the middle of the crossbar.  The figure below illustrates this angle.


Let’s say position A is the location of the shot and AB is the distance to goal as before.  BC is the height of the goal i.e. 2.44 metres.  From these two lengths we can calculate the angle to the crossbar, θ, using basic trigonometry.  This angle has the advantage that it decreases with distance assuming the angle to the goal line stays the same.
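The trigonometry can be sketched as follows, assuming a right angle at B so that tan θ = BC / AB.  The function name is hypothetical.

```python
import math

GOAL_HEIGHT_M = 2.44  # BC, the height of the crossbar

def crossbar_angle_deg(distance_m):
    """Angle θ at the shot location A between the ground line AB and
    the line from A to the middle of the crossbar C, in degrees."""
    return math.degrees(math.atan2(GOAL_HEIGHT_M, distance_m))

# The angle shrinks as the shot moves further from goal.
angle_near = crossbar_angle_deg(2.44)   # AB equals BC, so 45 degrees
angle_far = crossbar_angle_deg(11.0)    # roughly 12.5 degrees
```

Using `atan2` rather than dividing first means a shot taken on the goal line itself (a distance of zero) returns 90 degrees instead of raising an error.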

As before, we can calculate this angle for every shot in the dataset and place these into bins to obtain the xG for a particular angle.

bar.png

This time we get a linear relationship between the angle to the crossbar and xG.  As you might expect, the more of the goal you have to aim for and the closer you are, the higher the chance of scoring.  There is a bit more uncertainty in xG in shots above 30 degrees as the vast majority of shots (99%) have an angle less than this.

This blog has described two simple implementations of an expected goals model based purely on the location of the shot.  In the next blog, I will attempt to build a model that incorporates both the distance and the angle to the goal as well as investigating how teams and players perform under this metric.

Rating players with xR

In my last post, I described a metric called Expected Runs or xR for short.  This gave us the average number of runs you would expect a batsman to score from a delivery that possesses particular attributes such as its line and length etc.  My first attempt at an xR model just considered the position of the ball as it reaches stump level using data from over 200 T20I matches.

In this post, I look at how batsmen perform under this metric over several T20I matches.  The plot below shows the total runs scored for 507 batsmen in my database against their total xR.  Batsmen above the line score more runs than what xR suggests so are over-performing according to this metric.

blog.png

We can calculate over-performance by dividing runs by xR.  The table below shows the top 20 batsmen, with at least 300 runs (of which there are 55), ordered by runs per xR.


Glenn Maxwell comes out way on top, scoring over 200 runs more than the average batsman would if they faced the same deliveries as he had.  Further down the list we see some notable hitters such as Aaron Finch and Shahid Afridi.  Interestingly, Luke Wright comes in higher than the likes of Chris Gayle, AB de Villiers and David Warner.

At the other end of the scale we get the table below:


Mohammad Hafeez scores nearly 100 runs fewer than expected.  This suggests he is not putting away the bad balls enough of the time, which is not ideal for an opener batting in the Powerplay overs.  It’s a wonder why Pakistan persisted with him for so long considering he has a career average of just 22.73 and a strike rate of 115.

The xR beehive plots from my last post show that xR for a particular patch is rarely above 1.5.  Given that a boundary can produce a runs/xR multiple of up to 6 for that particular ball, I wanted to see if frequent boundary hitters generally had a higher runs/xR figure.  Taking batsmen to have scored at least 20 boundaries, we can see whether there is a trend between the number of boundaries hit and the runs/xR multiple.

blog.png

The plot above shows that the correlation is not very strong with an R² value of 0.068.  This is encouraging as it implies xR measures something more than just pure power hitting.  It can be used to identify the batsmen who have the ability to hit good balls into gaps for ones and twos as well as those batsmen who are not good enough to consistently put away bad balls to the boundary.

xR can also be applied to bowlers.  Bowlers with a low xR/ball figure are bowling in areas that are on average low-scoring.  Note that this is independent of what the batsman eventually does.  The table below shows the bowlers to have bowled at least 300 balls (of which there are 34) ordered by their xR/ball multiple.


Perhaps unsurprisingly, Sunil Narine comes out on top with 1.202 xR/ball.  The average run rate per ball across the entire dataset is 1.245.  This is a difference of 5 runs across a 20 over innings, so certainly not insignificant.  It is interesting to note that Narine’s xR conceded is significantly higher than his actual conceded runs.  This suggests that many batsmen are under-performing when facing him even when accounting for the fact that he bowls a lot of good balls.  Batsmen cannot seem to consistently hit him for ones and twos, never mind boundaries  – a testament to his incredibly tight bowling.
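The per-innings figure follows from scaling the per-ball difference up to 120 legal deliveries.  A small sketch of that arithmetic, with a hypothetical function name and ignoring extras:

```python
BALLS_PER_INNINGS = 20 * 6  # 20 overs of 6 legal deliveries

def runs_saved_per_innings(bowler_xr_per_ball, average_runs_per_ball):
    """Runs below average over a full innings if every ball were
    bowled at the bowler's xR/ball figure."""
    return (average_runs_per_ball - bowler_xr_per_ball) * BALLS_PER_INNINGS

# Narine's 1.202 xR/ball against the dataset average of 1.245.
saved = runs_saved_per_innings(1.202, 1.245)  # about 5.2 runs
```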

Darren Sammy and Sohail Tanvir are the only two bowlers in the top 10 to concede more runs than expected.  This may be due to a combination of mainly facing above-average batsmen and some bad luck.

At the other end of the scale we observe that every bowler in the bottom 10 is a seam bowler bar Imran Tahir – an indication of the need to have separate xR models for spinners and seam bowlers.  Kyle Abbott has the highest xR/ball corresponding to 9 runs more than the average T20I innings.  Two fast bowlers, Mitchell Starc and Lasith Malinga, concede significantly fewer than expected.  Although they bowl in relatively high-scoring areas, their pace may be a factor in keeping runs to a minimum.  Again, this is something that can be built into the xR model.


The full list for both batsmen and bowlers can be found here.

xR has certainly shown its potential in accurately rating players beyond traditional metrics.  xR can also be used to rate individual innings as well as determine who ‘deserved’ to win a match by calculating the total xR for each team.  In future posts, I will look to incorporate more factors to further refine the model, including what balls are most likely to get a wicket.

Introducing Expected Runs

How can you tell how good a batsman really is?  How can we measure their true skill?  Maybe they’ve been getting lucky or have been facing some pretty dross bowling?  Perhaps a batsman appears to be out of form because they’ve recently been on the end of a couple of unplayable deliveries early on in their innings.

Averages and strike rates are good summary statistics but reveal little about the current match situation or the quality of the opposition bowling.  In this blog, I describe my first attempt at a metric that aims to predict the number of runs a batsman should score based on the type of deliveries they have faced.

In football analytics, the metric expected goals, abbreviated to xG, measures the probability of a particular shot ending up as a goal based on a variety of factors such as the distance and angle from goal, the body part used to make the shot and the type of assist.  An open goal from 10 yards usually leads to a goal, while a shot from 40 yards out rarely does.  Adding up all these expected goals gives the number of goals that a team or player would score on average.

Similarly in cricket, the same type of delivery usually ends with similar outcomes: half-volleys tend to go for 4, top-of-off deliveries tend to be defended and ripping leg-breaks are often played and missed etc.  This can be quantified using data to calculate the number of runs an average batsman would be expected to score from a delivery of a particular line and length, speed and movement off the pitch among other factors.  For example, exactly how many runs would you expect to be scored from a back-of-a-length delivery outside off, with no movement off the pitch at 85 mph?  If we collect all the deliveries that have these attributes and add up the total runs that have accrued, we can divide this by the number of balls to get an Expected Runs figure, or (predictably) xR for short.  The next time a similar ball is bowled we can say that it has an xR of however many.

This concept is also used by CricViz to measure current batting conditions in Test matches, as described here.

You may immediately see how Expected Runs can be used to measure the quality of a batsman.  If the xR of a particular ball is 1.5 and the batsman is able to consistently hit this ball to the fence, it gives an indication of how good this batsman is compared to the average batsman.

In my first version of an Expected Runs model, I only consider the line and bounce of the ball i.e. the position of the ball when it is level with the stumps.  My dataset consists of 51,775 balls from 226 T20I matches.  The data contain details of the over, batsman, bowler, runs, any extras, wickets, ball speed, coordinates of where the ball pitched and coordinates of where the ball ends up at stump level.  I stripped the data of any wides, null and erroneous coordinate values to end up with 43,541 deliveries producing 54,192 runs and 2,420 wickets in total.  I then split this data by right and left-handed batsmen to give 30,757 and 12,784 deliveries each.  Every ball in the dataset is shown below as a beehive plot:


The batting crease runs from -1.5 m to 1.5 m with middle stump at 0 m.  I split the coordinate space into square bins of 0.1 m side length, giving us 750 bins in total as shown in the figure below.  However, it is evident from the figure above that the sample size for each bin will vary wildly.

axes

This procedure found bins to have Expected Runs ranging from 0 to 6.  These extreme figures were due to very small sample sizes.  The table below illustrates how runs compare against xR.  As expected, they both have virtually the same mean but xR has a significantly smaller standard deviation.  It may or may not be surprising that most balls in T20 matches are either dots or singles.


The continuous nature of the xR metric means we can differentiate between good and bad balls more accurately.  We can determine whether a dot ball was a genuinely good ball or whether the bowler just got away with one.  We can say whether a bowler deserved to be hit for a boundary or just got plain unlucky.
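The binning described above, and the lookup it makes possible, can be sketched as follows.  The coordinate layout, the `min_balls` threshold and the toy deliveries are assumptions made for illustration.

```python
import math
from collections import defaultdict

BIN_SIZE_M = 0.1  # side length of each square bin

def bin_key(x, y):
    """Index of the 0.1 m square containing stump-level position (x, y)."""
    return (math.floor(x / BIN_SIZE_M), math.floor(y / BIN_SIZE_M))

def binned_xr(deliveries, min_balls=1):
    """Return {bin: total runs / balls} for bins with enough balls.

    deliveries -- iterable of (x, y, runs) at stump level, in metres
    Bins with fewer than `min_balls` deliveries are left out, since
    their averages are unreliable.
    """
    totals = defaultdict(lambda: [0, 0])  # bin -> [runs, balls]
    for x, y, runs in deliveries:
        entry = totals[bin_key(x, y)]
        entry[0] += runs
        entry[1] += 1
    return {b: runs / n for b, (runs, n) in totals.items() if n >= min_balls}

deliveries = [
    (0.25, 0.75, 0), (0.26, 0.74, 4), (0.24, 0.78, 1), (0.22, 0.71, 1),
    (-0.45, 0.35, 6),  # a lone six in an otherwise empty bin
]
table = binned_xr(deliveries, min_balls=2)

# The dot ball at (0.25, 0.75) landed in a bin averaging 1.5 runs,
# so on this toy data the bowler got away with one.
xr_of_dot_ball = table[bin_key(0.25, 0.75)]
```

The lone six is dropped by the `min_balls` filter, which mirrors the way bins with very small samples produce extreme xR values.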

The figures below show the results of the binning for both right and left-handers.  Remember, the average runs per ball is about 1.24.  It can be seen that the relatively high scoring areas are anything wide of off-stump and on the batsman’s legs.  Here the xR value is about 1.4 to 1.5 or up to 9 runs per over.  To restrict the batsman to about a run a ball, the data shows that you should bowl a good to back-of-a-length on off-stump or just outside.  The blank bins indicate extreme values for xR due to small sample sizes so were left out.

right-hand batsmen xR
left-hand batsmen xR

Even with this simple model some cricketing truths are apparent, namely not to give batsmen room to free their arms or bowl on their legs in T20.  There is certainly a lot of scope to improve this model.  I am yet to incorporate the positional coordinates of where the ball pitched, and any movement off the pitch etc.  I could construct separate models for both spinners and seam bowlers.  I could also consider game state i.e. where is the best place to bowl in the death overs or at particular batsmen early on in their innings.

In the next blog I investigate which batsmen fare the best under this metric and whether xR correlates with other measures.

If you have any questions/suggestions please feel free to tweet me: @cricketsavant