*Note: This article was originally published at the Statistically Speaking blog at MVN.com on February 28, 2008. Since the MVN.com site is defunct and its articles are no longer available on the web, I am re-publishing the article here.*

In Part 1 of this series, we examined Brian Bannister’s suggestions for why he has been able to beat the league BABIP. He indicated that it was probably due to pitching more often in favorable pitcher’s counts and inducing balls in play with two strikes, when the hitter is on the ropes. However, the evidence didn’t show much advantage for Bannister. We noted that he did pitch a little more often in favorable counts, but this led to him avoiding walks more than anything; it had little salutary effect on his BABIP.

In Part 2 of this series, we learned about the pitches that Bannister threw during 2007 and how he used them. We saw that the fastball and curveball were good pitches against right-handed hitters, and the slider was a good pitch against left-handed hitters.

In this final part of the series, we’re going to marry those two approaches to see if we can uncover any patterns that might explain Bannister’s BABIP performance. In this portion, I’m not concentrating so much on evaluating Bannister’s own statements, as I did in Part 1. Rather, I’m thinking more about what we can expect from Bannister in the future. I’m also interested in investigating techniques that could prove useful for evaluating DIPS theory on a component basis as we accumulate more PITCHf/x data in the coming seasons.

Should we expect Bannister to maintain any of his BABIP edge and thus his 3.87 ERA from 2007? Or are the projection systems like PECOTA (subscribers only) and CHONE more reasonable when they project an ERA of 5.19 or 4.74?

I recently received my Hardball Times 2008 Season Preview book in the mail, and a few quotes from Bradford Doolittle’s article on the Royals encapsulate the factors that are driving the projections for Bannister:

“He got lucky in terms of keeping fly balls in the park, has a low strikeout rate and has a poor groundball rate. Look out.”

“…when looking at measures like average on balls in play, groundball percentage and homers/fly ball, it’s pretty apparent that he was a beneficiary of good fortune last season. His list of comps isn’t encouraging, either. He’s the most likely player on the Royals to suffer a collapse in 2008.”

Mainly, I will focus on the question of Bannister’s batting average on balls in play (BABIP), but let me briefly address the other concerns that Doolittle raised.

Bannister got lucky in terms of keeping fly balls in the park? Yes. Or at least his HR/flyball was lower than average, to the tune of about 5 fewer home runs allowed than “expected” in 2007. I’m still agnostic as to whether this is due primarily to luck or skill for every pitcher. This probably bears some investigation in Bannister’s case, but it’s a less significant factor in his run prevention than his BABIP performance, so I’m not going to examine it further in this series.

What about his ground ball percentage? As far as I can tell, it’s fairly close to normal, 42% of balls in play, compared to league average of 43%. THT and I must be using slightly different definitions, however, since they put Bannister’s number at 41% and the league at 44%. (My data comes from MLB Gameday; THT gets theirs from Baseball Info Solutions.) Ground balls are much less likely to turn into extra base hits than are fly balls or line drives. I’ll talk a bit more about this as we delve into the BABIP numbers, but given my numbers for balls in play, I wouldn’t tab Bannister’s ground ball percentage as a particular problem area.

Bannister does have a low strikeout rate, down in Chien-Ming Wang and Jon Garland territory in 2007. With an 89-mph fastball and no killer secondary pitch, that’s not likely to improve. It’s a real area for concern and the reason there is so much focus on his BABIP. If you don’t strike ’em out, you care a lot more about what happens when they put it in play.

That brings us back full circle to BABIP. If it wasn’t getting balls in play on favorable counts that caused Bannister’s outstanding BABIP performance in 2007, what was it? We’ve already taken a look in Part 2 at Bannister’s performance by measuring the average run value of each of his pitch types. I also listed the BABIP against each pitch, split out against left-handed hitters and right-handed hitters. From the tables in Part 2, you can infer which pitch helped his BABIP the most, but let’s show that explicitly as a starting point here. I also want to note again that the PITCHf/x data set covers only about half of Bannister’s pitches during 2007, so these BABIP numbers broken down by pitch type will not add up to his full season BABIP numbers.

| LHH | BABIP | InPlay | Hits | Lg.Avg.Hits |
|---|---|---|---|---|
| Fastball | 0.295 | 61 | 18 | 19 |
| Slider | 0.200 | 25 | 5 | 7 |
| Changeup | 0.212 | 33 | 7 | 10 |
| Curveball | 0.750 | 12 | 9 | 3 |
| Total | 0.298 | 131 | 39 | 39 |

| RHH | BABIP | InPlay | Hits | Lg.Avg.Hits |
|---|---|---|---|---|
| Fastball | 0.169 | 71 | 12 | 22 |
| Slider | 0.314 | 35 | 11 | 10 |
| Changeup | 0.300 | 10 | 3 | 3 |
| Curveball | 0.125 | 18 | 2 | 5 |
| Total | 0.212 | 132 | 28 | 40 |

You can see that Bannister’s BABIP advantage primarily came against right-handed hitters; against lefties he’s pretty close to the league-average mark. We have PITCHf/x data for 263 of his 538 balls in play, and his BABIP of .255 for that sample matches up pretty well with his full season BABIP of .262.

I’m a little concerned here about drawing conclusions from too small a sample size. I’ll focus on the 12-for-71 performance of right-handed hitters against Bannister’s fastball. That split accounts for 10 of the 12 “missing” hits in our sample, that is, hits we would have expected Bannister to allow with a league-average BABIP. By the same measure, 23 hits were missing from his full 2007 season.

If I understand my binomial distribution statistics correctly, and it’s been about ten years since I last dusted them off, we can state at the 99% confidence level that Bannister’s BABIP against righties with the fastball was not merely the product of chance. Only one of the other BABIP splits meets even the 90% confidence level, that being the 2-for-18 by righties against the curveball. Even though it’s a smaller sample with less confidence, we’ll take a brief look into it, too.
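One way to reproduce that kind of check is a one-sided binomial tail probability: the chance of seeing this few hits or fewer if every ball in play fell in at the league rate. The sketch below is only an illustration and may not be the exact test used here. It assumes the league rate for each split can be backed out of the rounded Lg.Avg.Hits column (22 expected hits in 71 balls in play for the fastball, 5 in 18 for the curveball), and the helper name `binom_cdf` is my own.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumed league-average rates, inferred from the rounded Lg.Avg.Hits
# column of the RHH table rather than stated directly in the article.
p_fastball = binom_cdf(12, 71, 22 / 71)  # 12 hits or fewer in 71 balls in play
p_curveball = binom_cdf(2, 18, 5 / 18)   # 2 hits or fewer in 18 balls in play

print(f"fastball tail probability:  {p_fastball:.4f}")
print(f"curveball tail probability: {p_curveball:.4f}")
```

Under those assumptions the fastball probability comes out below 0.01 and the curveball probability below 0.10, which lines up with the 99% and 90% confidence levels quoted above.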

So, the $64,000 question–or in Bannister’s case, perhaps it’s a $364,000 question–how did Bannister manage to get righties to hit a paltry .169 on balls in play against his pedestrian-looking 89-mph fastball? Let’s start by breaking that down by type of ball in play:

3-for-32 on fly balls

0-for-17 on ground balls

9-for-11 on line drives

0-for-11 on popups

There are at least two ways to examine this data. We know that different types of balls in play have different expected BABIP. Mitchel Lichtman published the seminal work on this topic, looking at data from 1992-2003. My numbers from 2007, and my batted ball categories, differ a little from MGL’s work, but we’re in fairly good agreement.

First, what is the BABIP for different batted ball types? Here are my numbers for the major leagues from 2007:

0.169 on fly balls

0.251 on ground balls

0.729 on line drives

0.025 on popups

(As an aside, why do people prefer ground ball pitchers if the BABIP on ground balls is higher than that on fly balls? It’s because 78% of fly ball hits go for extra bases, whereas only 9% of ground ball hits go for extra bases.)

Second, what is the expected distribution of balls in play among the different types or, more to the point, what is the expected distribution against the fastball? Here I’ll draw on one of John Walsh’s many excellent pieces on PITCHf/x. According to his numbers, the typical fastball produces 31.5% fly balls, 38.8% ground balls, 19.9% line drives, and 8.7% popups.

Let’s start with the second point. Maybe Bannister excelled with his fastball to right-handed hitters because he induced a very favorable mix of balls in play? His breakdown in our sample was 45% fly balls, 24% ground balls, 15% line drives, and 15% popups. A quick look shows us that he’s getting more fly balls and popups and fewer ground balls and line drives. Based on his batted ball mix alone and assuming a league-average BABIP for each batted ball type, we would have expected Bannister to allow 18 hits on 71 balls in play against the fastball, rather than the 22 that the overall league-average BABIP predicts. In other words, the batted ball mix explains almost 40% of the 10-hit difference. Definitely interesting.
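The arithmetic behind that estimate can be sketched in a few lines. This is only an illustration: it uses the batted ball counts and the 2007 league rates by type listed earlier, and it takes the 22 expected hits from the rounded Lg.Avg.Hits column, so the result is approximate. The variable names are my own.

```python
# League BABIP by batted-ball type in 2007, as listed in the article.
LEAGUE_BABIP = {"fly": 0.169, "ground": 0.251, "line": 0.729, "popup": 0.025}
# Right-handers' balls in play against Bannister's fastball in the sample.
BALLS_IN_PLAY = {"fly": 32, "ground": 17, "line": 11, "popup": 11}

LEAGUE_EXPECTED_HITS = 22  # rounded Lg.Avg.Hits figure from the RHH table
ACTUAL_HITS = 12

total = sum(BALLS_IN_PLAY.values())
mix = {kind: count / total for kind, count in BALLS_IN_PLAY.items()}

# Hits expected if each batted-ball type fell in at the league rate.
expected_from_mix = sum(BALLS_IN_PLAY[k] * LEAGUE_BABIP[k] for k in BALLS_IN_PLAY)

# Share of the gap to league average that the mix alone accounts for.
share_explained = (LEAGUE_EXPECTED_HITS - expected_from_mix) / (
    LEAGUE_EXPECTED_HITS - ACTUAL_HITS
)

for kind, share in mix.items():
    print(f"{kind:>6}: {share:.0%}")
print(f"expected hits from mix: {expected_from_mix:.1f}")
print(f"share of gap explained: {share_explained:.0%}")
```

Because the 22-hit baseline is itself a rounded figure, the share explained should be read as roughly 40% rather than as a precise number.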

Again, I’ll caution that this is a small sample size. MGL’s work showed that the mix of balls in play is a much more repeatable skill on the part of a typical pitcher than is their overall BABIP, particularly with regard to ground balls and fly balls, but also to a fair extent with line drives and infield popups. However, I’m not sure we really know how to regress these numbers to the mean. I suspect the batted ball mix we see from Bannister is not mostly due to chance, but we don’t really have the studies to prove it yet one way or the other at the pitch-type level.

But Bannister only allowed 12 hits against his fastball to righties, not 18, so what about the remaining 6-hit difference? Given his batted ball mix, what would we have expected hitters to hit?

5.4-for-32 on fly balls

4.3-for-17 on ground balls

8.0-for-11 on line drives

0.3-for-11 on popups
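The expected figures above are simply each batted ball count multiplied by the 2007 league BABIP for that type. A minimal sketch of that calculation, with the difference between expected and actual hits for each type, follows; the names `ACTUAL` and `saved` are my own.

```python
# League BABIP by batted-ball type in 2007, as listed in the article.
LEAGUE_BABIP = {"fly": 0.169, "ground": 0.251, "line": 0.729, "popup": 0.025}
# (hits, balls in play) by right-handers against Bannister's fastball.
ACTUAL = {"fly": (3, 32), "ground": (0, 17), "line": (9, 11), "popup": (0, 11)}

saved = {}
for kind, (hits, in_play) in ACTUAL.items():
    expected = in_play * LEAGUE_BABIP[kind]
    saved[kind] = expected - hits  # positive means fewer hits than expected
    print(f"{kind:>6}: expected {expected:.1f}, actual {hits}, saved {saved[kind]:+.1f}")

total_saved = sum(saved.values())
print(f"total saved: {total_saved:.1f}")
```

A positive value means Bannister allowed fewer hits than the league rate predicts for that batted ball type; the four values add up to about the 6-hit difference that the batted ball mix leaves unexplained.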

The actual numbers on line drives and popups are close enough to the expected values that I’m going to ignore them. But Bannister saved 4+ hits on ground balls and 2+ hits on fly balls. Why? It’s instructive to look at where the 17 ground balls went. Three went to third base, 12 to the shortstop, and 2 to second base. Who played shortstop for the Royals last year? None other than defensive wiz Tony Pena, Jr. I don’t know whether Bannister did something special to get all the right-handers to hit his fastball on the ground toward shortstop, but it certainly turned out well for him. We can probably expect some regression in this area next year, but there’s also a hint that Bannister is maximizing the strengths of the defense behind him.

Looking at the fly balls, there’s also an obvious pattern to where they were hit. When right-handed hitters put the fastball into play in the air, it goes toward right field. Take a look at where the 32 fly balls landed on this ball-in-play chart (which also charts the two home runs hit by righties off the Bannister fastball in our PITCHf/x sample). Field dimensions are approximate.

Center fielder David DeJesus is fielding balls hit into right center field, and the right fielders (mainly Mark Teahen and Emil Brown) are fielding balls from straightaway right field toward the right field line.

Here’s the chart for all types of balls in play by righties against the Bannister fastball.

The line drives follow a similar pattern to the fly balls, but since they’re hit harder, most of them fall in for hits. You can see the ground ball cluster around the shortstop. The popups Bannister induces are mostly foul balls on the first base side.

I’m not quite sure what to make of these patterns of balls in play. Bannister probably won’t get quite as lucky in the placement of balls relative to his fielders, but I don’t see any particular warning signs that scream, “Look out”, as the Hardball Times season preview put it. However, without having looked at more pitchers than just Bannister, I wouldn’t put much stock in my ability to notice any such indications.

There are other factors that could come into play. Did Bannister face weaker than average hitters? The overall answer to that question is no. Bannister’s average opponent sported a .267/.339/.424 line (avg/obp/slg), compared to the average opponent faced by the average AL pitcher at .268/.336/.423. There’s not much difference, certainly nothing that would help explain Bannister’s BABIP. What about the average right-handed hitter to whom Bannister threw a fastball? We know that weaker hitters tend to see more fastballs. Was this also true in Bannister’s case?

To answer this question, I’ll include the two home runs (hit by A-Rod) along with the 71 balls in play. In our sample, Bannister faced 37 different hitters, and I’ve weighted their performance according to the number of times each hitter faced him, for a composite fastball opponent hitting .273/.338/.436. That’s a slightly better hitter than average–there’s no sign that Bannister improved his fastball BABIP by feasting on weaker hitters. Even if you take A-Rod’s two at bats out of the equation, the numbers are still a hair above average.
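For readers who want to build a similar composite, here is a minimal sketch of weighting each hitter’s line by the number of times he appears in the sample. The hitters and their lines below are placeholders for illustration only, not the 37 hitters from the PITCHf/x sample, and the `Hitter` type and `composite_line` function are my own names.

```python
from typing import NamedTuple


class Hitter(NamedTuple):
    avg: float
    obp: float
    slg: float
    times_faced: int


def composite_line(hitters: list[Hitter]) -> tuple[float, float, float]:
    """Average each rate stat across hitters, weighted by times faced."""
    total = sum(h.times_faced for h in hitters)
    return tuple(
        sum(getattr(h, stat) * h.times_faced for h in hitters) / total
        for stat in ("avg", "obp", "slg")
    )


# Placeholder hitters, NOT taken from the article's data.
sample = [
    Hitter(avg=0.300, obp=0.380, slg=0.500, times_faced=3),
    Hitter(avg=0.250, obp=0.310, slg=0.380, times_faced=1),
]

avg, obp, slg = composite_line(sample)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}")
```

A stricter approach would add up the underlying hits, at bats, and total bases before dividing, but weighting the rate stats directly is closer to the method described here and should land near the same answer.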

I also looked at location of fastballs in the strike zone relative to what happened to the balls in play by right-handed hitters. The only thing I noted was that the low outside corner was a good place for Bannister to generate ground balls. The heart of the plate was the location for the two fastballs that A-Rod launched for home runs, but up and over the middle was also the place where Bannister generated a lot of foul pop flies.

What about the effect of previous pitches? Righties went 0-for-9 on first-pitch fastballs put into play. They went 6-for-34 (.176) following a previous fastball, 1-for-8 (.125) following a slider, 3-for-10 following a changeup, and 1-for-10 following a curveball. I could look at pitches other than the one immediately prior to the pitch put in play, but there doesn’t seem to be much of a pattern here.

We already determined in Part 1 that Bannister gained only a very small advantage overall from pitching in favorable counts. In Part 2 we learned that Bannister tended to use the fastball with 0 or 1 strike. From those two pieces of information, there’s nothing that suggests that Bannister got good results on balls in play with the fastball against right-handed hitters because he was throwing it to them in favorable counts, but we ought to take a look at that specific combination just to make sure. On what counts did righties put the fastball into play?

It turns out there’s not much to see from that angle, either. In the four counts most favorable to the pitcher (0-2, 0-1, 1-2, 2-2), the righty hitters were 6-for-28 (.214) against the fastball, and in the four counts most favorable to the hitter (3-1, 2-0, 3-2, 1-0), they went 4-for-25 (.160). Those pitcher’s counts typically result in more balls in play than the hitter’s counts; conversely, pitchers throw more fastballs in hitter’s counts, although not quite enough to balance out the first effect. So Bannister was near the average in terms of his mix of counts, and he actually got slightly better results on the fastball put in play on a hitter’s count than on a pitcher’s count.

I also promised a look at one other pitch type–why did righties hit 2-for-18 off of Bannister’s curveball? We have a much smaller sample, but let’s look at a few of the same metrics that we did for the fastball. How did righties hit on different types of batted balls against the curveball?

0-for-7 on fly balls

1-for-9 on ground balls

1-for-1 on line drives

0-for-1 on popups

The expected distribution of balls in play among the different types against the curveball is 25.2% fly balls, 48.1% ground balls, 18.5% line drives, and 6.8% popups. Did Bannister induce a favorable mix of balls in play with his curveball to right-handed hitters? His breakdown in our sample was 39% fly balls, 50% ground balls, 6% line drives, and 6% popups. He’s moved a couple of line drives to the fly ball column, which is good and accounts for at least one of the three “missing” hits, but we’re so far into the realm of small sample size that we could be looking at things like scoring decisions about how to classify the batted balls as much as anything about Bannister’s pitching.

I’ll show you the ball-in-play chart for right-handed hitters against his curveball, but I’m not going any further down this path since the sample size is so small.

In summary, Bannister’s BABIP performance appears to be partly luck and partly skill. He’ll probably have a few more fly balls fall in for hits than he did in 2007, but he also induces a favorable mix of batted balls and seems to use the strengths of the defense behind him well. It’s not clear how well that will carry over.

I’d be skeptical of alarmist claims that Bannister is simply going to revert completely to the league mean for BABIP. He seems to know how to use his 89-mph fastball to its best advantage, and the existing batted ball research doesn’t really know how to deal with this kind of information yet. However, repeating his league-leading .262 BABIP is also an unreasonable expectation.

As a Royals fan, I’d love to see him develop stronger secondary pitches and to continue to hone his craft in a Maddux-like fashion. It’s exciting when a major-league pitcher even knows what BABIP is. One of my favorite things about the game of baseball, perhaps my very favorite, is the game of wits that goes on between pitcher and batter. Bannister seems well-equipped for that game. I’m cheering for him in 2008 and wish him the best, but I’m not going to go out on a limb and predict his ERA. I don’t know enough to tell you that kind of thing.

I do hope that with another year of PITCHf/x data, we can gain a better understanding of the interaction between particular pitches and results on balls in play and, as a result, a better ability to quantify the skills that make pitchers successful.

March 14, 2008 at 2:11 pm

Note: I have copied this comment from Brian Bannister on the original article published at MVN.

Mike,

First off, I wanted to say thanks for taking the time to write these articles. There are a lot of other pitchers out there that you could have evaluated, but I know that people are interested in the way that I approach the game of baseball.

Normally, I refuse to comment outside of formal interview requests (and readers, don’t expect me to get into a discussion here), but this series was definitely worthy of a response because of its objectiveness and detail (and from a professed Royals fan).

With regards to the MLBtraderumors.com interview Mike, I didn’t mean to cause any confusion, but I was including strikeouts in my percentages (and they were intended to be batting averages only, not BABIP’s). My last line was also incorrect (I need an editor) with regards to balls in play, because they were simply overall batting averages only.

The point of answering that question was not to prove why I beat the league average, but how any pitcher could hypothetically do it. I don’t claim to be able to do it every year (or ever again), but it’s a challenge to find ways that it could be done by giving yourself statistical advantages over the league.

Moving on to this series, you have to come to many of the same conclusions that I have with regards to my own pitch repertoire. I have been evaluating PITCHf/x data since the middle of 2007 because I feel that it is a far superior system to watching video. Video is still important in learning hitters tendencies and body language, but for pitchers, PITCHf/x is the best. Even though an easier system for managing and graphing the data needs to be developed, the possibilities are endless.

I’m glad that you took the time to evaluate my pitch mix and batted ball types, because that is something I have been working with for a long time now, and is never discussed in a typical BABIP article. To me, this is where successful pitchers are separated from unsuccessful ones (and by the way, I didn’t throw any two-seamers or cutters last year, and yes, I did experiment with a harder curveball to see if would be more effective against lefties, nice pickup).

What you’ve been able to publicly prove is which of my pitches are better than league average, which ones are worse, what is better to lefties, and what is better to righties. Every pitcher has strengths and weaknesses, but not everyone knows how to use them to their statistical advantage in different situations throughout a game and a season.

For example, I have tried to throw a two-seam fastball at two points in my career – 2004 in the minor leagues (my worst statistical year) and 2007 spring training (with terrible results). This just reinforces what the PITCHf/x and batted balls numbers tell me, that despite what the typical fantasy writer would describe as a “fringe average” fastball, it is in fact an above-average pitch because of the superior batted ball mix that it generates (which has been consistent in years 2003, 2005, 2006, 2007 in which I threw it). What is unique about it (that I have known for several years and that you were finally able to show publicly) is that it generates more batted balls with a higher chance of producing outs than average, while also having a lower than average HR/9 rate. I’ve fought the data in the past (2004, 2007 ST), but now I’ve learned to trust it and tweak it to my advantage.

I have a lot of other things that I could talk about, but I’ll save those for future interviews and as I “generate more data” for you to analyze.

The one thing that I always want to reinforce is that I’m really just a huge fan of the game of baseball. I am fortunate because I get to put on a uniform every day and experiment with the crazy ideas in my head and those that wonderful baseball fans like yourself write about, all in an attempt to be a better player and to help my team win more games. I’m just a normal guy trying to live the dream, and I appreciate your love for the game very much. Never lose it.