Note: This article was originally published at the Statistically Speaking blog at MVN.com on January 29, 2008.  Since the MVN.com site is defunct and its articles are no longer available on the web, I am re-publishing the article here.

Despite winning the American League West with a 94-68 record last year, the LA of Anaheim Angels have gotten short shrift from the PITCHf/x analysts thus far. The only writeup that the pitching staff has gotten was one by Joe Sheehan on John Lackey three weeks into the season. I’d like to remedy that a little bit today. The Angels had three outstanding starters: Lackey, Kelvim Escobar, and Jered Weaver. Let’s take a detailed look into the pitching performance of Kelvim Escobar.

Escobar is a 31-year-old right-hander from La Guaira, Venezuela. He went from starter to reliever (and closer) and back to starter again with the Toronto Blue Jays before joining the Anaheim Angels in 2004. He’s struggled to stay completely healthy, but overall he has turned in some fine numbers for the Angels in four years: a 43-35 record and 3.60 ERA in 109 starts, allowing 611 hits and 213 walks against 561 strikeouts in 653 innings.

Since the Big A was one of the original nine stadiums to have a camera system installed from the beginning of the 2007 season, the large majority of Escobar’s season was recorded by the PITCHf/x system, 2469 of his total 3141 pitches. This gives us a good data set to identify his pitches and examine his pitching tendencies.

Escobar throws quite an array of pitches: a four-seam and two-seam fastball, a changeup and split-finger, a slider and a curveball. According to scouting reports, he is capable with all six pitches.

Here I’ve shown two graphs that I use for pitch classification. The first graph shows the speed of his pitches versus the direction they break, in polar graph format. The second graph shows the movement due to the forces of spin deflection and gravity on his pitches in the last quarter-second before they cross the plate.

There are a couple other ways to look at the vertical vs. horizontal deflection over the whole pitch trajectory:


Escobar’s four-seam fastball runs 92-96 mph, and the average spin deflection he gets on the four-seamer is a 10-inch hop and a 4-inch tail in toward right-handers. Compared to a league-average fastball, that’s 3 mph faster but with a couple inches less lateral movement, probably due to the fact that his motion is more over-the-top than many right-handed pitchers. The four-seamer is one of Escobar’s main pitches to both lefties (26% of the time) and righties (27%).

Escobar’s two-seam fastball also runs 92-96 mph, but its average spin deflection is an 8-inch hop and a 7-inch tail in toward right-handers. The two-seamer is his primary pitch to lefties (28% of the time) and also a main pitch to right-handers (24%). I made the division between the four-seamer and the two-seamer by looking at the spin direction of each pitch on a game-by-game basis, but the dividing line between the two is still a bit fuzzy to me.

His split-finger fastball runs 85-89 mph, and its average spin deflection is a 6-inch hop and a 6-inch tail in toward right-handers. Escobar uses the splitter fairly often to left-handers (15% of the time) but only infrequently to right-handers (6%).

His changeup runs 83-87 mph, and its average spin deflection is a 10-inch hop and a 3-inch tail in toward righties. The 9-mph separation between his fastball and changeup is about average for major league pitchers. He uses the changeup more often to lefties (16% of the time) but also some against righties (11%).

Escobar’s slider runs 85-89 mph, and its average spin deflection is a 3-inch hop and a 2-inch break away from righties. That’s about 3 mph harder than the average major-league slider, with typical movement. The slider is one of his favorite pitches to right-handed hitters (25% of the time) and is rarely used against lefties (2%).

Finally, his curveball runs 79-84 mph, and its average spin deflection is a 3-inch drop and a 1-inch break away from right-handers. That’s about 4 mph harder than the average major-league curveball, with 12-to-6 movement that is somewhat rare. (The spin deflection on the average major-league curveball is a 2-inch drop and a 5-inch cut. John Walsh’s article is my source for league average numbers.)

Next, let’s look at how Escobar mixes his pitches in different ball-strike counts, which I’ve split out by batter handedness. The picture gets a bit messy when a man throws six different pitches, but let’s dive in and see what we see.


To lefties, Escobar uses the four-seamer on any count and relies on it a little more if he falls behind. He throws the curveball early in the count, 22% of the time with no balls, 9% of the time with 1 ball, and only 3% of the time with 2 or 3 balls in the count. He favors the two-seamer with 0 or 1 strike, 33% of the time, but uses it only 16% of the time with 2 strikes. Instead, with 2 strikes he relies on the splitter 32% of the time. He’ll throw the changeup at almost any count except 0-2 and 3-0, but he likes to throw it more when he’s behind in the count, in which case he throws it 25% of the time.

Early in the count with Escobar, lefties should expect to see the two-seamer, the four-seamer, the curveball, and the changeup, in that order. If Escobar gets the hitter down 0-2 or 1-2, the hitter should expect the splitter (41% of the time) or perhaps a fastball (41%), but if the count goes 2-2 or 3-2, he should start to watch for the changeup, too (33%).

To righties, early in the count, Escobar throws hard stuff, 31% two-seamers, 28% sliders, 23% four-seamers, and only 18% of his other three pitches combined. When he gets 2 strikes, the two-seamer disappears (only 3%), but he’s willing to show the splitter (14%). The changeup gets used a little with 1 strike (11%), but at 2-1 or 2-2 it’s a favored pitch (26%), and at 3-2, it’s his favorite pitch (34%), like it was to lefties. Righties can expect the curveball mainly at a single count: 0-2, where Escobar uses it 28% of the time; it’s little used (6%) in other counts.

What kind of results does Escobar get with each of his pitches? His four-seam fastball is a pretty good pitch, but his two-seamer grades out poorer. All four of his off-speed pitches are above average. I should mention that the PITCHf/x games for Escobar are missing his two worst starts of the year, which skews all the following numbers a little bit in his favor.

LHH Ball CS Foul SS InPlay Avg BABIP SLG HR
4-seamer 0.33 0.26 0.17 0.06 0.18 0.315 0.315 0.444 0.000
2-seamer 0.40 0.17 0.18 0.05 0.20 0.338 0.317 0.523 0.031
Splitter 0.35 0.09 0.19 0.17 0.19 0.294 0.273 0.441 0.029
Changeup 0.39 0.14 0.15 0.12 0.20 0.216 0.216 0.297 0.000
Slider 0.22 0.11 0.39 0.06 0.22 0.500 0.500 0.750 0.000
Curveball 0.35 0.32 0.09 0.13 0.11 0.353 0.353 0.412 0.000
RHH Ball CS Foul SS InPlay Avg BABIP SLG HR
4-seamer 0.41 0.17 0.20 0.08 0.14 0.176 0.176 0.216 0.000
2-seamer 0.39 0.26 0.14 0.04 0.17 0.415 0.392 0.585 0.038
Splitter 0.36 0.04 0.15 0.19 0.26 0.158 0.158 0.211 0.000
Changeup 0.27 0.06 0.12 0.28 0.26 0.237 0.216 0.316 0.026
Slider 0.35 0.17 0.11 0.16 0.22 0.254 0.243 0.352 0.014
Curveball 0.39 0.23 0.11 0.14 0.14 0.154 0.154 0.154 0.000
Lg. Avg. Ball CS Foul SS InPlay Avg BABIP SLG HR
Fastball 0.36 0.19 0.19 0.06 0.19 0.330 0.304 0.521 0.037
Changeup 0.40 0.11 0.14 0.13 0.21 0.319 0.295 0.502 0.035
Slider 0.36 0.14 0.17 0.13 0.20 0.310 0.286 0.481 0.033
Curveball 0.40 0.19 0.13 0.11 0.21 0.310 0.290 0.471 0.029

The league average information comes from John Walsh’s article. In the following pitch location charts, I’ve changed my color-coding a bit to try to improve readability for those with color blindness. Hopefully the new system is an improvement.
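The Avg, BABIP, and HR columns in the table appear to be rates per ball in play, and the rows are consistent with BABIP being derived from the other two. A minimal sketch of that relationship follows; the formula is inferred from the table values rather than stated in the original analysis, and the function name is hypothetical.

```python
def babip_from_rates(avg_in_play: float, hr_in_play: float) -> float:
    """BABIP implied by a hit rate and a home-run rate, both per ball in play.

    Assumption: home runs are removed from both hits and balls in play,
    which reproduces the table rows checked below.
    """
    if not 0.0 <= hr_in_play < 1.0:
        raise ValueError("home-run rate must be in [0, 1)")
    return (avg_in_play - hr_in_play) / (1.0 - hr_in_play)


# (Avg, HR) pairs copied from the results table above.
rows = {
    "RHH 2-seamer": (0.415, 0.038),
    "LHH 2-seamer": (0.338, 0.031),
    "LHH Splitter": (0.294, 0.029),
}

for name, (avg, hr) in rows.items():
    print(f"{name}: BABIP = {babip_from_rates(avg, hr):.3f}")
```

When a pitch allowed no home runs, the two columns are identical, which is why so many rows show the same value for Avg and BABIP.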

Escobar works with the four-seamer on the outer half of the plate to both lefties and righties, although with lefties he works down more and avoids coming inside, and with righties he works up more and works inside just off the plate. He has some trouble throwing the four-seamer for strikes to righties (only 59%, compared to 64% league average), but when he does and they put it in play, he gets very good results: .176/.216 (avg/slg), compared to .330/.521 major-league average off the fastball.

To lefties, he’s much better at throwing the four-seamer for strikes (67%), and he gets a lot of called strikes (26% compared to 19% league average), but his results on balls in play are only fair: .315/.444 avg/slg. He didn’t allow a single home run in 31 fly balls hit off the four-seamer in PITCHf/x games. That is unusual–fastballs are the most homered-upon pitch for most pitchers.

Escobar has trouble throwing the two-seamer for strikes, getting it over only 60% of the time. As with the four-seamer, he works mainly on the outer part of the plate to both lefties and righties. However, both lefties and righties have good success when they put the two-seamer into play. Lefties hit .338/.523, and righties hit .415/.585.
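The strike percentages quoted for the fastballs can be read straight off the results table: a strike is anything that is not a ball, so the rate is either one minus the Ball column or the sum of the other four outcome columns. The sketch below shows both routes; the function names are hypothetical, and the two results can differ by a point because the table values are rounded.

```python
def strike_rate_from_balls(ball_rate: float) -> float:
    """Share of pitches that were not called balls."""
    return 1.0 - ball_rate


def strike_rate_from_outcomes(called: float, foul: float,
                              swinging: float, in_play: float) -> float:
    """Same quantity, built up from the four strike outcomes."""
    return called + foul + swinging + in_play


# Escobar's four-seamer to right-handed hitters (row "4-seamer" under RHH).
print(round(strike_rate_from_balls(0.41), 2))
print(round(strike_rate_from_outcomes(0.17, 0.20, 0.08, 0.14), 2))

# League-average fastball, for comparison.
print(round(strike_rate_from_balls(0.36), 2))
```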

The splitter is Escobar’s strikeout pitch to lefties, and you can see why. They swing and miss at it down and away more often than not. He doesn’t necessarily throw it in the strike zone that much, but he gets strikes because the hitters chase it. When he does get it in the zone, hitters do much better with it, making at least decent contact and racking up a .294/.441 line, including a home run.

He doesn’t throw the splitter nearly as much to righties, although I wonder if maybe he should. He still gets a lot of swings and misses (19%, compared to 13% league average), but righties are able to put the ball in play almost every time he gets the splitter in the zone. However, the right-handed hitters don’t fare nearly as well as lefties on balls in play, hitting only a meager .158/.211. Perhaps it’s the small sample size (19 balls in play), or maybe righties really do have trouble getting good wood on the splitter.

The changeup is the first pitch where we see a marked contrast in Escobar’s location to lefties and righties. To lefties, he pitches away, away, away. He gets some swings and misses in the zone, but lefties don’t chase the changeup out of the strike zone much. On balls hit into play by lefties, Escobar does well, a .216/.297 line, compared to .319/.502 against an average major-league changeup.

To righties, he throws the changeup mostly in the zone or on the corner low and away. He gets a lot of swings and misses, especially on the outside corner. The changeup is a very effective pitch against righties. No wonder he likes to throw it as a strikeout pitch to righties. Moreover, even though he pounds the heart of the zone, righties have little luck on balls in play, hitting only .237/.316. Most right-handed pitchers avoid throwing the changeup to right-handed hitters, but for Escobar in that situation, it’s a great pitch and one he could perhaps use even more often.

As you can see, his slider is rarely used to lefties, mostly thrown up and in and fouled off. To righties, he uses the slider a lot, and to good effect. He gets a good number of called strikes (17%, versus 14% league average) and swinging strikes (16%, versus 13% average), and when the ball is put in play, Escobar also fares well, allowing a .254/.352 avg/slg, compared to .310/.481 against an average major-league slider. Those numbers include allowing only 1 home run on 27 fly balls hit by righties off the slider–luck or skill?

Finally, we come to the curveball, Escobar’s least-used pitch. He throws it mostly down and away to both righties and lefties, although he also throws it in the zone quite a bit. He gets a lot of called strikes, especially to lefties (32%), but also to righties (23%), compared to league average of 19% with the curve. Most pitchers rarely throw the curveball as the first pitch to a batter. Escobar, on the other hand, often throws a lefty a curveball right across the plate for strike one looking. Lefties don’t often make contact with the curveball, but when they do, the results are decent: .353/.412, compared to league average against the curve of .310/.471.

Right-handers see the curveball more often with two strikes, and it’s a good strikeout pitch for Escobar, both swinging (at balls in the dirt) and looking. Righties don’t make contact with the curve very often, either, and when they do, their results are particularly poor: in 13 curveballs in play, righties hit 10 groundballs (including two double plays), 2 fly balls, and one line drive. The line drive and one groundball landed as singles, for a .154 average.

In summary, Escobar has a solid four-seam fastball which he complements with a weaker two-seamer, and his array of off-speed pitches is impressive. His changeup, splitter, curveball, and slider are all well above average pitches, and some of them, particularly his changeup, are among the best in baseball. He struggles with control on his fastball, and this, along with the recurrent health problems, is probably all that keeps him from being one of the very best pitchers in baseball.

As a final note, I thought this was a great photo from MLB.com of Kelvim Escobar in full stride.

If you enjoyed this article, you might be interested in my similar previous analysis of Erik Bedard, Johan Santana, James Shields, Mariano Rivera, Joakim Soria, Josh Beckett, Joba Chamberlain, or Eric Gagne.


Note: This article was originally published at the Statistically Speaking blog at MVN.com on January 9, 2008.  Since the MVN.com site is defunct and its articles are no longer available on the web, I am re-publishing the article here.

Who is the best pitcher in baseball right now? Some might answer that question with Jake Peavy or Josh Beckett, but I’d guess that at least 7 out of 10 times, the answer you would get is Minnesota Twins left-hander Johan Santana. Santana is a 28-year-old from Tovar, Venezuela, and after his fourth full year in the starting rotation, he already owns two Cy Young Award trophies.

Now, as Santana approaches the final season of the 4-year, $39.75 million contract he signed three years ago, the Twins appear eager to trade him, and the reported suitors include such teams as the New York Yankees, Boston Red Sox, and New York Mets, subject to Santana’s approval. I’ll leave the predictions of where he’ll land to those who are better qualified or more eager to comment than I am. However, I’d like to take a look at the pitching repertoire and strategy of possibly the best pitcher in baseball.

If you look at the scouting reports, they all talk about Johan Santana’s devastating changeup and how he works to make his throwing motion identical for all pitches. Most scouting reports list three pitches for Santana–fastball, changeup, and slider–and mention that his changeup comes in 15-20 mph slower than his fastball. Were this true, it would be highly unusual. Most major league changeups are 7-10 mph slower than the pitcher’s fastball. A few scouting reports speak of five pitches–two fastballs, a slider, a circle change, and a straight change. The most useful and interesting scouting information I found was an interview from 2006 that Pat Borzi conducted for the Sporting News with Johan Santana and his catcher Joe Mauer.

Santana throws four pitches for strikes-four- and two-seam fastballs between 92 and 95 mph, a slider/curve in the 84- to 87-mph range and a changeup that’s about 15 to 20 mph slower than the fastball. The changeup is his strikeout pitch; when Santana is on, he throws it from the same arm angle and release point as his fastball, and hitters can’t tell the difference until it’s too late.

I also found this quote from Santana interesting given that most people acknowledge his changeup as his best pitch:

“I want to make sure my two-seam fastball is working,” Santana says. “That’s my best pitch, and it’s going to make my other pitches look even better. That’s what I try to do all the time.”

We have detailed data from the PITCHf/x system for 1032 of Santana’s 3345 pitches during the 2007 season. Let’s dive in and see what we can learn about Santana’s repertoire and effectiveness with his various pitches.

Santana has at least three obvious pitch groupings: fastball, changeup, and breaking ball. Here I’ve shown two graphs that I use for pitch classification. The first graph shows the speed of his pitches versus the direction they break, in polar graph format. The second graph shows the movement on his pitches in the last quarter-second before they cross the plate, due to the forces of spin deflection and gravity.

The fastballs run 89-95 mph, and it’s hard to tell from these graphs alone whether Santana really does throw two different fastballs or just one. Through additional analysis, which I will explain shortly, as well as Santana’s own comments, I concluded that he did in fact throw a four-seam and a two-seam fastball and have coded them separately in these graphs.

We can also see that Santana throws two different offspeed pitches. One has a movement very similar to the fastball but is thrown slower at 80-84 mph. This is his changeup. It’s interesting to note that we see a 10 mph difference in speeds between his fastball and his changeup, typical of other major league changeups and nothing like the 15-20 mph difference that was reported by other sources. I don’t know if that was just the stuff of legend or whether Santana has changed his approach in recent years. More likely, people were comparing Santana’s very slowest changeup with his very fastest fastball and writing as if that represented a typical pitching pattern.

I could not find any sign of two different changeups in Santana’s repertoire, at least not two changeups that consistently have different movement or speed.

Santana’s other offspeed pitch is an 83-88 mph breaking ball, described in various scouting reports as either a slider or a curveball. Based on the spin direction, the speed, and the direction of break, it’s very clearly a slider. In the first graph of pitch speed vs. spin deflection angle, the calculation of the spin deflection angle for some of the sliders contains a good deal of error since the spin of those sliders is nearly aligned around the direction of travel of the pitch, resulting in spin deflection of only a couple inches or less. This is one of the classic indicators of a slider.
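To illustrate why the deflection angle becomes unreliable for sliders, here is a rough sketch. It assumes the angle is taken from the horizontal and vertical spin deflection with `atan2`, and that each measurement carries a fixed error of about one inch; both the one-inch figure and the function names are assumptions for illustration, not values from the PITCHf/x system.

```python
import math


def deflection_angle_deg(horizontal_in: float, vertical_in: float) -> float:
    """Direction of the spin deflection, in degrees from the horizontal axis."""
    return math.degrees(math.atan2(vertical_in, horizontal_in))


def max_angle_error_deg(magnitude_in: float, error_in: float) -> float:
    """Worst-case angular error when a deflection of the given size is
    off by `error_in` inches in an arbitrary direction."""
    if error_in >= magnitude_in:
        return 180.0  # the error can point the vector anywhere
    return math.degrees(math.asin(error_in / magnitude_in))


ASSUMED_ERROR_IN = 1.0  # hypothetical measurement error, in inches

for label, magnitude in [("fastball-sized deflection", 10.0),
                         ("slider-sized deflection", 2.0)]:
    err = max_angle_error_deg(magnitude, ASSUMED_ERROR_IN)
    print(f"{label} ({magnitude:.0f} in): angle uncertain by up to {err:.1f} degrees")
```

Under these assumptions the same one-inch error that barely moves a fastball on the polar graph can swing a slider's plotted angle by tens of degrees.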

The sliders and changeups look difficult to separate at the margins in the two graphs I presented above, but including the (x-z component of the) spin rate in the discussion makes that task much easier.

Returning to the topic I mentioned earlier, how did I determine whether Santana threw both a four-seam and a two-seam fastball? Looking at the data in aggregate, it was impossible to see a dividing line, but when I examined the spin and break on a start-by-start basis, a little bit of order appeared out of the murkiness. In some starts, two separate groupings were obvious. In most starts, the dividing line was subtle. In a few cases, it was hard to find a dividing line at all. I did notice that the fastballs with the most sink and the slowest speed were thrown almost exclusively to right-handed hitters, and this, in addition to Santana’s own comments about throwing a two-seamer, gave me confidence in making a distinction between the two fastballs.

If you look at the comments from John Walsh and John Beamer on my Erik Bedard analysis, you’ll see that having to examine the data on a start-by-start basis in order to make an accurate pitch classification diagnosis is a recurring problem. We’d like to be able to look at a pitcher’s season data as a whole. This is an important area for further investigation.

Here are a couple more traditionally-used PITCHf/x graphs of pitch movement for those who are interested:


How does Santana use his pitches to left-handed and right-handed hitters? As a left-handed pitcher, he naturally sees predominantly right-handed hitters, making up 75% of his opponents. To righties, he throws about 41% four-seam fastballs, 35% changeups, 18% two-seam fastballs, and 6% sliders. To lefties, he throws 60% four-seam fastballs, 29% sliders, 7% changeups, and 4% two-seam fastballs. Against righties he’s the stereotypical fastball-changeup Santana that I’ve heard about. Against lefties, he’s a totally different pitcher, eschewing the changeup and the two-seam fastball and relying on a fastball-slider combination.

Next, let’s look at how Santana mixes his pitches in different ball-strike counts. I’ve split this out by batter handedness as well.

Against righties, you can see that the changeup is his favorite pitch with two strikes (57% of the time), and he mixes in his two-seam fastball more if he falls behind in the count (28% when behind vs. 15% when ahead or even).

Against lefties, he relies on the four-seamer about 70% of the time in most situations. With two strikes he feels confident enough to occasionally (14%) introduce the changeup to lefties, and on an 0-2 count, you can count on getting a slider two thirds of the time.

What’s the bottom line–what results does Santana get with his pitches? I attempted for a while to cast the answer to that question in terms of run values for each pitch determined by linear weights, but I’ve postponed that endeavor for the moment. There are too many pieces that I haven’t figured out how to put together yet. So here are the results in the same format I used in the Bedard article.

LHH Ball CS Foul SS InPlay Avg BABIP SLG HR
Fastball 0.32 0.20 0.25 0.10 0.13 0.316 0.188 0.842 0.158
Sinker 0.70 0.10 0.00 0.10 0.10 0.000 0.000 0.000 0.000
Slider 0.34 0.13 0.17 0.17 0.19 0.308 0.308 0.462 0.000
Changeup 0.24 0.06 0.18 0.24 0.29 0.400 0.400 0.400 0.000
RHH Ball CS Foul SS InPlay Avg BABIP SLG HR
Fastball 0.32 0.20 0.26 0.12 0.11 0.235 0.188 0.500 0.059
Sinker 0.35 0.17 0.24 0.06 0.19 0.333 0.250 0.741 0.111
Slider 0.38 0.12 0.24 0.08 0.18 0.111 0.000 0.444 0.111
Changeup 0.32 0.08 0.15 0.31 0.15 0.357 0.325 0.667 0.048
Lg. Avg. Ball CStrk Foul SStrk InPlay Avg BABIP SLG HR
Fastball 0.36 0.19 0.19 0.06 0.19 0.330 0.304 0.521 0.037
Sinker
Slider 0.36 0.14 0.17 0.13 0.20 0.310 0.286 0.481 0.033
Changeup 0.40 0.11 0.14 0.13 0.21 0.319 0.295 0.502 0.035

The league average information comes from John Walsh’s article, and once again I’m using an adaptation of his format to present this information.

The four-seamer is Santana’s bread and butter, especially to lefties, and a good bit of creamy butter it has. He throws it for strikes and gets more swings and misses with it than most pitchers do. Hitters have a hard time putting the four-seamer into play, and when they do, Santana also gets really good results (a .188 BABIP compared to .304 league average BABIP on the fastball), although lefty batters–Hafner, Sizemore, and Thome–did hit three home runs off the four-seamer in our data set. He mostly pounds the zone with the pitch to both lefties and righties, although there appears to be some tendency toward pitching up and away from lefties and up and in to righties.

Santana doesn’t use the two-seamer much against lefties, and when he did, it was mostly for a ball. He works in the zone against righties and gets fairly average results with the two-seam fastball. One surprising thing to note is that he still gives up a lot of fly balls off the two-seamer; almost 70% of balls in play off the two-seamer were fly balls. The two-seamer seems like his weakest pitch based on the results we have from 2007, so I’m not sure I understand his statement from the Sporting News interview that it’s his best pitch.

Just look at all the red bleeding over the graph from the swinging strikes, and you know all you need to know about Santana’s changeup. The hitters can’t hit it. Santana can throw it for strikes just as well as his fastball. He throws it down and away from righties, and he gets a lot of swings and misses when they chase the changeup down out of the strike zone. When he gets it too close to the heart of the zone, they do make decent contact. It almost goes without saying, but this is an outstanding pitch.

Against lefties, Santana uses the slider mostly down and away, and he gets pretty average results with it. Against righties, he features the slider less often. When he does throw it, he keeps it inside. When he gets it up, it gets put in play, but he had fairly good results on a limited sample of balls in play except for one slider that Alex Rios launched 414 feet into the left field seats at the stadium formerly known as SkyDome.

I also looked a bit at pitch sequencing. Here’s a table showing what pitch a hitter is most likely to see from Santana based on what the previous pitch was.

LHH
Previous Pitch Fastball Sinker Slider Changeup
Fastball 66% 4% 26% 4%
Sinker 67% 0% 33% 0%
Slider 60% 9% 27% 4%
Changeup 76% 0% 24% 0%
RHH
Previous Pitch Fastball Sinker Slider Changeup
Fastball 52% 16% 5% 27%
Sinker 46% 21% 3% 31%
Slider 42% 30% 9% 18%
Changeup 43% 15% 8% 34%

I don’t notice any particular patterns to lefties, but to righties he’s more likely to throw the two-seamer after a previous two-seamer, and he’s more likely to throw a changeup after another changeup.
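A sequencing table like the one above can be built by counting consecutive pitch pairs and normalizing each row. The sketch below assumes pairs are counted only within a single plate appearance, which the original analysis does not spell out, and the sample sequences are invented purely to show the mechanics.

```python
from collections import Counter, defaultdict


def transition_percentages(plate_appearances):
    """Map each previous pitch type to the percentage breakdown of the next pitch.

    `plate_appearances` is a list of pitch-type sequences, one per batter faced.
    Pairs never span two plate appearances (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for pitches in plate_appearances:
        for previous, following in zip(pitches, pitches[1:]):
            counts[previous][following] += 1

    table = {}
    for previous, following_counts in counts.items():
        total = sum(following_counts.values())
        table[previous] = {
            pitch: 100.0 * n / total for pitch, n in following_counts.items()
        }
    return table


# Invented example data, not taken from Santana's actual pitch log.
sample = [
    ["Fastball", "Changeup", "Fastball"],
    ["Fastball", "Fastball"],
]

for previous, row in transition_percentages(sample).items():
    cells = ", ".join(f"{pitch} {pct:.0f}%" for pitch, pct in sorted(row.items()))
    print(f"after {previous}: {cells}")
```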

Johan Santana had yet another great season in 2007. He allowed a few more walks and home runs than in previous years, but without PITCHf/x data from previous seasons, I don’t have any way to know whether that was simply luck or a change in his pitching abilities and strategies.

I looked at the 11 home-run balls off Santana for which we have PITCHf/x data, and I couldn’t detect any useful patterns. They were mostly hit off pitches up and over the plate, but that doesn’t come as much of a surprise. Looking at the HitTracker data, he wasn’t burned by many short home runs barely sneaking over the fence, so he wasn’t unlucky in that regard, at least. This may be a topic for further investigation or possibly just the result of Santana being a fly ball pitcher and getting a little unlucky with how hard the hitters hit 33 of those fly balls in 2007.

Santana obviously has an outstanding changeup and a strong fastball, but you probably knew that already. What I didn’t know was how infrequently he uses the changeup against lefties or most of the other nuances of his pitching strategy. Unless you’re Joe Mauer or Mike Redmond (in which case, Hi!), hopefully you feel like you know the best pitcher in baseball a little better than you did before.

If you’re an employee of a Mr. Steinbrenner or a Mr. Henry gathering information for a future trade, by all means feel free to contact me regarding where to send that check for my services. 🙂

Note: This article was originally published at the Statistically Speaking blog at MVN.com on January 3, 2008.  Since the MVN.com site is defunct and its articles are no longer available on the web, I am re-publishing the article here.

Baltimore Orioles ace Erik Bedard had a breakout season in 2007 before being sidelined at the end of August with an injury to his oblique muscle. Prior to his final start, he was receiving strong Cy Young Award consideration with a 13-4 record, a 2.97 ERA, and a league-leading 218 strikeouts and only 135 hits and 52 walks allowed in 176 innings. (His last start, pitching injured, resulted in final season numbers a little worse than these I’ve listed.) Compare his 2007 numbers to his previous career-best season in 2006, when he finished 15-11, with a 3.76 ERA and 171 strikeouts against 196 hits and 69 walks in 194 1/3 innings, and it’s clear he stepped up his game in 2007. How did he do it?

Unfortunately we don’t have any PITCHf/x data from the 2006 regular season, but we can use the PITCHf/x microscope to take a look at what Bedard did in 2007 (in the 701 pitches for which we have detailed data). What does Erik Bedard throw? Let’s take a look at his repertoire by graphing the speed of his pitches versus the direction they break.

Bedard has four pitches, as best I can tell. His famous erstwhile pitching coach will tell you he throws four different types of fastballs, but I can only see a four-seamer and a cutter. Either he throws the sinker and the “comebacker” infrequently, if at all, in game action, or they move too similarly to the four-seamer for me to differentiate them using this data.

I mentioned already that Bedard pitched with an injured oblique muscle in his final start of the year on August 26. Most of the fastballs and cutters with a speed below 90 mph were recorded in that start. He averages just over 93 mph on his four-seam fastball and 92 mph on his cut fastball. In his August 26 start, those clocked at 89 and 88 mph, respectively.

When healthy, his four-seam fastball runs 92-95 mph and breaks away from a right-hander by about 7-11 inches. The four-seamer is one of his two primary pitches to right-handed hitters; he throws it 34% of the time. Against lefties, it’s his third pitch, used only 23% of the time.

His cut fastball runs 90-94 mph and breaks away from a right-hander by about 2-6 inches. The cutter is his primary pitch to lefties, used almost half the time (45%); against righties, it’s his third pitch, used 24% of the time.

Bedard also throws an occasional 80-83 mph changeup, almost exclusively to lefties (7%). Probably his best pitch is a 76-80 mph curveball, which he uses equally to righties (35%) and to lefties (32%).

Let’s take a look at how all his pitches move, including the effect of gravity in addition to spin-induced deflection.

The slower pitches like the curveball and changeup drop more because gravity has longer to act on them.
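A back-of-the-envelope sketch shows the size of this effect. It assumes a constant pitch speed over a 55-foot flight and ignores drag and spin entirely, so the numbers are only indicative; the 55-foot distance and the function name are assumptions, not figures from the PITCHf/x data.

```python
G_FT_PER_S2 = 32.174            # gravitational acceleration
MPH_TO_FT_PER_S = 5280 / 3600   # unit conversion


def gravity_drop_inches(speed_mph: float, distance_ft: float = 55.0) -> float:
    """Drop due to gravity alone over the flight, in inches.

    Assumes constant speed and no drag or spin (a deliberate simplification).
    """
    flight_time_s = distance_ft / (speed_mph * MPH_TO_FT_PER_S)
    return 0.5 * G_FT_PER_S2 * flight_time_s ** 2 * 12.0


for label, mph in [("four-seam fastball", 93.0), ("curveball", 78.0)]:
    print(f"{label} at {mph:.0f} mph: about {gravity_drop_inches(mph):.0f} in of gravity drop")
```

With these assumptions the 78 mph curveball falls roughly a foot more than the 93 mph fastball before any spin deflection is considered.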

I thought it might also be interesting to show something more in line with what I believe the hitter perceives as the “late break” on a pitch, the deflection of the pitch due to both spin and gravity in the last quarter-second before it crosses the plate. Thanks go to Tom Tango for this idea.

This seems to give a more realistic guess at how a hitter might perceive the drop on a curveball compared to a fastball.
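One simple way to approximate this late break is to treat the combined acceleration from spin and gravity as constant, in which case the ball departs from its straight-line path by half the acceleration times the time squared over the final quarter-second. The sketch below follows that assumption; the function name and the two sample accelerations are illustrative and are not measured values for Bedard.

```python
G_FT_PER_S2 = 32.174  # gravitational acceleration


def late_break_inches(accel_ft_per_s2: float, window_s: float = 0.25) -> float:
    """Deviation from a straight-line path over the final `window_s` seconds,
    in inches, assuming constant acceleration (negative means downward)."""
    return 0.5 * accel_ft_per_s2 * window_s ** 2 * 12.0


# Gravity alone, with no spin deflection at all.
print(f"gravity only: {late_break_inches(-G_FT_PER_S2):.1f} in")

# Hypothetical net vertical accelerations: backspin offsets part of gravity
# on a fastball, while topspin adds to it on a curveball.
print(f"fastball-like: {late_break_inches(-15.0):.1f} in")
print(f"curveball-like: {late_break_inches(-40.0):.1f} in")
```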

Let’s take a look at how Erik Bedard mixes his pitches in different ball-strike counts.

We can see that he uses his cutter more often early in the count or when he falls behind and his four-seam fastball more often with two strikes. He uses his curveball equally across almost all counts, except for avoiding it on 3-0 and 3-1 and showing some preference for it on 2-1 and 3-2 counts. His changeup shows up mostly on 0-1 and 1-1 counts; he throws it to righthanders 20% of the time on those two counts and only 3% of the time on other counts.

Here’s a table showing the details by count.

Count Fastball Cutter Changeup Curveball Total
0-0 54 62 4 62 182
0-1 20 26 21 33 100
0-2 27 10 0 15 52
1-0 16 22 4 22 64
1-1 18 20 9 23 70
1-2 35 8 1 24 68
2-0 5 10 0 4 19
2-1 11 5 3 20 39
2-2 24 11 0 25 60
3-0 0 4 0 0 4
3-1 8 8 0 1 17
3-2 4 8 0 14 26
Ahead 82 44 22 72 220
Even 96 93 13 110 312
Behind 44 57 7 61 169
0 strikes 75 98 8 88 269
1 strike 57 59 33 77 226
2 strikes 90 37 1 78 206
Ball 0-1 170 148 39 179 536
Ball 2-3 52 46 3 64 165
Total 222 194 42 243 701
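The summary rows at the bottom of the table can be rebuilt from the twelve per-count rows. The sketch below does that, treating a count as "ahead" when strikes exceed balls, "behind" when balls exceed strikes (so 3-2 counts as behind), and "even" otherwise; that grouping is inferred from the fact that it reproduces the totals in the table.

```python
PITCH_TYPES = ("Fastball", "Cutter", "Changeup", "Curveball")

# (balls, strikes) -> pitch counts in PITCH_TYPES order, copied from the table.
BY_COUNT = {
    (0, 0): (54, 62, 4, 62),
    (0, 1): (20, 26, 21, 33),
    (0, 2): (27, 10, 0, 15),
    (1, 0): (16, 22, 4, 22),
    (1, 1): (18, 20, 9, 23),
    (1, 2): (35, 8, 1, 24),
    (2, 0): (5, 10, 0, 4),
    (2, 1): (11, 5, 3, 20),
    (2, 2): (24, 11, 0, 25),
    (3, 0): (0, 4, 0, 0),
    (3, 1): (8, 8, 0, 1),
    (3, 2): (4, 8, 0, 14),
}


def summarize(predicate):
    """Sum pitch counts over every ball-strike count matching the predicate."""
    rows = [row for (balls, strikes), row in BY_COUNT.items()
            if predicate(balls, strikes)]
    return tuple(sum(column) for column in zip(*rows))


ahead = summarize(lambda b, s: s > b)
even = summarize(lambda b, s: s == b)
behind = summarize(lambda b, s: b > s)
two_strikes = summarize(lambda b, s: s == 2)
total = summarize(lambda b, s: True)

for label, row in [("Ahead", ahead), ("Even", even), ("Behind", behind),
                   ("2 strikes", two_strikes), ("Total", total)]:
    shares = ", ".join(f"{name} {100 * n / sum(row):.0f}%"
                       for name, n in zip(PITCH_TYPES, row))
    print(f"{label} ({sum(row)} pitches): {shares}")
```

Printing shares rather than raw counts makes the pattern described earlier easier to see: the cutter's share rises when Bedard is behind, and the four-seamer's share rises with two strikes.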

Now, let’s examine where in the zone Bedard throws his pitches and what results he gets with them.

LHH Ball CStrk Foul SStrk InPlay Avg BABIP SLG HR
Fastball 0.48 0.16 0.16 0.03 0.16 0.600 0.600 0.600 0.000
Cutter 0.26 0.30 0.20 0.11 0.13 0.250 0.000 1.000 0.250
Changeup - - - - 1.00
Curveball 0.36 0.09 0.20 0.16 0.18 0.250 0.250 0.625 0.000
RHH Ball CStrk Foul SStrk InPlay Avg BABIP SLG HR
Fastball 0.35 0.20 0.21 0.08 0.16 0.258 0.233 0.419 0.032
Cutter 0.34 0.26 0.24 0.02 0.15 0.300 0.263 0.500 0.050
Changeup 0.54 0.05 0.12 0.02 0.27 0.273 0.200 0.636 0.091
Curveball 0.34 0.18 0.16 0.21 0.11 0.227 0.227 0.318 0.000
Lg. Avg. Ball CStrk Foul SStrk InPlay Avg BABIP SLG HR
Fastball 0.36 0.19 0.19 0.06 0.19 0.330 0.304 0.521 0.037
Cutter
Changeup 0.40 0.11 0.14 0.13 0.21 0.319 0.295 0.502 0.035
Curveball 0.40 0.19 0.13 0.11 0.16 0.310 0.290 0.471 0.029

The league average information comes from John Walsh’s article, and I’ve adapted his format in presenting this information. His pitch types probably don’t correspond exactly to mine since he lumps sinkers and cutters in with four-seam fastballs and splitters in with changeups. I believe it’s still helpful to use his league-wide information for comparison since I haven’t established a league-wide baseline on my own yet.

With the four-seam fastball, Bedard mostly works the outside part of the plate, especially to lefties but also to righties. To lefties, he mostly stays up or away with the fastball, out of the strike zone, and when he does get in the zone, he doesn’t have very good results, although the sample size is small. Against righties, he gets very good results with the fastball, holding them to a .233 BABIP and a .419 slugging percentage.

With the cut fastball, Bedard works away from lefties and gets a lot of called strikes and not much good contact, although two cutters in the middle of the zone did go for home runs. Against righties, he’s all around the zone with the cut fastball, and his results aren’t quite as outstanding. He gets a few more foul balls and far fewer swinging strikes, but overall his results with the cutter are still pretty good against righties.

Erik Bedard threw one changeup to a lefty, Lyle Overbay, out of the 701 pitches in our data set, and that resulted in a fly ball out. To righties he works the changeup down and away, mostly out of the strike zone. When he gets it in the zone, they make contact. The changeup looks like Bedard’s weakest pitch.

Bedard throws the curveball down and away to lefties, and he generates a lot of swings with it–foul balls, swings and misses, and balls in play. Against righties he also works down and away but isn’t afraid to throw it in the zone. He gets a lot of swings and misses and when the ball is put in play, it’s hit weakly (.227 AVG and .318 SLG). The curveball is a great pitch for Bedard; no wonder he throws it so much.

I wanted to add a note at the end here about which pitches Bedard used to get his strikeouts. We have PITCHf/x data for 50 of his 221 strikeouts. Of those 50 K’s, 22 of them came on the fastball, 21 on the curveball, and 7 on the cutter. That lines up pretty well, percentage-wise, with his pitch mix with two strikes on the hitter.
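
To put numbers on that comparison, here is a short Python sketch that converts the strikeout counts above and the "2 strikes" row of the count table into percentages. The names are made up for illustration, and the changeup is entered as zero strikeouts because the three listed pitch types already account for all 50.

```python
# Strikeouts recorded by PITCHf/x, by pitch type (from the paragraph above).
strikeouts = {"fastball": 22, "curveball": 21, "cutter": 7, "changeup": 0}
# Two-strike pitch counts, from the "2 strikes" row of the count table.
two_strike = {"fastball": 90, "curveball": 78, "cutter": 37, "changeup": 1}


def shares(counts):
    """Convert raw counts to percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {pitch: round(100 * n / total, 1) for pitch, n in counts.items()}


k_share = shares(strikeouts)
mix_share = shares(two_strike)
for pitch in strikeouts:
    print(f"{pitch:10s} K share {k_share[pitch]:5.1f}%   "
          f"two-strike mix {mix_share[pitch]:5.1f}%")
```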

Hopefully, we’ve learned a little about how Bedard dominated hitters in 2007–a strong fastball/cutter combo and an outstanding curveball. His changeup could use improvement, but it’s his fourth pitch, so that’s really a small complaint. It will be interesting to see if he can maintain the strong performance in 2008 as well as whether he will be doing so as part of the Orioles or on a different team.

Note: This article was originally published at the Statistically Speaking blog at MVN.com on January 14, 2008.  Since the MVN.com site is defunct and its articles are no longer available on the web, I am re-publishing the article here.

Many of you are hopefully familiar with the PITCHf/x system and at least some of the data and analysis that have been produced on the subject over the past year, but it may be completely new to some of you. In either case, I thought it would be helpful to provide an introduction and tutorial on the information that is available. I’ll point toward some existing resources and try to fill in some of the gaps. I’ve divided this primer into sections so you can easily skip to the parts that interest you.

  1. What is PITCHf/x?
  2. How do I get and use the data?
  3. Where can I find resources?
  4. How do I identify pitch types?
  5. How do I interpret graphs?
  6. Is the data reliable?
  7. Where can I go for further discussion and study?

1. What is PITCHf/x?

PITCHf/x is a system developed by Sportvision and introduced in Major League Baseball during the 2006 playoffs. It uses two cameras to record the position of the pitched baseball during its flight from the pitcher’s hand to home plate, and various parameters are measured and calculated to describe the trajectory and speed of each pitch. It was instituted in most ballparks throughout MLB as the 2007 season progressed, such that we have PITCHf/x data for a little over a third of the games from 2007. MLBAM used the PITCHf/x data in their Enhanced Gameday application and also made the data freely available for downloading and research.

In some ways, PITCHf/x is a bridge between scouting and analysis, giving us an objective window into the batter-pitcher matchup at a level we’ve never seen before. In 2008, the system should be installed in every major-league ballpark, and we will hopefully have complete detail for every pitch, although MLB has not committed to whether all the data will continue to be freely available in the future.

2. How do I get and use the data?

If you want to look at the XML data from a single game, you can go to the MLB website and browse through the files. Data is organized by year, month, day, and game. Within each game directory are a number of subdirectories containing the data in XML format. If you want to see the detailed pitch information within the game context, I suggest looking at the files in the inning subdirectory. If you want to see all the pitch information for a particular pitcher, you can go to the pbp/pitchers subdirectory, but you need to know the Elias player ID for your pitcher of interest. If you want to know what the various XML pitch data fields mean, read my glossary.
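
Once a file is saved locally, reading it takes only a few lines. The Python sketch below parses a made-up fragment with the standard library. The element and attribute names (atbat, pitch, pitcher, des, start_speed, pfx_x, pfx_z) are assumptions about the file layout that should be checked against a real file and the glossary, and every value in the sample is invented.

```python
import xml.etree.ElementTree as ET

# A made-up fragment shaped like an inning file. Element and attribute
# names are assumptions; the IDs and numbers are invented for illustration.
SAMPLE = """
<inning num="1">
  <top>
    <atbat num="1" pitcher="000001" batter="000002">
      <pitch des="Called Strike" start_speed="92.4" pfx_x="-5.1" pfx_z="9.8"/>
      <pitch des="Ball" start_speed="78.9" pfx_x="4.2" pfx_z="-6.3"/>
    </atbat>
  </top>
</inning>
"""


def read_pitches(xml_text):
    """Return one dict per <pitch>, tagged with the at-bat's pitcher ID."""
    root = ET.fromstring(xml_text)
    pitches = []
    for atbat in root.iter("atbat"):
        for pitch in atbat.iter("pitch"):
            pitches.append({
                "pitcher": atbat.get("pitcher"),
                "result": pitch.get("des"),
                "speed_mph": float(pitch.get("start_speed")),
                "pfx_x_in": float(pitch.get("pfx_x")),
                "pfx_z_in": float(pitch.get("pfx_z")),
            })
    return pitches


pitches = read_pitches(SAMPLE)
for p in pitches:
    print(p)
```

A real file may omit some attributes on some pitches, so production code would need to handle a missing value before calling float().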

If you want to manipulate and analyze a single game’s worth of data, you can download and import the XML files into a Microsoft Excel spreadsheet. Dr. Alan Nathan has laid out the steps for you at his Physics of Baseball site.

If you want to get a little more hardcore, you can download the XML data for every game in the 2007 season. Using Perl scripts adapted from Joseph Adler’s Baseball Hacks, I downloaded the data and parsed it into a MySQL database. I’ve outlined the steps needed for you to do this yourself and shared the Perl code to give you a head start. (I’m not aware of anyone who’s gotten the Perl-to-MySQL path working on a Mac, so if you have, please drop me a line.)
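
The Perl and MySQL code is not reproduced here. As a rough illustration of the load-then-query idea only, the sketch below swaps in Python and the standard-library sqlite3 module, so it is not the setup described above. The table layout, column names, pitcher IDs, and pitch labels are all hypothetical.

```python
import sqlite3

# Hypothetical rows: (pitcher_id, pitch_type, start_speed_mph).
# IDs, labels, and speeds are invented for illustration.
ROWS = [
    ("000001", "fastball", 92.4),
    ("000001", "fastball", 93.1),
    ("000001", "curveball", 78.9),
    ("000002", "fastball", 89.5),
]

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
conn.execute(
    "CREATE TABLE pitches (pitcher_id TEXT, pitch_type TEXT, start_speed REAL)"
)
conn.executemany("INSERT INTO pitches VALUES (?, ?, ?)", ROWS)

# Pitch count and average speed by pitcher and pitch type.
query = """
    SELECT pitcher_id, pitch_type, COUNT(*), AVG(start_speed)
    FROM pitches
    GROUP BY pitcher_id, pitch_type
    ORDER BY pitcher_id, pitch_type
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
conn.close()
```

The same GROUP BY query should carry over to MySQL largely unchanged; only the connection code differs.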

3. Where can I find resources?

Probably the most popular and valuable PITCHf/x resource on the web is Josh Kalk’s collection of player cards. Josh has classified every pitch as either a fastball, sinker, cutter, splitter, changeup, slider, curve, or knuckleball using a clustering algorithm and made graphs of pitch speed, movement, and release point for every pitcher with at least 100 pitches recorded by PITCHf/x. Strike zone charts are available for hitters. This is a great resource that reminds me in some ways of Wikipedia: the depth, breadth, and accuracy of the information are amazing, doubly so since it’s free, but the accuracy isn’t perfect, and it’s worth keeping that in mind. Stuff that looks quirky to you may in fact be quirky. (Felix Hernandez does not throw a 100-mph splitter.)

Josh Kalk has also developed a PITCHf/x tool that allows you to query his database for a specific subset of pitches and plot their strike zone location.

The Hardball Times published a pitch identification tutorial by John Walsh that is a good introduction to the general PITCHf/x topic as well as the specific topic of pitch identification.

Dr. Alan Nathan’s Physics of Baseball site has a lot of interesting resources, including some PITCHf/x-related material.

4. How do I identify pitch types?

Some people are good at identifying pitch types while at the ballpark or from the center field TV camera view. That was a splitter. That was a sinker. That was a slider. Etc. I am not one of those people. If you are not one of those people either, PITCHf/x was made for you. Even if you are one of those people, PITCHf/x can be a useful resource for learning about how different pitches move.

A pitcher’s fastest pitch is usually a four-seam fastball. A typical major-league fastball is around 90 mph, many a little faster, some a little slower. The fastball from a right-handed pitcher breaks in toward a right-handed hitter. Pitches from a lefty move the opposite way; a fastball from a lefty breaks away from a right-handed hitter. I’ll describe the movement for pitches from a righty and you can flip the orientation if you want to know how a similar pitch from a lefty would behave.

Pitchers throw variations of the fastball by changing the grip on the baseball or parts of their motion and delivery. The most popular variation is a two-seam fastball, which is often thrown a couple mph slower and breaks in more and drops more to a right-handed hitter from a right-handed pitcher than the four-seamer. The cut fastball is also thrown a few mph slower than the four-seamer and breaks away a little from a right-handed hitter, if it breaks at all.

The most popular off-speed pitch is the changeup, which is typically thrown 7-10 mph slower than a pitcher’s fastball. It usually has a similar break to the fastball, in toward a right-handed hitter. Some pitchers employ a grip on their changeup to impart additional movement, usually causing the pitch to break in more and drop more to a right-handed hitter. The split-finger fastball acts much like a changeup except that its velocity and movement are usually somewhere between the fastball and changeup.

Breaking balls include the slider and curveball. The slider is usually thrown at the same speed as the changeup or sometimes a few mph faster. The movement on the slider can vary quite a bit from one pitcher to another. Some sliders move like a cutter, with hardly any left-right break. Other sliders move more like a curveball, which breaks away from a right-handed hitter and down. The curveball is the slowest pitch, thrown in the 65-80 mph range in major league baseball.

The knuckleball is a special case in major league baseball these days. As far as I know, there were only two regular practitioners of the pitch in the majors last year: Tim Wakefield and Charlie Haeger. The pitch is thrown with very little spin such that the airstream interaction with the seam orientation causes the baseball to move unpredictably. Wakefield and Haeger throw the knuckleball about 65-70 mph.

Of course, there are a number of variations and combinations of the above pitches and specialty pitches like the screwball and gyroball and even the 50-mph Orlando Hernandez eephus pitch.
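
The rules of thumb above can be turned into a toy classifier. The Python sketch below is only that: every cutoff is an illustrative guess rather than a measured threshold, the function and argument names are made up, and it lumps four-seam and two-seam fastballs together and ignores splitters and knuckleballs. It assumes a right-handed pitcher and a sign convention chosen for this sketch, where positive horizontal break means in toward a right-handed hitter and positive vertical break means up.

```python
def guess_pitch(speed_mph, break_in, rise, fastball_mph):
    """Rough pitch label for one pitch from a right-handed pitcher.

    speed_mph    -- speed of this pitch
    break_in     -- horizontal spin deflection in inches,
                    positive = in toward a right-handed hitter
    rise         -- vertical spin deflection in inches, positive = up
    fastball_mph -- the pitcher's typical four-seam fastball speed

    All cutoffs are illustrative guesses, not measured thresholds.
    """
    gap = fastball_mph - speed_mph   # how much slower than the fastball
    moves_in = break_in > 2.0        # clear break toward a right-handed hitter

    if gap < 5.0:                    # fastball family
        return "fastball" if moves_in else "cutter"
    if moves_in:                     # slower, but breaks like the fastball
        return "changeup"
    if gap > 10.0 and rise < 0.0:    # slowest pitch, breaking down and away
        return "curveball"
    return "slider"


print(guess_pitch(92.0, 6.0, 9.0, fastball_mph=92.0))    # fastball family
print(guess_pitch(76.0, -5.0, -6.0, fastball_mph=92.0))  # slow, down and away
```

Fixed cutoffs like these break down quickly from one pitcher to the next, which is one reason the player cards mentioned earlier rely on a clustering algorithm instead.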

Here is a plot showing the vertical and horizontal spin deflection (a.k.a. “break”) of typical pitches from a right-handed pitcher, as viewed from the catcher’s point of view. A mirror image would give you the plot for a left-handed pitcher. You can use this as a key for interpreting some of the graphs on Josh Kalk’s player cards or for understanding the spin-induced movement on various types of pitches.

5. How do I interpret graphs?

PITCHf/x analysis and research is a promising field with wide application and broad interest, and there are a number of people who have made important contributions in the first year of analysis. As a result, there are many different formats for presenting the results. I’ll summarize and explain a few of them here and give a more detailed explanation of some of the graphs that I use most frequently.

The most common plots presented by other PITCHf/x researchers include information about the speed and spin-induced deflection of pitches. To the best of my knowledge, Joe Sheehan was the first to produce these plots, showing speed on the vertical axis and the two components of spin deflection as two sets of points on the horizontal axis. Joe hasn’t done much pitch classification work recently, but he deserves a nod as the groundbreaker in that field.

Something you’re more likely to encounter these days is a plot from John Walsh, such as those contained in his pitch identification tutorial. He plots vertical “movement” versus horizontal “movement”, where movement refers to the spin-induced deflection, and indicates speed by color-coding the points on the graph.

Most common of all are the plots from Josh Kalk’s pitcher cards, particularly the plots of vertical “break” versus horizontal “break”. These are similar to John Walsh’s plots except that instead of color-coding for speed, the points on the graph are color-coded by pitch type. Josh has separate graphs that plot speed versus horizontal break and speed versus vertical break, reminiscent of the original Sheehan plots. Josh’s player cards also contain information on release point, defined as the height and left-right position of the pitch measured 50 feet from home plate, soon after the actual release by the pitcher.

In the past I have presented graphs similar to those of Sheehan and Kalk, but more recently I’ve adopted a graph from Alan Nathan as my mainstay. It is a polar plot, with the speed of the pitch on the radial axis. The faster the pitch, the farther from the center. The slower the pitch, the closer to the center. The angle is the angle of the Magnus force, which is the force that causes the ball to break. Curveballs break down, so they’ll be in the bottom part of the graph. Sliders break away from a right-handed hitter, so they’ll be on the left side of the graph. The Magnus force of a fastball pushes the ball up, causing it to drop less than it normally would due to gravity alone, so the fastballs will be on the top part of the graph.
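
As a rough sketch of how one pitch lands on such a plot, the Python function below converts speed and the two spin-deflection components into a radius and an angle. The function name and the sign convention (positive horizontal deflection toward the right side of the graph, positive vertical deflection toward the top) are assumptions made for this example, and the sample numbers are invented.

```python
import math


def polar_point(speed_mph, defl_x, defl_z):
    """Return (radius, angle_deg) for one pitch on a speed-vs-direction plot.

    defl_x, defl_z -- horizontal and vertical spin deflection, in any
    consistent unit. Assumed convention: +x is the right side of the graph
    and +z is the top. The angle runs counterclockwise from the +x axis,
    from 0 up to (but not including) 360 degrees.
    """
    angle_deg = math.degrees(math.atan2(defl_z, defl_x)) % 360.0
    return speed_mph, angle_deg


# Invented examples: a pitch deflected straight up, one straight down,
# and one deflected toward the left side of the graph.
for name, speed, dx, dz in [("fastball", 92.0, 0.0, 10.0),
                            ("curveball", 76.0, 0.0, -8.0),
                            ("slider", 84.0, -3.0, 0.0)]:
    print(name, polar_point(speed, dx, dz))
```

Only the direction of the deflection sets the angle; its size drops out, and the pitch speed alone sets the distance from the center.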

I’ve also started showing a graph of what I call “late break”, which is a combination of the effects of spin deflection and gravity as well as the speed of the pitch. The goal is to show something close to what the hitter perceives as the break or movement of the pitch. I calculate the deflection of the pitch due to two forces, spin and gravity, in the last 0.25 seconds of its trajectory before it crosses the plate, an idea I got from Tom Tango. I chose a quarter second because that’s roughly the reaction time of a batter executing a swing. I chose to include the effect of gravity because I believe that more accurately reflects what hitters see. Hitters don’t attempt to hit a gravity-less pitch; they attempt to hit a pitch that’s being affected by gravity and being deflected by spin.
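
The exact formula isn't given here, so the Python sketch below is one simple way to approximate the idea, not a reproduction of the calculation behind the graphs. It assumes constant pitch speed, constant acceleration from spin, and spin deflection values accumulated over a 50-foot flight. Under constant acceleration, deflection grows with the square of time, so the share that occurs in the final window is (window / flight time) squared, and gravity adds a drop of one half g t squared over the same window. All names and default values are assumptions.

```python
G_FT_S2 = 32.174  # gravitational acceleration in ft/s^2


def late_break(speed_mph, spin_x_in, spin_z_in,
               distance_ft=50.0, window_s=0.25):
    """Deflection in inches over the last `window_s` seconds of flight.

    Assumptions (not taken from the article): constant speed over
    `distance_ft`, constant spin acceleration, and spin_x_in / spin_z_in
    are the spin deflections accumulated over that whole distance.
    Returns (horizontal, vertical); negative vertical means a drop.
    """
    speed_ft_s = speed_mph * 5280.0 / 3600.0
    flight_s = distance_ft / speed_ft_s
    window_s = min(window_s, flight_s)
    # Deflection under constant acceleration scales with time squared.
    scale = (window_s / flight_s) ** 2
    gravity_drop_in = 0.5 * G_FT_S2 * window_s ** 2 * 12.0
    return spin_x_in * scale, spin_z_in * scale - gravity_drop_in


# Invented example pitches: a 92-mph fastball with 10 inches of upward spin
# deflection and a 76-mph curveball with 6 inches of downward spin deflection.
print(late_break(92.0, -5.0, 10.0))
print(late_break(76.0, 4.0, -6.0))
```

Because gravity contributes the same drop to both pitches over the final quarter second, the gap in late vertical movement between the two comes out smaller than the gap in their raw spin deflections.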

6. Is the data reliable?

Whenever you are viewing or analyzing PITCHf/x data, it’s worth keeping in mind that 2007 was a work in progress for Sportvision and MLBAM. They instituted the system in only a handful of stadiums to begin the year and added more systems in other stadiums, particularly in the second half of the year, as they gained confidence in the performance and accuracy of PITCHf/x. They experimented with measuring the initial point of the pitch trajectory at various distances from home plate, finally settling on 50 feet. They worked to identify and remove spurious data that was collected by the system. They trained operators who did such things as identifying the beginning of play in each half inning and setting the top and bottom of each batter’s strike zone in the system. In addition, the camera systems were sometimes recalibrated, possibly at the beginning of each home stand.

So it’s a bit naive to assume the data we have is a perfectly objective, accurate, and precise measure of each pitch. In most cases, it’s pretty close (within an inch or two) and good enough–much better than anything we’ve ever had before! But what are some of the sources of error to watch out for?

The data for some pitches is missing. In some cases this is obvious, when a stadium doesn’t have a system for part of the year, for example. Other times, portions of games will be missing, or even just individual pitches. Perhaps the operator did not turn the system on for the first pitch of the inning, or MLB/Sportvision retroactively discovered an error in their data and removed it. We are also missing PITCHf/x data for all hit batsmen during the regular season.

There is erroneous data–spurious or mis-measured pitches. For example, the data may say that a pitch was released from ten feet off the ground, and unless Gumby has caught on with a major league team, I doubt any pitcher can reach that high. There are a number of 30-40 mph pitches that are recorded in the data that do not appear to be realistic. It’s been suggested that some of these may have been the system inadvertently recording other non-pitch throws of the baseball between the mound and the plate as a pitch.
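
A simple sanity filter catches the most obvious of these cases. In the Python sketch below, the dictionary keys and both cutoffs are hypothetical choices made for illustration. The speed floor is set low enough to keep a genuine 50-mph eephus while dropping the 30-40 mph records.

```python
def looks_plausible(pitch, min_speed_mph=45.0, max_release_height_ft=8.0):
    """True if speed and release height fall inside loose sanity bounds.

    `pitch` is a dict with hypothetical keys "speed_mph" and
    "release_height_ft". Both cutoffs are illustrative guesses.
    """
    return (pitch["speed_mph"] >= min_speed_mph
            and 0.0 < pitch["release_height_ft"] <= max_release_height_ft)


# Invented sample records.
sample = [
    {"speed_mph": 92.4, "release_height_ft": 6.1},   # ordinary fastball
    {"speed_mph": 34.0, "release_height_ft": 5.8},   # likely a non-pitch throw
    {"speed_mph": 88.0, "release_height_ft": 10.0},  # impossible release height
    {"speed_mph": 51.0, "release_height_ft": 6.3},   # slow but believable
]
kept = [p for p in sample if looks_plausible(p)]
print(len(kept), "of", len(sample), "pitches kept")
```

A filter like this only removes records that are impossible on their face. It does nothing about the subtler park and calibration biases.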

There are indications of park and/or camera system bias. Data from Seattle and Toronto indicate pitch speeds that seem a few mph higher than they should be. Look how hard Dustin McGowan and Felix Hernandez are shown to have thrown on average. These guys are hard throwers, but not that hard. Similarly, the system at Fenway Park seems to have underestimated pitch speeds and otherwise collected strange data.

There are also altitude and temperature effects. In this case, the data collected by PITCHf/x may be completely correct, but our interpretation of the data has to take into account that air density affects how a pitched baseball moves. A curveball thrown in the thin air of Denver, Colorado won’t break as much as the same curveball thrown in the pea soup at sea level.

7. Where can I go for further discussion and study?

If you want to learn more about the details of Sportvision’s PITCHf/x system and MLB’s implementation, read this article by Mark Newman of MLB.com.

If you want to learn more about the physics of pitched baseballs, Alan Nathan is your man, and his freshman physics lectures on the Physics of Baseball at the University of Illinois are an excellent place to begin. You might also find these articles by Dave Baldwin and Terry Bahill helpful.

If you want to learn more about pitch classification methods, as I mentioned earlier, John Walsh’s pitch identification tutorial is a good place to start. You may also want to consult my survey of the topic, which covers my own work on the subject in particular depth.

If you want to discuss PITCHf/x with other sabermetricians, I recommend The BOOK Blog run by Tom Tango.

If you want to learn about systematic error correction for the PITCHf/x data set, read Josh Kalk’s posts at his blog, and this post by Ike Hall, including comments by Alan Nathan.

If you want to learn about pitch sequencing analysis, Joe P. Sheehan’s Command Post at Baseball Analysts is a good resource, including these posts on the topic. Joe Sheehan’s writing is an excellent resource on a number of diverse PITCHf/x topics. Although I only listed him here under pitch sequencing, it’s well worth going through his archives on many other topics if you are interested in learning about PITCHf/x.

Dan Fox’s work is another great PITCHf/x resource, although, as with Joe, I couldn’t find a neat category to file him under. He’s covered everything from pitch classification to measures of strike zone judgment.

If you want to learn about pitching styles, strategies, and repertoires throughout baseball history, I highly recommend reading the Neyer/James Guide to Pitchers, published in 2003. Rob Neyer has updates to the book at his blog.

As he does every year, Tom Tango is compiling the Fans’ Scouting Report. He is seeking help from baseball fans to rate the defensive abilities of the players they have watched this season.

Baseball’s fans are very perceptive. Take a large group of them, and they can pick out the final standings with the best of them. They can forecast the performance of players as well as those guys with rather sophisticated forecasting engines. Bill James, in one of his later Abstracts, had the fans vote in for the ranking of the best to worst players by position. And they did a darn good job.

There is an enormous amount of untapped knowledge here. There are 70 million fans at MLB parks every year, and a whole lot more watching the games on television. When I was a teenager, I had no problem picking out Tim Wallach as a great fielding 3B, a few years before MLB coaches did so. And, judging by the quantity of non-stop standing ovations Wallach received, I wasn’t the only one in Montreal whose eyes did not deceive him. Rondel White, Marquis Grissom, Larry Walker, Andre Dawson, Hubie Brooks, Ellis Valentine. We don’t need stats to tell us which of these does not belong.

What I would like to do now is tap that pool of talent. I want you to tell me what your eyes see. I want you to tell me how good or bad a fielder is. Go down, and start selecting the team(s) that you watch all the time. For any player that you’ve seen play in at least 10 games in 2009, I want you to judge his performance in 7 specific fielding categories.

If you’ve watched a lot of baseball in 2009, or at least enough to meet the guidelines, please participate in compiling this valuable resource.

I have a couple scouting reports up at the Hardball Times based on data from PITCHf/x, one on Scott Kazmir and the other on Cole Hamels.

I also highly recommend Matt Lentzner’s article at THT on his theory of pitching mechanics.

I’ve been doing a few other things behind the scenes that haven’t seen publication here or at THT, but I’m still involved in baseball analysis and writing, in case you were wondering.  You can look for my article on Cliff Lee in the upcoming Hardball Times Annual 2009, which will be available November 30.

As I write this, the All-Star game goes to the 15th inning, and I go to bed. I’ll check the box score in the morning.

Over at the Hardball Times, I take a look at the bases loaded, two out, bottom of the ninth, one run lead situation going back to 1956.

It’s mostly just a fun historical research article, with some numbers gathered from Retrosheet and the Baseball-Reference Play Index and a dash of PITCHf/x at the end for flavor.