My article at Hardball Times on Danny Herrera’s screwball includes views of his pitch trajectories as seen from the right-handed and left-handed batter’s boxes.

I mentioned in the References section that I did some trigonometry to transform the coordinate system from plate view to batter’s box view.

Here is what I did.

The pitch trajectory is shown as the dotted black line. Any point on the trajectory can be calculated using the initial position, velocity, and acceleration provided in the PITCHf/x data, along with the equations of motion. Only the x-y plane is shown above since no transformation was done to the z axis. The coordinates in the PITCHf/x coordinate space are x and y, shown in black.

The coordinates in the batter’s box view are x’ and y’, shown in red. The y-axis in the batter’s box view runs along a line from the batter’s head to the pitcher’s approximate release point (the average x value of his pitches at y = 55 feet). The x-axis in the batter’s box view is set perpendicular to this new y-axis.

The origin of the batter’s box view is offset 2.8 feet in the x direction from the origin in PITCHf/x coordinate space. I calculated 2.8 feet from the center of the plate as the approximate location of the batter’s head, based on a video frame capture in Marv White’s presentation at the PITCHf/x Summit. I chose not to offset the origin in the y direction for simplicity, although I also believe this does not introduce any significant inaccuracy. The batter’s head is typically within a foot or so of y=0.

First, I calculated the quantity m, the distance to the baseball, shown by the blue line. This distance m = sqrt ( y^2 + ( x + 2.8 ft)^2 ).

Next, I found the value of the angle alpha. The angle alpha = arctan ( 55 ft / ( x0 + 2.8 ft) ).

The angle (alpha – theta) = arctan ( y / ( x + 2.8 ft) ), which allows us to calculate the angle theta.

The angle theta = arctan ( 55 ft / ( x0 + 2.8 ft) ) – arctan ( y / ( x + 2.8 ft) ).

The batter’s box coordinates x’ and y’ can be found from the angle theta and the distance m. The new y’ = m * cos (theta), and the new x’ = m * sin (theta).
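The steps above can be sketched in a few lines of Python. This is a minimal sketch rather than the exact calculation I ran: the function name `batter_view` and its parameter names are made up for illustration, and it uses `atan2` instead of the arctan of a ratio. That gives the same angle whenever the denominators (x + 2.8 ft and x0 + 2.8 ft) are positive, and it avoids dividing by zero.

```python
import math

HEAD_OFFSET_FT = 2.8   # x offset of the batter's head from the PITCHf/x origin
RELEASE_Y_FT = 55.0    # y distance at which the release point x0 is taken


def batter_view(x, y, x0, head_offset=HEAD_OFFSET_FT, release_y=RELEASE_Y_FT):
    """Return (x', y') in the batter's box view for a ball at (x, y).

    x0 is the pitcher's average x value at y = 55 feet.
    """
    dx = x + head_offset
    # m: distance from the batter's head to the ball
    m = math.hypot(dx, y)
    # alpha: angle of the line from the batter's head to the release point
    alpha = math.atan2(release_y, x0 + head_offset)
    # theta: angle between that line and the line to the ball
    theta = alpha - math.atan2(y, dx)
    return m * math.sin(theta), m * math.cos(theta)


# A ball sitting exactly at the release point lies on the new y-axis.
x_prime, y_prime = batter_view(x=-1.5, y=55.0, x0=-1.5)
```

In that usage example x' comes out as zero and y' equals the distance m, which is a quick sanity check on the signs.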

I am happy for you to use my method for batter’s view transformation if you provide attribution in the form of my name and/or a link to this website.


I have two new articles up at the Hardball Times.

The first is a short article on THT Live breaking down Francisco Liriano’s April 13 start against the Kansas City Royals.

The second is an article examining the ways in which ball tracking technologies like PITCHf/x are changing the game and what kinds of analysis are possible with this new data. It’s an expansion on my opening day laundry list of ideas that I posted here.

I posted a brief evaluation of the MLBAM pitch classification algorithm on the THT Live blog. So far I am not impressed with the system, but maybe there is hope for some improvement.

Update 4/11: Dan Fox reports that some improvements have been instituted for the MLB classification system this week. I’m in the process of taking a look at some data for a few other pitchers. This new data set includes a few starts from Thursday, April 10, which I believe should be covered under the improved algorithm that incorporates information about the pitches in a pitcher’s repertoire. I’ll report back if and when I learn something from this study.

I’ve posted an introduction and tutorial on various PITCHf/x topics at MVN.

  1. What is PITCHf/x?
  2. How do I get and use the data?
  3. Where can I find resources?
  4. How do I identify pitch types?
  5. How do I interpret graphs?
  6. Is the data reliable?
  7. Where can I go for further discussion and study?

Once again building on pitch identification work I’ve done for a pitcher, here is Part 2 of the series on Joba Chamberlain. It’s not exactly all I hoped, for reasons I’ll get to in a moment, but there are some interesting things to be learned. This is similar to previous work I’ve done for Josh Beckett and Eric Gagne.

First, let’s look at which pitches Chamberlain uses in various ball-strike counts.

Count Fastball Slider Change Curve #Pitches
0-0 71% 24% 1% 4% 79
0-1 58% 26% 3% 13% 38
0-2 41% 59% 0% 0% 17
1-0 88% 12% 0% 0% 34
1-1 65% 32% 3% 0% 31
1-2 39% 61% 0% 0% 23
2-0 89% 11% 0% 0% 9
2-1 65% 29% 0% 6% 17
2-2 21% 68% 0% 11% 19
3-0 100% 0% 0% 0% 2
3-1 100% 0% 0% 0% 3
3-2 50% 50% 0% 0% 10
Ahead 49% 44% 1% 6% 78
Even 62% 33% 2% 4% 129
Behind 79% 20% 0% 1% 75
0 strikes 77% 19% 1% 2% 124
1 strike 63% 28% 2% 7% 89
2 strikes 36% 61% 0% 3% 69
Ball 0-1 65% 30% 1% 4% 222
Ball 2-3 55% 40% 0% 5% 60
All 63% 32% 1% 4% 282

Chamberlain Pitch Mix by Count
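For anyone who wants to build a table like this from their own pitch list, here is a rough Python sketch. The input format (tuples of balls, strikes, and pitch type) and the function names are assumptions for illustration. The Ahead/Even/Behind grouping follows what the row totals above imply: more strikes than balls is Ahead, equal is Even, and more balls than strikes is Behind, so 3-2 counts as Behind.

```python
from collections import Counter, defaultdict


def count_state(balls, strikes):
    """Classify a ball-strike count from the pitcher's point of view."""
    if strikes > balls:
        return "Ahead"
    if balls > strikes:
        return "Behind"
    return "Even"


def pitch_mix(pitches):
    """Percentage of each pitch type per count and per count state.

    pitches: iterable of (balls, strikes, pitch_type) tuples.
    """
    tallies = defaultdict(Counter)
    for balls, strikes, pitch_type in pitches:
        tallies[f"{balls}-{strikes}"][pitch_type] += 1
        tallies[count_state(balls, strikes)][pitch_type] += 1
    mix = {}
    for group, counter in tallies.items():
        total = sum(counter.values())
        mix[group] = {pt: round(100 * n / total) for pt, n in counter.items()}
    return mix


# Made-up sample, not real data.
sample = [(0, 2, "Slider"), (0, 2, "Fastball"), (0, 2, "Slider"), (3, 2, "Fastball")]
mix = pitch_mix(sample)
```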

Joba Chamberlain definitely relies on his fastball, which is probably not unusual for a power pitcher out of the bullpen, but he throws his slider much more often with two strikes. In a 2-2 count, you can almost expect a slider (68%). I don’t think we have enough data on his use of the curveball to draw conclusions about that. You can compare my data to Josh Kalk’s, although my data set includes Chamberlain’s two divisional series appearances, and Josh’s algorithm classifies all of Chamberlain’s off-speed pitches as sliders, whereas I have identified his curveball and changeup separately.

Next, let’s look at the results split up by pitch type and batter handedness.

vs. LHH
Pitch      Ball   CS  Foul   SS  IPO  IPNO   TB  BABIP  SLGBIP  Strk%  Con%
Fastball     38   22    14    3   13     7   10  0.350   0.500    61%   92%
Slider       13    3     5   14    3     0    0  0.000   0.000    66%   36%
Changeup      2    0     0    0    0     0    0      -       -     0%     -
Curveball     5    2     0    2    0     0    0      -       -    44%    0%
Total        58   27    19   19   16     7   10  0.304   0.435    60%   69%

vs. RHH
Pitch      Ball   CS  Foul   SS  IPO  IPNO   TB  BABIP  SLGBIP  Strk%  Con%
Fastball     27   15    18    7    9     4    7  0.308   0.538    66%   82%
Slider       18    8     5   18    4     0    0  0.000   0.000    66%   33%
Changeup      0    0     0    0    1     0    0  0.000   0.000   100%  100%
Curveball     1    1     0    0    0     0    0      -       -    50%     -
Total        46   24    23   25   14     4    7  0.222   0.389    66%   62%

CS=called strike, SS=swinging strike, IPO=in play (out), IPNO=in play (no out), TB=total bases, BABIP=batting average on balls in play (including home runs), SLGBIP=slugging average on balls in play (including home runs). For Strk% all pitches other than balls are counted as strikes. Con% = (Foul+IPO+IPNO)/(Foul+IPO+IPNO+SS).
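The rate columns can be recomputed from the counts. The sketch below assumes the seven counts in each row are, in order, balls, called strikes, fouls, swinging strikes, in-play outs, in-play non-outs, and total bases; that ordering is inferred from the legend and it reproduces the printed percentages. It also treats every in-play non-out as a hit, which is an assumption. The function name is made up for illustration.

```python
def pitch_results(ball, cs, foul, ss, ipo, ipno, tb):
    """Recompute Strk%, Con%, BABIP and SLGBIP from the result counts.

    Rates with a zero denominator are returned as None.
    """
    total = ball + cs + foul + ss + ipo + ipno
    in_play = ipo + ipno
    contact = foul + in_play
    return {
        # every pitch other than a ball counts as a strike
        "Strk%": (total - ball) / total if total else None,
        # Con% = (Foul+IPO+IPNO)/(Foul+IPO+IPNO+SS)
        "Con%": contact / (contact + ss) if contact + ss else None,
        "BABIP": ipno / in_play if in_play else None,
        "SLGBIP": tb / in_play if in_play else None,
    }


# First fastball row from the table above.
fastball = pitch_results(38, 22, 14, 3, 13, 7, 10)
```

Rounded, that row gives 61% strikes, 92% contact, a .350 BABIP and a .500 SLGBIP, matching the table.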

The first thing that jumps out is, of course, the results for his slider. Wow! Just wow. In the PITCHf/x games, at least, nobody got a hit off of it, and hardly anybody managed to put it into play or even foul it off. The only real negative would be that it seemed like he had a little trouble throwing his curveball for strikes, but given that he only walked nine men in 27 and 2/3 innings, that doesn’t seem a big concern.

Next let’s look at the strike zone charts showing where Joba Chamberlain locates his pitches against left-handed hitters and right-handed hitters. I’m keeping the same formatting for these charts as I did in the Beckett and Gagne analyses. The strike zone is shown as a box, including one radius of a baseball on each side of the plate, and the top and bottom of the zone are a general average not adjusted per batter in these charts. The location is plotted where the pitch crossed the front of home plate.
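As a rough illustration of that box, here is a Python sketch of an in-zone check. The plate is 17 inches wide; the ball radius of about 1.45 inches is an approximation, and the top and bottom heights are hypothetical placeholders, since the general average used for the charts is not stated here. The function and constant names are made up for this example.

```python
PLATE_HALF_WIDTH_FT = 8.5 / 12.0   # half of the 17-inch plate
BALL_RADIUS_FT = 1.45 / 12.0       # approximate baseball radius
ZONE_BOTTOM_FT = 1.5               # hypothetical placeholder
ZONE_TOP_FT = 3.5                  # hypothetical placeholder


def in_zone(px, pz, bottom=ZONE_BOTTOM_FT, top=ZONE_TOP_FT):
    """True if a pitch crossing the front of the plate at (px, pz) feet is in the box."""
    half_width = PLATE_HALF_WIDTH_FT + BALL_RADIUS_FT
    return abs(px) <= half_width and bottom <= pz <= top
```

With these placeholder limits, a pitch at x=0.8, z=2.5 just catches the edge of the box, while one at x=1.83, z=0.35 is well outside it.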

Let’s begin with the fastball.

Chamberlain Fastball Results

Chamberlain works mostly on the outer half of the plate with the fastball to lefties, and he’s more in the zone to righties, although he also comes up and in to righties. Batters seem to be able to handle his fastball fairly well, not swinging and missing very often and having pretty good success when they do put the ball in play, similar to what we saw with Josh Beckett’s four-seam fastball. I don’t have a good idea yet how this compares to league-wide numbers for all pitchers’ fastballs or even to a significant number of other hard throwers.

Next, let’s look at the seldom-used curveball and changeup. I’ll present these without comment since there isn’t much data to discuss.

Chamberlain Results for Curveballs

Chamberlain Changeup Results

Finally, let’s move on to what you’ve all been waiting for: the famous Joba slider.

Chamberlain Slider Results

This is where this avenue of inquiry starts to go downhill. After looking at this graph, I wanted to talk about how Chamberlain gets a lot of swings and misses on his slider down and away to righties and down and in to lefties.

But I was bugged by the swinging strike that was recorded nearly at the lefty batter’s foot (x=1.83, z=0.35). Was a hitter so badly fooled by a slider that he swung at one at his shoe top? It’s certainly possible, but if so, I wanted to see it. So I brought up the footage for the game, September 23rd against Toronto, where Chamberlain entered with two on and two out in the 8th inning to face left-handed Adam Lind, trying to preserve a 7-5 Yankees’ lead. Jumping to the end of the story, Chamberlain throws Lind five straight sliders to strike him out and end the inning.

Unfortunately, however, the pitch locations recorded by PITCHf/x for these pitches were mistakenly attached to the wrong pitches in the Gameday XML data.

Chamberlain Lind PITCHf/x at bat

Chamberlain Lind Actual at bat

The first pitch of the at bat was a belt-high slider just inside that Lind swung at and missed, followed by a second pitch in almost the same location, with the same result. Next, Chamberlain threw two sliders at Lind’s feet; the second of these landed in the dirt. Lind laid off both of those pitches to even the count at 2-2. Finally, Chamberlain threw a slider down and in, labeled pitch #5 in the second graph, which Lind swung at and missed for strike three.

The XML pitch location data for this game seems to have missed the fourth pitch (the one in the dirt) altogether and added an extraneous pitch, labeled #3 in the first graph, that did not occur in the pitch sequence to Lind. Then the order of the other pitches is out of whack, too. The pitch labeled #1 should be #5, #2 should be #1, #4 should be #2, and #5 should be #3.

The conclusion is that, no, Chamberlain did not get Adam Lind to swing at a slider at his shoe tops. He did get him to swing at a pitch down and in that would have been Ball 3 if he let it go by, and it was an impressive pitching performance by Chamberlain, but unfortunately it calls into question the integrity of our data set.

I don’t have any way to verify the integrity of the rest of the data without watching endless hours of games. That may seem like a worthy endeavor to some, and I can’t argue too strenuously with them, but alas, the rest of my non-baseball life seems to think it has some importance, too.

I don’t intend my notation of this example in any way to disparage the incredible work that MLBAM and Sportvision have done in creating this data set and making it available to us. For free, no less. It’s an incredibly valuable resource, and some errors are to be expected during a season in which the system was being evaluated and debugged.

I just don’t know how prevalent these kinds of errors are and when they might call into question some of my conclusions. I do know that Eric Van spotted a similar error in Josh Beckett’s data from Game 1 of the division series, as detailed in this thread at Sons of Sam Horn, post #88. The PITCHf/x data in question for that game has since been removed from the data set altogether. Eric mentions plotting the human-generated x,y coordinates against the computer-generated PITCHf/x coordinates as a way to spot these errors, but in our case with the Chamberlain-Lind at bat, the human-generated coordinates look screwy to me, too. I haven’t applied Eric’s method to a larger data set, so it may still have merit.

While we’re on this subject, I may as well put in a plug for Josh Kalk’s new PITCHf/x batter-pitcher matchup tool. You can look at the Chamberlain-Lind matchup there for yourself. It doesn’t tell you anything I didn’t show here, but I wanted to make sure all my readers were aware this great tool was available.

Update: Cory Schwartz from MLBAM addresses the PITCHf/x data error here.

Back in August when I was first getting started with PITCHf/x analysis, I took a quick look at a young and highly-touted rookie who had just broken into the big leagues. Sportvision had not yet brought the PITCHf/x system online at his home park, but we had data from one relief appearance this pitcher made on the road, and I used that to get a quick and dirty read on his pitches.

Oddly enough, that quick analysis has been the source of more search engine hits than any other pitcher analysis on my blog so far. That may have something to do with his team, the New York Yankees. I’m starting to feel like I only write about Red Sox or Yankees around here, and today I’m going to continue pandering to the masses with this update on Joba Chamberlain.

When I looked at Chamberlain’s two-inning appearance in August, he was mainly a fastball-slider pitcher with possibly a couple changeups in that outing, and his fastball was hitting the upper 90’s. With a full season’s data, the basic picture remains the same: upper 90’s fastball, hard-breaking slider. But it looks like he’s relying more on a curveball as an off-speed pitch to lefties, and his changeup has hardly been seen since. In addition, we have enough data to look at usage patterns for his different pitch types and the results he gets from each of them, although that may have to wait until a separate article.

Let’s start by identifying his pitch types. Regular readers will know by now that I like to begin this process by graphing pitch speed versus the direction of the spin axis, which determines the direction the pitch will move due to spin. With some Excel help from Tom Tango, I’m going to try this in a bit different format, one that hopefully makes more intuitive sense to the reader as opposed to the hard-core PITCHf/x researcher. I’m putting the data on a polar plot, showing it from the pitcher’s viewpoint, and graphing the direction of the spin force rather than the direction of the spin axis.

Chamberlain Pitch Speed vs. Spin Force Direction

The backspin on a fastball causes it to “rise”, i.e., drop less than a pitch without backspin. The sidespin on a slider makes it break away from a right-handed hitter, and the topspin on a curveball makes it drop more than normal. Hopefully that is clearer from the above polar plot than from the old way I presented this information, which I’ll show below for the sake of comparison (the angles are different).

Chamberlain Pitch Speed vs. Spin Axis Direction

I’d appreciate feedback on whether the new graphing method is easier to understand or any other comments or suggestions you may have.

Joba Chamberlain’s main pitch is a 95-100 mph fastball, delivered roughly from the 1 o’clock position. His fastball has a little bit of cutting action, moving away from a right-handed hitter more than a typical four-seam fastball, but I wouldn’t say he’s throwing a cutter, as such.

His second pitch is a slider with a lot of break, running 84-89 mph. Some of his sliders look almost like very hard curveballs. He throws the slider more to righties (39% of pitches) than to lefties (26%), but he definitely relies on it to both.

After the two changeups we saw in Joba’s August 10th appearance in Cleveland, the PITCHf/x system only recorded one more, thrown on August 24th to retire Placido Polanco on a fly ball to center field. Those three changeups were thrown around 83 mph. There’s not much else to say about changeups with that small sample.

Upon revisiting Joba Chamberlain, I was surprised to find him using an occasional curveball, mostly to lefties. His curve looks pretty typical, running 77-80 mph.

We can also look at how fast the pitches spin.

Chamberlain Pitch Speed vs. Spin Rate

What’s impressive about some of those fastballs is that they almost do actually rise–the spin force imparted by 3200 revolutions per minute is almost enough to keep a 99-mph pitch from dropping at all due to gravity. In fact, by my calculations, the rise due to spin came within one inch of counteracting gravity on eight of Joba’s fastballs. That seems impressive to me. I’ve hardly looked at every hard thrower in baseball, and I know J.J. Putz generates some similar numbers, but I don’t think it’s very common.

The other thing to note in this graph is the low spin rate of the slider. In truth, the slider spins much faster, but much of the spin is around the direction of travel (like the spin on a nicely-spiraling football) due to how the slider is thrown. We can’t measure that component of the spin, but fortunately, that’s the component of the spin that also has little effect on how the pitch moves. When I talk about spin rate around here, that’s short hand for the x- and z-components of the spin, that portion of the spin which affects the direction the ball will break. I don’t always mention it, but it bears repeating occasionally. The slower measured “spin” of the slider is often one easy way to differentiate it from a curveball.

Finally, let’s look at the movement on the pitches. This graph shows the movement due to spin (the Magnus force) and gravity, from the perspective of the pitcher.

Chamberlain Pitch Movement with Gravity

Joba Chamberlain’s slider really has amazing break and his fastball has a lot of hop. I can see why he is regarded as a special talent.

With that, I’ll sign off for now. Hopefully I can czech in again soon with the next part of this series, and I’m grateful you gave me a piece of your time.

Note: For those of you who are interested in reproducing this sort of analysis for yourself (or finding errors in my math), you can download the Excel spreadsheet that I used.

Update: You can read Part 2 here.

Building on the pitch identification I did for Josh Beckett, I wanted to dig a little deeper into how he used his pitches and what results he got, similar to how I did with Eric Gagne.

First, let’s look at which pitches Beckett threw in various counts:

Count 4-seam 2-seam Cutter Change Curve #Pitches
0-0 48% 18% 1% 10% 22% 408
0-1 29% 26% 3% 11% 31% 197
0-2 33% 24% 2% 5% 36% 107
1-0 43% 18% 1% 16% 23% 160
1-1 34% 24% 3% 9% 31% 156
1-2 38% 20% 2% 3% 36% 161
2-0 52% 22% 0% 7% 20% 46
2-1 38% 30% 0% 16% 16% 74
2-2 33% 28% 2% 2% 35% 106
3-0 71% 29% 0% 0% 0% 14
3-1 65% 24% 9% 0% 3% 34
3-2 55% 24% 0% 7% 13% 67
Ahead 33% 24% 2% 7% 34% 465
Even 43% 21% 2% 9% 26% 670
Behind 48% 23% 1% 11% 17% 395
0 strikes 48% 19% 1% 11% 21% 628
1 strike 35% 26% 3% 10% 27% 461
2 strikes 38% 24% 2% 4% 32% 441
Ball 0-1 40% 21% 2% 10% 28% 1189
Ball 2-3 46% 26% 1% 6% 20% 341
All 41% 22% 2% 9% 26% 1527

Beckett Pitch Mix by Count

We can see that he used his curveball more often when he got ahead of hitters, and he leaned more on his four-seam fastball over his two-seam fastball when he got behind in the count. I should mention that I’m including post-season and All-Star game data, which is probably one reason my numbers differ a little from Josh Kalk’s.

Now, let’s look at results by pitch type. Here I’ve split the data up by handedness of the batter.

vs. LHH
Pitch      Ball   CS  Foul   SS  IPO  IPNO   TB  BABIP  SLGBIP  Strk%  Con%
4-seam FB   137   71    79   26   27    16   30  0.372   0.698    62%   82%
2-seam FB    30   20    22    8   21    14   19  0.400   0.543    74%   88%
Cutter        4    1     5    2    1     2    2  0.667   0.667    73%   80%
Changeup     30    5    14   15   13     8   10  0.381   0.476    65%   70%
Curveball    59   45    16   19   16     4    6  0.200   0.300    63%   65%
Total       260  142   136   70   78    44   67  0.361   0.549    64%   79%

vs. RHH
Pitch      Ball   CS  Foul   SS  IPO  IPNO   TB  BABIP  SLGBIP  Strk%  Con%
4-seam FB    85   57    60   18   36    16   29  0.308   0.558    69%   86%
2-seam FB    69   36    51   10   41    15   19  0.268   0.339    69%   91%
Cutter        4    1     1    3    1     1    1  0.500   0.500    64%   50%
Changeup     17   12     7    5    5     4    5  0.444   0.556    66%   76%
Curveball    98   64    23   29   22     6   12  0.214   0.429    60%   64%
Total       273  170   142   65  105    42   66  0.286   0.449    66%   82%

CS=called strike, SS=swinging strike, IPO=in play (out), IPNO=in play (no out), TB=total bases, BABIP=batting average on balls in play (including home runs), SLGBIP=slugging average on balls in play (including home runs). For Strk% all pitches other than balls are counted as strikes. Con% = (Foul+IPO+IPNO)/(Foul+IPO+IPNO+SS).

Next are strike zone charts showing where he locates his pitches against left-handed hitters and right-handed hitters. I’m keeping the same formatting for these charts as I did in the Gagne analysis, but let me know if you have ideas for how I can improve them. The graphics are a little small, but I thought it was more important to contrast the general patterns of lefty versus righty than to see the exact result for a specific pitch.

The strike zone is shown as a box, including one radius of a baseball on each side of the plate, and the top and bottom of the zone are a general average not adjusted per batter in these charts. The location is plotted where the pitch crossed the front of home plate.

Let’s start with the fastballs. First the four-seamer. (As I mentioned in my previous analysis of Beckett, the line between the four-seamer and two-seamer is a hazy one; although I think my distinction is generally accurate, it is unlikely to be accurate for every specific pitch.)

Beckett 4-seam Fastball Strike Zone Chart

Beckett likes to work the 4-seamer away from lefties, and it looks like he gets a lot of foul balls at the edge or just off the edge of the plate. He also gets a lot of balls, mostly outside it looks, and other than the curveball it’s his pitch that gets the fewest strikes at 62%. Overall, lefties hit the pitch pretty well–a .372 batting average when they put it in play, and with plenty of power. I wish I knew how that compared to other pitchers’ fastballs, but I don’t have those numbers. Clearly, context is important for numbers like these.

Against righties I don’t see a clear inside/outside preference, although he seems to work up in the zone more than down. He’s also more effective at getting strikes with the pitch against righties (69%).

Moving on to the two-seamer…

Beckett 2-seam Fastball Strike Zone Chart

The first thing that jumps out is that he’s almost twice as likely to use the 2-seamer against righties as against lefties. Against lefties, he’s in the zone with it a lot, and it gets hit fairly hard. Against righties, it looks to be his most effective pitch, generating a lot of ground balls when he gets it on the inner half of the plate. Against righties, he got 27 ground outs with his two-seamer compared to 14 outs in the air (pop outs, line outs, and fly outs). He also gave up 7 ground ball hits and 8 hits in the air from his two-seamer against righties. Again, I don’t know if those numbers are significant or how they compare to other pitchers.

Beckett’s least-used pitch is the cutter, so the graphs for it are not terribly interesting, but I’ll show them here.

Beckett Cut Fastball Strike Zone Chart

It looks like he mostly works the cut fastball down and in to lefties and up in the zone to righties, but it’s hard to find any meaningful trends in 26 pitches. He struck out Jay Gibbons on a cutter down and in, and…well, I don’t really have anything more to say about the cutter.

Now for a change of pace…

Beckett Changeup Strike Zone Chart

It’s obvious he likes to keep the changeup down and away from lefties, and he gets a lot of swings and misses that way, particularly when he keeps it down. Against righties, he keeps the ball down but works both sides of the plate. He gets quite a few called strikes on the outer half of the plate.

Finally, we come to Uncle Charlie, Beckett’s other favorite pitch and probably his most effective.

Beckett Curveball Strike Zone Chart

Against lefties, Beckett gets a lot of called strikes across the middle of the zone. He comes down and in a lot, and gets a fair number of swinging strikes when he keeps the curve low to lefties. I’m not sure what to think about the curveballs up and away. I thought those might be hanging breaking balls, but I don’t notice anything unusual when I look at how they moved relative to other curveballs. Maybe he was just hoping to drop those pitches into the top of the strike zone since the location data I’m graphing here was measured at the front of the plate.

Against righties, he’s out of the zone a little more, either down and away or up and in. Again, he gets a lot of called strikes in the zone and swinging strikes when he keeps the curveball down–it’s his most effective pitch for missing bats. Even when hitters put the curveball in play, they don’t have much success–a .208 batting average and a .375 slugging percentage on balls in play, counting hitters from both sides of the plate.

Boston Red Sox pitcher Josh Beckett put together an amazing postseason performance this year to help the Sox to a World Series victory. He won Game 1 of the Division Series against the LA of Anaheim Angels with a 4-hit shutout. He took MVP honors in the American League Championship Series, winning Game 1 and Game 5 against the Cleveland Indians. He then topped it all off with a Game 1 victory against the streaking Colorado Rockies, who fell to the Red Sox in a four-game sweep. Beckett’s outstanding postseason was merely the cherry on top of a fine regular season which put him in contention for the AL Cy Young Award.

I decided to take a look at Josh Beckett through the lens of PITCHf/x for several reasons. One, he is an outstanding pitcher and as such is interesting to analyze. Two, he’s known as basically a two-pitch pitcher: fastball, curveball. Granted, they must be two very good pitches, but that’s unusual for a top-flight starting pitcher. Three, I was inspired by some of the discussion between Eric Van and Alan Nathan at the Sons of Sam Horn message board after his dominant Division Series shutout. Van and Nathan identified three types of fastballs among Beckett’s pitches in that game, and I was curious if I could or would see the same thing. Finally, I wanted to try out a few new ideas regarding pitch classification, and Beckett seemed like as good a vehicle as any.

First, the scouting report on Beckett. As I mentioned earlier, Beckett is basically a two-pitch pitcher. He has a mid-nineties fastball and an excellent curveball. He also mixes in a changeup and supposedly has a slider. To quote Dugout Central, “Beckett hasn’t used his slider much in the second half. It’s an inconsistent pitch with average break to it.” In the 1530 pitches recorded by the PITCHf/x system, about half the pitches Beckett threw this year and weighted more heavily toward the second half, I find no evidence that Beckett threw even a single slider.

Without further ado, I’ll present my pitch speed versus spin direction graph showing Beckett’s pitch types. (For more detail about how I derive those parameters, read my article about Jonathan Papelbon.) Then, I’ll discuss how I arrived at the pitch classification shown. Pitch speed is in mph, measured 50 feet from the plate, and spin direction is the direction of the spin axis of the pitch in degrees, as seen by the batter and catcher, with the direction of corresponding spin-induced movement to a right-handed batter noted at the bottom of the graph.

Beckett Pitch Speed vs. Spin Direction

Beckett’s curveball is easy to identify. It’s a classic curveball, running 75-80 mph. I won’t spend much more time on the curveball, since for pitch classification purposes it’s typically the easiest pitch to identify, and Beckett’s is no exception. I will note the absence of sliders on the graph. The closest Beckett comes is a few slurvy curveballs, but given their slow speed and sub-110-degree spin direction, I don’t see any reason to try to split them off as a separate pitch type. They move like normal curveballs from any other pitcher.

Separating the changeups from the fastballs is where it starts to get a little interesting. Beckett doesn’t throw many of them, less than 10% of our sample, totaling 135 pitches. Speed-wise, his changeup shows quite a bit of variation, ranging from 83-93 mph. Clearly the slower pitches are changeups, but as we get up around 90 mph and into the low nineties, how can we tell the changeups from the fastballs? In Beckett’s case, it’s quite helpful to look at the data on a start-by-start basis. Almost every start has a clear separation of pitches into three groups by speed, with the changeups in the middle group.

Beckett Pitch Speed vs. Time

I’ve highlighted the middle speed group in red. The x-axis in this graph is pitch sequence recorded by PITCHf/x throughout the season, which roughly corresponds to time, such that each of Beckett’s starts is compressed into a vertical line of the pitches that were recorded at various speeds. With the exception of a few pitches, it’s quite easy to separate the fastballs, changeups, and curveballs using this graph alone.

On a side note, I don’t know the reason for the almost 6-mph variability in Beckett’s top fastball speed from start to start. It may be an avenue for further investigation, possibly one already covered by Joe P. Sheehan’s article or Josh Kalk’s error correction, but in any case it is an avenue I will bypass for now.

If you adjust the pitch speeds by normalizing the average fastball speed for each start to the overall average fastball speed and adjusting other pitches from that start by the same amount, you get the following pitch speed vs. spin direction graph:

Beckett Normalized Pitch Speed vs. Spin Direction
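The normalization described above amounts to an additive shift per start. Here is a minimal Python sketch of it, assuming each pitch is a dict with `start`, `speed`, and `is_fastball` keys; those field names and the function name are hypothetical. Starts with no fastballs are left unshifted, which is my own choice for the sketch.

```python
from collections import defaultdict
from statistics import mean


def normalize_speeds(pitches):
    """Shift every pitch in a start so that start's average fastball
    speed matches the overall average fastball speed."""
    overall = mean(p["speed"] for p in pitches if p["is_fastball"])
    fastballs_by_start = defaultdict(list)
    for p in pitches:
        if p["is_fastball"]:
            fastballs_by_start[p["start"]].append(p["speed"])
    normalized = []
    for p in pitches:
        speeds = fastballs_by_start.get(p["start"])
        # same shift for every pitch type thrown in that start
        shift = overall - mean(speeds) if speeds else 0.0
        normalized.append({**p, "norm_speed": p["speed"] + shift})
    return normalized


# Made-up speeds for two starts, not real data.
sample = [
    {"start": "A", "speed": 96.0, "is_fastball": True},
    {"start": "A", "speed": 78.0, "is_fastball": False},
    {"start": "B", "speed": 92.0, "is_fastball": True},
    {"start": "B", "speed": 74.0, "is_fastball": False},
]
result = normalize_speeds(sample)
```

In the made-up sample both starts end up with a 94 mph fastball and a 76 mph off-speed pitch, so the start-to-start offset is gone while the gap between pitch types within a start is preserved.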

You can see that within a given start, Beckett’s fastball speed is very consistent in the (normalized) 93-97 mph range. This is true for all three types of fastballs, which is quite unusual. Typically, a pitcher’s two-seam fastball will be a few miles per hour slower than his four-seamer, and a cutter may be even a few mph slower than that. For those pitchers, pitch speed can be an important clue in determining pitch type. That’s one reason that the pitch speed vs. spin direction graph is such a favorite of mine.

For Beckett, however, pitch speed is nearly useless in determining what types of fastballs he throws. In Van and Nathan’s Sons of Sam Horn discussion, they identify three types of fastballs: a four-seamer, a two-seamer, and a cut fastball, and in that October 3rd start the three types are fairly readily identifiable. In many of his other starts, however, the differences among them are not as obvious.

We can see a hint of the three groupings in the normalized pitch speed vs. spin direction graph, but it’s far from clear. The spin rate vs. spin direction graph proves to be a little more helpful. Spin rate is shown in revolutions per minute (rpm).

Beckett Spin Rate vs. Spin Direction

In order to see what’s going on with the fastballs, let’s zoom in on that section of the graph.

Beckett Spin Rate vs. Spin Direction for Fastballs

The divisions between the fastball groupings are somewhat arbitrary, but I believe they are generally well representative of reality. There is a dense cluster of pitches in the middle of the graph between 210-220 degrees, which is an appropriate spin direction for a four-seam fastball from a pitcher with a 1 o’clock delivery. Two-seam fastballs should have a greater spin direction, reflecting the sidespin commonly applied to the ball using the two-seam grip. The cluster of pitches around 230-240 degrees appears to be two-seamers. Finally, there is a small tail of pitches with spin direction of less than 200 degrees and slightly lower spin rates. These appear to be the cut fastballs. Some of them can clearly be identified as cutters if you look at the pitches on a start-by-start basis. You might argue a little with the exact delineation between pitch types here, but I think I’ve nailed it pretty closely. I can’t imagine trying to hit a 95-mph fastball from Beckett while trying to decide whether it would hop, sink, or cut.
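Purely as an illustration of how divisions like these could be applied to a list of fastballs, here is a toy Python sketch. Only the "less than 200 degrees" boundary for cutters comes from the description above; the 225-degree cut between four-seamers and two-seamers is a hypothetical value placed between the two clusters, and the sketch ignores spin rate and the start-by-start review that went into the actual classification.

```python
CUTTER_MAX_DEG = 200.0      # from the text: cutters have spin direction below 200
FOUR_SEAM_MAX_DEG = 225.0   # hypothetical cut between the 210-220 and 230-240 clusters


def classify_fastball(spin_direction_deg):
    """Label a fastball by spin direction alone (a deliberate simplification)."""
    if spin_direction_deg < CUTTER_MAX_DEG:
        return "cutter"
    if spin_direction_deg < FOUR_SEAM_MAX_DEG:
        return "four-seam"
    return "two-seam"
```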

Finally, I wanted to add in a couple novelties and solicit your feedback on whether these new methods of presenting the data are helpful. These ideas mainly have their genesis in various discussions about the PITCHf/x data presentation topic at Tom Tango and Mitchel Lichtman’s The Book blog.

First, here is a graph of pitch time vs. spin direction. The horizontal axis is the same as for most of the graphs above. The vertical axis shows the time, in seconds, for the pitch to travel from shortly after the pitcher’s release point until the ball crosses the front of home plate. (In the PITCHf/x coordinate system, this is from the initial measurement point at y=50 feet to the final measurement point at the front of home plate at y=1.417 feet. The origin of the coordinate system is at the point of home plate, and the plate is 17 inches, or 1.417 feet, deep.) You could make a graph showing the time for the pitch to travel any other distance you wanted to see. The main point of this graph is to illustrate what the pitches look like on a time scale rather than the more familiar speed scale. Just over a third of a second to hit a Beckett fastball–wow!

Beckett Pitch Time vs. Spin Direction

Next, I revisit the traditional PITCHf/x vertical and horizontal “break” parameters, pfx_x and pfx_z, or as Tom Tango has at least temporarily convinced me to call them, the horizontal and vertical spin movement. This graph shows the spin-induced movement of the pitches, in inches, between the y=40 feet point and the front of the plate, from the perspective of the batter/catcher.

Beckett Vertical Spin Movement vs. Horizontal Movement

For comparison, take a look at Josh Kalk’s algorithmically generated player card for Josh Beckett. He has a lot of good data there that you may find interesting, although he does not separate Beckett’s fastballs by type. The main point of the preceding graph, though, is to set up a contrast with one that Tango and MGL have asked for: the spin plus gravity movement, i.e., the deviation of the pitch from a straight line, which I present below.

Beckett Vertical vs. Horizontal Spin plus Gravity Movement

This graph includes all the information of the traditional pfx_z vs. pfx_x graph, but it also shows that slower pitches drop more because the force of gravity has longer to act on them. Some of Beckett’s curveballs break down almost three feet when the effect of gravity is included.
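The difference between the two kinds of movement can be sketched in a few lines of Python. This is a simplified version that assumes constant acceleration, treats the deviation from a straight line over time t as 0.5*a*t^2, and ignores the small drag components in x and z. The function names and the sample pitch parameters are hypothetical, chosen only to look like a fastball and a curveball.

```python
import math

G = 32.174  # acceleration of gravity, ft/s^2


def time_to_y(y0, vy0, ay, y_target):
    """Seconds to travel from y0 to y_target under constant acceleration."""
    dy = y0 - y_target
    if abs(ay) < 1e-12:
        return -dy / vy0
    return (-vy0 - math.sqrt(vy0 * vy0 - 2.0 * ay * dy)) / ay


def movement_inches(y0, vy0, ax, ay, az, y_start=40.0, y_end=1.417):
    """Movement in inches between y_start and y_end.

    spin_x, spin_z: deviation with gravity removed (the pfx-style numbers).
    total_z: deviation from a straight line with gravity left in.
    """
    t = time_to_y(y0, vy0, ay, y_end) - time_to_y(y0, vy0, ay, y_start)
    half_t2_in = 0.5 * t * t * 12.0  # 0.5*t^2, converted from feet to inches
    return {
        "t": t,
        "spin_x": ax * half_t2_in,
        "spin_z": (az + G) * half_t2_in,  # add g back to isolate the spin effect
        "total_z": az * half_t2_in,       # spin plus gravity
    }


# Hypothetical pitches: az already includes gravity, as in the fitted data.
fastball = movement_inches(y0=50.0, vy0=-135.0, ax=-8.0, ay=30.0, az=-15.0)
curve = movement_inches(y0=50.0, vy0=-105.0, ax=5.0, ay=20.0, az=-38.0)
print(fastball)
print(curve)
```

Because the gravity term grows with the square of the flight time, the slower curveball picks up noticeably more drop than the fastball even before its topspin is counted.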

That’s all for now. I hope to take a further look at some of this data when I get a chance.

Here’s a link to Part 2 of my Josh Beckett analysis.

Building on the previous pitch identification work I did for Eric Gagne, let’s take a look at how he uses his pitches.

Here are some strike zone charts showing where he locates his pitches against left-handed hitters and right-handed hitters. I’m experimenting with the formatting of these charts, so let me know if there are things I can do to improve that. The graphics are a little small, but I thought it was more important to contrast the general patterns of lefty versus righty than to see the exact result for a specific pitch.

The strike zone is shown as a box, including one radius of a baseball on each side of the plate, and the top and bottom of the zone are a general average not adjusted per batter in these charts. The location is plotted where the pitch crossed the front of home plate.
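Here is a minimal Python sketch of the box as described, for checking whether a pitch location falls inside it. The plate is 17 inches wide; the ball radius of 1.45 inches and the top and bottom heights of 3.5 and 1.6 feet are assumptions standing in for the general averages used in the charts, and the function name is hypothetical.

```python
PLATE_HALF_WIDTH_IN = 17.0 / 2.0
BALL_RADIUS_IN = 1.45  # assumption: roughly half of a 2.9-inch ball diameter
ZONE_HALF_WIDTH_FT = (PLATE_HALF_WIDTH_IN + BALL_RADIUS_IN) / 12.0


def in_zone(px, pz, bottom=1.6, top=3.5):
    """True if the pitch crossed the front of the plate inside the box.

    px, pz: horizontal and vertical location in feet at the front of
    home plate. bottom and top are assumed averages, not per-batter.
    """
    return abs(px) <= ZONE_HALF_WIDTH_FT and bottom <= pz <= top


print(round(ZONE_HALF_WIDTH_FT, 3))
print(in_zone(0.8, 2.5), in_zone(0.9, 2.5))
```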

Let’s begin with the curveball.

Gagne Curveball Location

Gagne faces about an equal number of lefties and righties, so the first thing we can see is that he uses his curveball much more against righties (19% of pitches to RHH) than against lefties (11% of pitches to LHH).

He works away from righties and in to lefties with the curve and has pretty good success getting righties to chase the curve down and away.

Next, let’s look at fastballs.

Gagne Fastball Location

Gagne throws fastballs at roughly the same rate to lefties (56% of pitches) and righties (51%). He likes to work away to both lefties and righties but seems a little more willing to come inside to lefties. He gets a lot of foul balls and contact in the zone with his fastball and, other than a few high pitches, not a lot of balls chased out of the zone.

Moving on to changeups…

Gagne Changeup Location

Gagne throws the changeup about equally to lefties (30%) and righties (25%). He clearly likes to keep his changeup down and away to both lefties and righties. It looks like he has pretty good success with that, but when he gets up in the zone with the change, it starts to get hit, particularly against lefties. We’ll see in a moment if the numbers bear that out, but the strike zone charts certainly look that way.

Finally, let’s look at the sliders.

Gagne Slider Location

Gagne doesn’t throw many sliders: 2% of pitches to lefties and 5% of pitches to righties. The few he throws to lefties are inside, same as with the curveball, perhaps more so. With righties he works down and away with the slider, mostly missing the zone. It looks more like a show-me pitch than anything.

It would be interesting to learn on what count he throws the slider, or any of the other pitches for that matter, but I haven’t compiled that information yet.

Here are the results by pitch type and batter handedness in tabular format.

vs. LHH    Ball   CS Foul   SS  IPO IPNO   TB  BABIP  SLGBIP  Strk%
Fastball     66   38   31   10   17    5    7  0.227   0.318    60%
Changeup     31   10   12   17    9   11   13  0.550   0.650    66%
Curveball    15    9    4    2    4    0    0  0.000   0.000    56%
Slider        2    0    2    1    1    0    0  0.000   0.000    67%

vs. RHH    Ball   CS Foul   SS  IPO IPNO   TB  BABIP  SLGBIP  Strk%
Fastball     44   22   45    9   23   11   17  0.324   0.500    71%
Changeup     28    8   14   17    6    3    6  0.333   0.667    63%
Curveball    23   15    7    7    2    2    2  0.500   0.500    59%
Slider       10    2    2    1    0    1    1  1.000   1.000    38%

CS=called strike, SS=swinging strike, IPO=in play (out), IPNO=in play (no out), TB=total bases, BABIP=batting average on balls in play (including home runs), SLGBIP=slugging average on balls in play (including home runs).  For Strk% all pitches other than balls are counted as strikes.
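The rate columns follow directly from the counts. Here is a minimal Python sketch, assuming the seven count columns in each row are balls, called strikes, fouls, swinging strikes, IPO, IPNO, and TB, in that order; the function name is hypothetical. The example uses the first fastball row of the table.

```python
def rate_stats(ball, cs, foul, ss, ipo, ipno, tb):
    """Compute BABIP, SLGBIP, and Strk% from raw pitch-result counts."""
    in_play = ipo + ipno
    total = ball + cs + foul + ss + in_play
    return {
        # Balls in play that did not produce an out, per ball in play.
        "BABIP": ipno / in_play if in_play else float("nan"),
        # Total bases per ball in play.
        "SLGBIP": tb / in_play if in_play else float("nan"),
        # Everything other than a ball counts as a strike.
        "Strk%": 100.0 * (total - ball) / total if total else float("nan"),
    }


stats = rate_stats(ball=66, cs=38, foul=31, ss=10, ipo=17, ipno=5, tb=7)
print(f"{stats['BABIP']:.3f} {stats['SLGBIP']:.3f} {stats['Strk%']:.0f}%")
```

With so few balls in play for the curveball and slider, those BABIP and SLGBIP values say very little on their own.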

Clearly, my earlier hunch was right: Gagne’s changeup gets tattooed when he gets it up in the zone against lefties. Also, it does look like the slider is mainly a show-me pitch to righties.

As the regular season ends and the playoffs approach, I’m looking at a few of the playoff-bound pitchers, and I want to share the results for one of those pitchers–Eric Gagne.

I begin with identifying his pitch types, and as time permits, I’ll move on from there in further posts. I was able to identify four main pitch types that Gagne has thrown this year in the 39 games (of 54 total) for which we have PITCHf/x data recorded.

The graph I find most helpful in quickly identifying pitch types is pitch speed versus spin direction. For more detail on the methodology, read my post on Gagne’s teammate Jonathan Papelbon.

Gagne Pitch Speed vs. Spin Direction

Gagne’s pitch mix is about 53% fastballs, 28% changeups, 15% curveballs, and 4% sliders.

His fastball looks like a classic four-seamer delivered from about 1 o’clock and running 89-95 mph. I don’t see any evidence of a two-seam fastball, but he could probably hide a handful of them in there without me being able to spot them as a unique pitch.

His changeup is interesting. He calls it a Vulcan changeup because of the V grip he uses, and there are some definite similarities to a split-finger or forkball pitch in terms of the significant sidespin, inclined about 50 degrees more than his fastball. Speed-wise his changeup ranges from about 80-87 mph.

His other major pitch is a curveball, hitting about 67-73 mph with good topspin, and also thrown from about a 1 o’clock delivery.

His occasional slider seems very inconsistent. It runs about 82-86 mph, but its spin axis is all over the place, ranging from 120 degrees (great sidespin) to 210 degrees (no sidespin at all other than that from the 1 o’clock delivery).

There are five pitches out of a total of 601 in our PITCHf/x dataset for Gagne that I could not classify into the aforementioned four pitch types. Three of them appear to be data collection errors based on unrealistic release points, and I’ve eliminated them from the dataset. Before I discuss the other two pitches further, here are two additional graphs: pitch speed vs. spin rate and spin rate vs. spin direction.

Gagne Pitch Speed vs. Spin Rate

Gagne Spin Rate vs. Spin Direction

On these two graphs you can see two unidentified pitches as well as additional details about the four main pitch types.

I’ve tentatively labeled the two unidentified pitches as a slurve and a forkball. The “slurve” pitch was thrown with a speed on the borderline between the curve and slider groupings. Its spin direction makes it look almost like a curve, but its slow spin rate makes it look almost like a slider. My best guess is that Gagne was attempting to throw a curveball but gave it a little more slider action than normal.

The pitch I’ve labeled “forkball” looks quite a bit like his other changeups except for the fact that it has a spin rate of only 500 rpm, and that’s reminiscent of a forkball or split-finger pitch. It doesn’t quite fit with the sliders given its spin direction of 224 degrees. We’ve already seen that Gagne is inconsistent with the amount of sidespin he gets on his slider, but this would be sidespin in the wrong direction for a slider. Given his changeup grip, it wouldn’t surprise me to see him throw a changeup that looks pretty forkball-ish.

If you want to compare my data with the work of others, you can check out the player card that Josh Kalk generated for Eric Gagne using his clustering algorithm and data normalization. Below is my graph of vertical “break” vs. horizontal “break” with the pitch types labeled according to my classification. Josh’s algorithm lumps what I call sliders in with Gagne’s changeups.

Gagne Vertical Break vs. Horizontal Break

I realize the above graph is not presented in a terribly intuitive fashion in terms of what the vertical break, particularly, means. I have some ideas for helping to clarify that, but for now I’ll just present that graph as is.

There is a lot more that can be done with this data, but I’ve found before that if I try to do it all in one fell swoop, I don’t publish anything, so I’ll start with this.

Update: Part 2 of the series on Gagne.
