My article at Hardball Times on Danny Herrera’s screwball includes views of his pitch trajectories as seen from the right-handed and left-handed batter’s boxes.

I mentioned in the References section that I did some trigonometry to transform the coordinate system from plate view to batter’s box view.

Here is what I did.

The pitch trajectory is shown as the dotted black line. Any point on the trajectory can be calculated using the initial position, velocity, and acceleration provided in the PITCHf/x data, along with the equations of motion. Only the x-y plane is shown above since no transformation was done to the z axis. The coordinates in the PITCHf/x coordinate space are x and y, shown in black.
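For readers who want to try this in code, here is a minimal Python sketch of the constant-acceleration equations of motion. It is not the code behind the graphs in the article. The field names follow the PITCHf/x initial position, velocity, and acceleration parameters; the `Pitch` class, the `position` function, and the sample numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    # Initial position (ft), velocity (ft/s), and acceleration (ft/s^2),
    # named after the corresponding PITCHf/x fields.
    x0: float
    y0: float
    z0: float
    vx0: float
    vy0: float
    vz0: float
    ax: float
    ay: float
    az: float


def position(p: Pitch, t: float) -> tuple[float, float, float]:
    """Return (x, y, z) in feet, t seconds after the initial point."""
    x = p.x0 + p.vx0 * t + 0.5 * p.ax * t * t
    y = p.y0 + p.vy0 * t + 0.5 * p.ay * t * t
    z = p.z0 + p.vz0 * t + 0.5 * p.az * t * t
    return x, y, z


# Made-up pitch, for illustration only (not real PITCHf/x data).
sample = Pitch(-2.0, 55.0, 6.0, 5.0, -140.0, -5.0, -10.0, 30.0, -15.0)
x, y, z = position(sample, 0.1)
```

Stepping `t` from zero until `y` reaches the front of the plate traces out the dotted trajectory point by point.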

The coordinates in the batter’s box view are x’ and y’, shown in red. The y-axis in the batter’s box view runs along a line from the batter’s head to the pitcher’s approximate release point (the average x value of his pitches at y = 55 feet). The x-axis in the batter’s box view is set perpendicular to this new y-axis.

The origin of the batter’s box view is offset 2.8 feet in the x direction from the origin in PITCHf/x coordinate space. I calculated 2.8 feet from the center of the plate as the approximate location of the batter’s head, based on a video frame capture in Marv White’s presentation at the PITCHf/x Summit. I chose not to offset the origin in the y direction for simplicity, although I also believe this does not introduce any significant inaccuracy. The batter’s head is typically within a foot or so of y=0.

First, I calculated the quantity m, the distance to the baseball, shown by the blue line. This distance m = sqrt ( y^2 + ( x + 2.8 ft)^2 ).

Next, I found the value of the angle alpha. The angle alpha = arctan ( 55 ft / ( x0 + 2.8 ft) ).

The angle (alpha – theta) = arctan ( y / ( x + 2.8 ft) ), which allows us to calculate the angle theta.

The angle theta = arctan ( 55 ft / ( x0 + 2.8 ft) ) – arctan ( y / ( x + 2.8 ft) ).

The batter’s box coordinates x’ and y’ can be found from the angle theta and the distance m. The new y’ = m * cos (theta), and the new x’ = m * sin (theta).
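The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact spreadsheet logic: the function and constant names are hypothetical, and `math.atan2` is swapped in for the plain arctangent. The two agree whenever x + 2.8 ft is positive, and `atan2` avoids a division by zero when it is not.

```python
import math

HEAD_OFFSET_FT = 2.8   # batter's head offset from the center of the plate
RELEASE_Y_FT = 55.0    # y distance at which the release point is taken


def to_batters_box(x, y, x0, offset=HEAD_OFFSET_FT, release_y=RELEASE_Y_FT):
    """Transform PITCHf/x (x, y) into batter's box view (x', y').

    x0 is the pitcher's average x value at y = 55 feet.
    """
    # Distance m from the batter's head to the baseball.
    m = math.hypot(x + offset, y)
    # Angle alpha of the line from the batter's head to the release point.
    alpha = math.atan2(release_y, x0 + offset)
    # Angle theta between that line and the line to the baseball.
    theta = alpha - math.atan2(y, x + offset)
    return m * math.sin(theta), m * math.cos(theta)


# A ball sitting exactly at the release point lies on the new y'-axis.
x_prime, y_prime = to_batters_box(x=-2.0, y=55.0, x0=-2.0)
```

Because the transformation is a shift followed by a rotation, the distance m from the batter's head to the ball is the same in both coordinate systems, which makes a handy sanity check.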

I am happy for you to use my method for batter’s view transformation if you provide attribution in the form of my name and/or a link to this website.

I have two new articles up at the Hardball Times.

The first is a short article on THT Live breaking down Francisco Liriano’s April 13 start against the Kansas City Royals.

The second is an article examining the ways in which ball tracking technologies like PITCHf/x are changing the game and what kinds of analysis are possible with this new data. It’s an expansion on my opening day laundry list of ideas that I posted here.

I posted a brief evaluation of the MLBAM pitch classification algorithm on the THT Live blog. So far I am not impressed with the system, but maybe there is hope for some improvement.

Update 4/11: Dan Fox reports that some improvements have been instituted for the MLB classification system this week. I’m in the process of taking a look at some data for a few other pitchers. This new data set includes a few starts from Thursday, April 10, which I believe should be covered under the improved algorithm that incorporates information about the pitches in a pitcher’s repertoire. I’ll report back if and when I learn something from this study.

I’ve posted an introduction and tutorial on various PITCHf/x topics at MVN.

  1. What is PITCHf/x?
  2. How do I get and use the data?
  3. Where can I find resources?
  4. How do I identify pitch types?
  5. How do I interpret graphs?
  6. Is the data reliable?
  7. Where can I go for further discussion and study?

Once again building on pitch identification work I’ve done for a pitcher, here is Part 2 of the series on Joba Chamberlain. It’s not exactly all I hoped, for reasons I’ll get to in a moment, but there are some interesting things to be learned. This is similar to previous work I’ve done for Josh Beckett and Eric Gagne.

First, let’s look at which pitches Chamberlain uses in various ball-strike counts.

Count       Fastball  Slider  Change  Curve  #Pitches
0-0              71%     24%      1%     4%        79
0-1              58%     26%      3%    13%        38
0-2              41%     59%      0%     0%        17
1-0              88%     12%      0%     0%        34
1-1              65%     32%      3%     0%        31
1-2              39%     61%      0%     0%        23
2-0              89%     11%      0%     0%         9
2-1              65%     29%      0%     6%        17
2-2              21%     68%      0%    11%        19
3-0             100%      0%      0%     0%         2
3-1             100%      0%      0%     0%         3
3-2              50%     50%      0%     0%        10
Ahead            49%     44%      1%     6%        78
Even             62%     33%      2%     4%       129
Behind           79%     20%      0%     1%        75
0 strikes        77%     19%      1%     2%       124
1 strike         63%     28%      2%     7%        89
2 strikes        36%     61%      0%     3%        69
Ball 0-1         65%     30%      1%     4%       222
Ball 2-3         55%     40%      0%     5%        60
All              63%     32%      1%     4%       282

Chamberlain Pitch Mix by Count

Joba Chamberlain definitely relies on his fastball, which is probably not unusual for a power pitcher out of the bullpen, but he throws his slider much more often with two strikes. In a 2-2 count, you can almost expect a slider (68%). I don’t think we have enough data on his use of the curveball to draw conclusions about that. You can compare my data to Josh Kalk’s, although my data set includes Chamberlain’s two divisional series appearances, and Josh’s algorithm classifies all of Chamberlain’s off-speed pitches as sliders, whereas I have identified his curveball and changeup separately.
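For anyone tallying a table like this from their own pitch list, here is a minimal Python sketch. The Ahead/Even/Behind rule is inferred from the pitch totals in the table (3-2 lands in Behind, so the grouping appears to be a simple comparison of balls and strikes). The function names and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def count_state(balls: int, strikes: int) -> str:
    """Classify a ball-strike count from the pitcher's point of view."""
    if strikes > balls:
        return "Ahead"
    if strikes == balls:
        return "Even"
    return "Behind"


def pitch_mix(pitches):
    """Compute the pitch mix for each count state.

    pitches: iterable of (balls, strikes, pitch_type) tuples.
    Returns {count_state: {pitch_type: share of pitches in that state}}.
    """
    tallies = defaultdict(Counter)
    for balls, strikes, pitch_type in pitches:
        tallies[count_state(balls, strikes)][pitch_type] += 1
    return {
        state: {ptype: n / sum(counts.values()) for ptype, n in counts.items()}
        for state, counts in tallies.items()
    }


# Made-up pitch list, for illustration only.
demo = [(0, 1, "Fastball"), (0, 2, "Slider"), (1, 1, "Fastball"), (3, 2, "Fastball")]
mix = pitch_mix(demo)
```

The same tallying works for the per-count rows by keying on the `(balls, strikes)` pair instead of the count state.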

Next, let’s look at the results split up by pitch type and batter handedness.

vs. LHH
Pitch       Ball   CS  Foul  SS  IPO  IPNO  TB  BABIP  SLGBIP  Strk%  Con%
Fastball      38   22    14   3   13     7  10  0.350   0.500    61%   92%
Slider        13    3     5  14    3     0   0  0.000   0.000    66%   36%
Changeup       2    0     0   0    0     0   0      -       -     0%     -
Curveball      5    2     0   2    0     0   0      -       -    44%    0%
Total         58   27    19  19   16     7  10  0.304   0.435    60%   69%

vs. RHH
Pitch       Ball   CS  Foul  SS  IPO  IPNO  TB  BABIP  SLGBIP  Strk%  Con%
Fastball      27   15    18   7    9     4   7  0.308   0.538    66%   82%
Slider        18    8     5  18    4     0   0  0.000   0.000    66%   33%
Changeup       0    0     0   0    1     0   0  0.000   0.000   100%  100%
Curveball      1    1     0   0    0     0   0      -       -    50%     -
Total         46   24    23  25   14     4   7  0.222   0.389    66%   62%

CS=called strike, SS=swinging strike, IPO=in play (out), IPNO=in play (no out), TB=total bases, BABIP=batting average on balls in play (including home runs), SLGBIP=slugging average on balls in play (including home runs). For Strk% all pitches other than balls are counted as strikes. Con% = (Foul+IPO+IPNO)/(Foul+IPO+IPNO+SS).
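As a worked example of these definitions, here is a minimal Python sketch. It assumes the numeric columns run Ball, CS, Foul, SS, IPO, IPNO, TB, an order inferred from the fact that it reproduces the printed rates. The BABIP and SLGBIP formulas are likewise inferred from the numbers rather than stated in the legend, and the function name is hypothetical.

```python
def rate_stats(ball, cs, foul, ss, ipo, ipno, tb):
    """Compute BABIP, SLGBIP, Strk%, and Con% from raw pitch-result counts."""
    in_play = ipo + ipno                 # all balls put in play
    strikes = cs + foul + ss + in_play   # everything other than a ball
    contact = foul + in_play             # swings that touched the ball

    def ratio(num, den):
        # Undefined rates (empty denominators) are reported as None.
        return num / den if den else None

    return {
        "BABIP": ratio(ipno, in_play),
        "SLGBIP": ratio(tb, in_play),
        "Strk%": ratio(strikes, ball + strikes),
        "Con%": ratio(contact, contact + ss),
    }


# First Fastball row of the Chamberlain table above.
fastball = rate_stats(38, 22, 14, 3, 13, 7, 10)
```

For that row the sketch gives BABIP = 7/20 = 0.350, SLGBIP = 10/20 = 0.500, Strk% = 59/97 (about 61%), and Con% = 34/37 (about 92%), matching the printed values.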

The first thing that jumps out is, of course, the results for his slider. Wow! Just wow. In the PITCHf/x games, at least, nobody got a hit off of it, and hardly anybody managed to put it into play or even foul it off. The only real negative would be that it seemed like he had a little trouble throwing his curveball for strikes, but given that he only walked nine men in 27 and 2/3 innings, that doesn’t seem a big concern.

Next let’s look at the strike zone charts showing where Joba Chamberlain locates his pitches against left-handed hitters and right-handed hitters. I’m keeping the same formatting for these charts as I did in the Beckett and Gagne analyses. The strike zone is shown as a box, including one radius of a baseball on each side of the plate, and the top and bottom of the zone are a general average not adjusted per batter in these charts. The location is plotted where the pitch crossed the front of home plate.

Let’s begin with the fastball.

Chamberlain Fastball Results

Chamberlain works mostly on the outer half of the plate with the fastball to lefties, and he’s more in the zone to righties, although he also comes up and in to righties. Batters seem to be able to handle his fastball fairly well, not swinging and missing very often and having pretty good success when they do put the ball in play, similar to what we saw with Josh Beckett’s four-seam fastball. I don’t have a good idea yet how this compares to league-wide numbers for all pitchers’ fastballs or even to a significant number of other hard throwers.

Next, let’s look at the seldom-used curveball and changeup. I’ll present these without comment since there isn’t much data to discuss.

Chamberlain Results for Curveballs

Chamberlain Changeup Results

Finally, let’s move on to what you’ve all been waiting for: the famous Joba slider.

Chamberlain Slider Results

This is where this avenue of inquiry starts to go downhill. After looking at this graph, I wanted to talk about how Chamberlain gets a lot of swings and misses on his slider down and away to righties and down and in to lefties.

But I was bugged by the swinging strike that was recorded nearly at the lefty batter’s foot (x=1.83, z=0.35). Was a hitter so badly fooled by a slider that he swung at one at his shoe top? It’s certainly possible, but if so, I wanted to see it. So I brought up the footage for the game, September 23rd against Toronto, where Chamberlain entered with two on and two out in the 8th inning to face left-handed Adam Lind, trying to preserve a 7-5 Yankees lead. Jumping to the end of the story, Chamberlain throws Lind five straight sliders to strike him out and end the inning.

Unfortunately, however, the pitch locations recorded by PITCHf/x for these pitches were mistakenly attached to the wrong pitches in the Gameday XML data.

Chamberlain Lind PITCHf/x at bat

Chamberlain Lind Actual at bat

The first pitch of the at bat was a belt-high slider just inside that Lind swung at and missed, followed by a second pitch in almost the same location, with the same result. Next, Chamberlain threw two sliders at Lind’s feet; the second of these landed in the dirt. Lind laid off both of those pitches to even the count at 2-2. Finally, Chamberlain threw a slider down and in, labeled pitch #5 in the second graph, which Lind swung at and missed for strike three.

The XML pitch location data for this game seems to have missed the fourth pitch (the one in the dirt) altogether and added an extraneous pitch, labeled #3 in the first graph, that did not occur in the pitch sequence to Lind. Then the order of the other pitches is out of whack, too. The pitch labeled #1 should be #5, #2 should be #1, #4 should be #2, and #5 should be #3.

The conclusion is that, no, Chamberlain did not get Adam Lind to swing at a slider at his shoe tops. He did get him to swing at a pitch down and in that would have been Ball 3 if he had let it go by, and it was an impressive pitching performance by Chamberlain, but unfortunately it calls into question the integrity of our data set.

I don’t have any way to verify the integrity of the rest of the data without watching endless hours of games. That may seem like a worthy endeavor to some, and I can’t argue too strenuously with them, but alas, the rest of my non-baseball life seems to think it has some importance, too.

I don’t intend my notation of this example in any way to disparage the incredible work that MLBAM and Sportvision have done in creating this data set and making it available to us. For free, no less. It’s an incredibly valuable resource, and some errors are to be expected during a season in which the system was being evaluated and debugged.

I just don’t know how prevalent these kinds of errors are and when they might call into question some of my conclusions. I do know that Eric Van spotted a similar error in Josh Beckett’s data from Game 1 of the division series, as detailed in this thread at Sons of Sam Horn, post #88. The PITCHf/x data in question for that game has since been removed from the data set altogether. Eric mentions plotting the human-generated x,y coordinates against the computer-generated PITCHf/x coordinates as a way to spot these errors, but in our case with the Chamberlain-Lind at bat, the human-generated coordinates look screwy to me, too. I haven’t applied Eric’s method to a larger data set, so it may still have merit.

While we’re on this subject, I may as well put in a plug for Josh Kalk’s new PITCHf/x batter-pitcher matchup tool. You can look at the Chamberlain-Lind matchup there for yourself. It doesn’t tell you anything I didn’t show here, but I wanted to make sure all my readers were aware this great tool was available.

Update: Cory Schwartz from MLBAM addresses the PITCHf/x data error here.

Back in August when I was first getting started with PITCHf/x analysis, I took a quick look at a young and highly-touted rookie who had just broken into the big leagues. Sportvision had not yet brought the PITCHf/x system online at his home park, but we had data from one relief appearance this pitcher made on the road, and I used that to get a quick and dirty read on his pitches.

Oddly enough, that quick analysis has been the source of more search engine hits than any other pitcher analysis on my blog so far. That may have something to do with his team, the New York Yankees. I’m starting to feel like I only write about Red Sox or Yankees around here, and today I’m going to continue pandering to the masses with this update on Joba Chamberlain.

When I looked at Chamberlain’s two-inning appearance in August, he was mainly a fastball-slider pitcher with possibly a couple changeups in that outing, and his fastball was hitting the upper 90’s. With a full season’s data, the basic picture remains the same: upper 90’s fastball, hard-breaking slider. But it looks like he’s relying more on a curveball as an off-speed pitch to lefties, and his changeup has hardly been seen since. In addition, we have enough data to look at usage patterns for his different pitch types and the results he gets from each of them, although that may have to wait until a separate article.

Let’s start by identifying his pitch types. Regular readers will know by now that I like to begin this process by graphing pitch speed versus the direction of the spin axis, which determines the direction the pitch will move due to spin. With some Excel help from Tom Tango, I’m going to try this in a bit different format, one that hopefully makes more intuitive sense to the reader as opposed to the hard-core PITCHf/x researcher. I’m putting the data on a polar plot, showing it from the pitcher’s viewpoint, and graphing the direction of the spin force rather than the direction of the spin axis.

Chamberlain Pitch Speed vs. Spin Force Direction

The backspin on a fastball causes it to “rise”, i.e., drop less than a pitch without backspin. The sidespin on a slider makes it break away from a right-handed hitter, and the topspin on a curveball makes it drop more than normal. Hopefully that is clearer from the above polar plot than from the old way I presented this information, which I’ll show below for the sake of comparison (the angles are different).

Chamberlain Pitch Speed vs. Spin Axis Direction

I’d appreciate feedback on whether the new graphing method is easier to understand or any other comments or suggestions you may have.

Joba Chamberlain’s main pitch is a 95-100 mph fastball, delivered roughly from the 1 o’clock position. His fastball has a little bit of cutting action, moving away from a right-handed hitter more than a typical four-seam fastball, but I wouldn’t say he’s throwing a cutter, as such.

His second pitch is a slider with a lot of break, running 84-89 mph. Some of his sliders look almost like very hard curveballs. He throws the slider more to righties (39% of pitches) than to lefties (26%), but he definitely relies on it to both.

After the two changeups we saw in Joba’s August 10th appearance in Cleveland, the PITCHf/x system only recorded one more, thrown on August 24th to retire Placido Polanco on a fly ball to center field. Those three changeups were thrown around 83 mph. There’s not much else to say about changeups with that small sample.

Upon revisiting Joba Chamberlain, I was surprised to find him using an occasional curveball, mostly to lefties. His curve looks pretty typical, running 77-80 mph.

We can also look at how fast the pitches spin.

Chamberlain Pitch Speed vs. Spin Rate

What’s impressive about some of those fastballs is that they almost do actually rise–the spin force imparted by 3200 revolutions per minute is almost enough to keep a 99-mph pitch from dropping at all due to gravity. In fact, by my calculations, the rise due to spin came within one inch of counteracting gravity on eight of Joba’s fastballs. That seems impressive to me. I’ve hardly looked at every hard thrower in baseball, and I know J.J. Putz generates some similar numbers, but I don’t think it’s very common.
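A rough way to frame that comparison is sketched below in Python. It is a constant-acceleration approximation, not necessarily the calculation behind the numbers quoted above. It assumes the fitted vertical acceleration `az` includes gravity, so the spin contribution is `az` plus g; the function name and the sample values are made up for illustration.

```python
G_FT_S2 = 32.174  # standard gravity in ft/s^2


def vertical_terms(az, flight_time):
    """Return (gravity_drop_in, spin_rise_in) over flight_time seconds.

    az: fitted vertical acceleration in ft/s^2, gravity included
        (negative means downward).
    """
    # Drop a spinless pitch would experience from gravity alone, in inches.
    gravity_drop = 0.5 * G_FT_S2 * flight_time ** 2 * 12.0
    # Upward deflection attributable to spin, in inches.
    spin_rise = 0.5 * (az + G_FT_S2) * flight_time ** 2 * 12.0
    return gravity_drop, spin_rise


# Made-up numbers: a fastball with az = -1.5 ft/s^2 and 0.4 s of flight.
drop_in, rise_in = vertical_terms(az=-1.5, flight_time=0.4)
shortfall_in = drop_in - rise_in  # how far spin falls short of gravity
```

In this framing, a pitch "comes within one inch of counteracting gravity" when `shortfall_in` is under 1.0.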

The other thing to note in this graph is the low spin rate of the slider. In truth, the slider spins much faster, but much of the spin is around the direction of travel (like the spin on a nicely-spiraling football) due to how the slider is thrown. We can’t measure that component of the spin, but fortunately, that’s the component of the spin that also has little effect on how the pitch moves. When I talk about spin rate around here, that’s shorthand for the x- and z-components of the spin, that portion of the spin which affects the direction the ball will break. I don’t always mention it, but it bears repeating occasionally. The slower measured “spin” of the slider is often one easy way to differentiate it from a curveball.

Finally, let’s look at the movement on the pitches. This graph shows the movement due to spin (the Magnus force) and gravity, from the perspective of the pitcher.

Chamberlain Pitch Movement with Gravity

Joba Chamberlain’s slider really has amazing break and his fastball has a lot of hop. I can see why he is regarded as a special talent.

With that, I’ll sign off for now. Hopefully I can czech in again soon with the next part of this series, and I’m grateful you gave me a piece of your time.

Note: For those of you who are interested in reproducing this sort of analysis for yourself (or finding errors in my math), you can download the Excel spreadsheet that I used.

Update: You can read Part 2 here.

Building on the pitch identification I did for Josh Beckett, I wanted to dig a little deeper into how he used his pitches and what results he got, similar to how I did with Eric Gagne.

First, let’s look at which pitches Beckett threw in various counts:

Count       4-seam  2-seam  Cutter  Change  Curve  #Pitches
0-0            48%     18%      1%     10%    22%       408
0-1            29%     26%      3%     11%    31%       197
0-2            33%     24%      2%      5%    36%       107
1-0            43%     18%      1%     16%    23%       160
1-1            34%     24%      3%      9%    31%       156
1-2            38%     20%      2%      3%    36%       161
2-0            52%     22%      0%      7%    20%        46
2-1            38%     30%      0%     16%    16%        74
2-2            33%     28%      2%      2%    35%       106
3-0            71%     29%      0%      0%     0%        14
3-1            65%     24%      9%      0%     3%        34
3-2            55%     24%      0%      7%    13%        67
Ahead          33%     24%      2%      7%    34%       465
Even           43%     21%      2%      9%    26%       670
Behind         48%     23%      1%     11%    17%       395
0 strikes      48%     19%      1%     11%    21%       628
1 strike       35%     26%      3%     10%    27%       461
2 strikes      38%     24%      2%      4%    32%       441
Ball 0-1       40%     21%      2%     10%    28%      1189
Ball 2-3       46%     26%      1%      6%    20%       341
All            41%     22%      2%      9%    26%      1527

Beckett Pitch Mix by Count

We can see that he used his curveball more often when he got ahead of hitters, and he leaned more on his four-seam fastball over his two-seam fastball when he got behind in the count. I should mention that I’m including post-season and All-Star game data, which is probably one reason my numbers differ a little from Josh Kalk’s.

Now, let’s look at results by pitch type. Here I’ve split the data up by handedness of the batter.

vs. LHH
Pitch       Ball   CS  Foul  SS  IPO  IPNO  TB  BABIP  SLGBIP  Strk%  Con%
4-seam FB    137   71    79  26   27    16  30  0.372   0.698    62%   82%
2-seam FB     30   20    22   8   21    14  19  0.400   0.543    74%   88%
Cutter         4    1     5   2    1     2   2  0.667   0.667    73%   80%
Changeup      30    5    14  15   13     8  10  0.381   0.476    65%   70%
Curveball     59   45    16  19   16     4   6  0.200   0.300    63%   65%
Total        260  142   136  70   78    44  67  0.361   0.549    64%   79%

vs. RHH
Pitch       Ball   CS  Foul  SS  IPO  IPNO  TB  BABIP  SLGBIP  Strk%  Con%
4-seam FB     85   57    60  18   36    16  29  0.308   0.558    69%   86%
2-seam FB     69   36    51  10   41    15  19  0.268   0.339    69%   91%
Cutter         4    1     1   3    1     1   1  0.500   0.500    64%   50%
Changeup      17   12     7   5    5     4   5  0.444   0.556    66%   76%
Curveball     98   64    23  29   22     6  12  0.214   0.429    60%   64%
Total        273  170   142  65  105    42  66  0.286   0.449    66%   82%

CS=called strike, SS=swinging strike, IPO=in play (out), IPNO=in play (no out), TB=total bases, BABIP=batting average on balls in play (including home runs), SLGBIP=slugging average on balls in play (including home runs). For Strk% all pitches other than balls are counted as strikes. Con% = (Foul+IPO+IPNO)/(Foul+IPO+IPNO+SS).

Next are strike zone charts showing where he locates his pitches against left-handed hitters and right-handed hitters. I’m keeping the same formatting for these charts as I did in the Gagne analysis, but let me know if you have ideas for how I can improve them. The graphics are a little small, but I thought it was more important to contrast the general patterns of lefty versus righty than to see the exact result for a specific pitch.

The strike zone is shown as a box, including one radius of a baseball on each side of the plate, and the top and bottom of the zone are a general average not adjusted per batter in these charts. The location is plotted where the pitch crossed the front of home plate.

Let’s start with the fastballs. First the four-seamer. (As I mentioned in my previous analysis of Beckett, the line between the four-seamer and two-seamer is a hazy one; although I think my distinction is generally accurate, it is unlikely to be accurate for every specific pitch.)

Beckett 4-seam Fastball Strike Zone Chart

Beckett likes to work the 4-seamer away from lefties, and it looks like he gets a lot of foul balls at the edge or just off the edge of the plate. He also gets a lot of balls, mostly outside it looks, and other than the curveball it’s the pitch that gets the fewest strikes, at 62%. Overall, lefties hit the pitch pretty well–a .372 batting average when they put it in play, and with plenty of power. I wish I knew how that compared to other pitchers’ fastballs, but I don’t have those numbers. Clearly, context is important for numbers like these.

Against righties I don’t see a clear inside/outside preference, although he seems to work up in the zone more than down. He’s also more effective at getting strikes with the pitch against righties (69%).

Moving on to the two-seamer…

Beckett 2-seam Fastball Strike Zone Chart

The first thing that jumps out is that he’s almost twice as likely to use the 2-seamer against righties as against lefties. Against lefties, he’s in the zone with it a lot, and it gets hit fairly hard. Against righties, it looks to be his most effective pitch, generating a lot of ground balls when he gets it on the inner half of the plate. Against righties, he got 27 ground outs with his two-seamer compared to 14 outs in the air (pop outs, line outs, and fly outs). He also gave up 7 ground ball hits and 8 hits in the air from his two-seamer against righties. Again, I don’t know if those numbers are significant or how they compare to other pitchers.

Beckett’s least-used pitch is the cutter, so the graphs for it are not terribly interesting, but I’ll show them here.

Beckett Cut Fastball Strike Zone Chart

It looks like he mostly works the cut fastball down and in to lefties and up in the zone to righties, but it’s hard to find any meaningful trends in 26 pitches. He struck out Jay Gibbons on a cutter down and in, and…well, I don’t really have anything more to say about the cutter.

Now for a change of pace…

Beckett Changeup Strike Zone Chart

It’s obvious he likes to keep the changeup down and away from lefties, and he gets a lot of swings and misses that way, particularly when he keeps it down. Against righties, he keeps the ball down but works both sides of the plate. He gets quite a few called strikes on the outer half of the plate.

Finally, we come to Uncle Charlie, Beckett’s other favorite pitch and probably his most effective.

Beckett Curveball Strike Zone Chart

Against lefties, Beckett gets a lot of called strikes across the middle of the zone. He comes down and in a lot, and gets a fair number of swinging strikes when he keeps the curve low to lefties. I’m not sure what to think about the curveballs up and away. I thought those might be hanging breaking balls, but I don’t notice anything unusual when I look at how they moved relative to other curveballs. Maybe he was just hoping to drop those pitches into the top of the strike zone since the location data I’m graphing here was measured at the front of the plate.

Against righties, he’s out of the zone a little more, either down and away or up and in. Again, he gets a lot of called strikes in the zone and swinging strikes when he keeps the curveball down–it’s his most effective pitch for missing bats. Even when hitters put the curveball in play, they don’t have much success–a .208 batting average and a .375 slugging percentage on balls in play, lefties and righties combined.
