At Major League Baseball’s Gameday data website, the PITCHf/x data is included in the inning and pbp/pitchers XML files. What follows is an explanation of the attributes of the pitch element within these XML files.
At stadiums without the PITCHf/x system installed, the pitch element includes only five attributes:
- des: a brief text description of the result of the pitch: Ball; Ball In Dirt; Called Strike; Foul; Foul (Runner Going); Foul Tip; Hit by Pitch; In play, no out; In play, out(s); In play, run(s); Intent Ball; Pitchout; Swinging Strike; Swinging Strike (Blocked).
- id: a unique identification number per pitch within a game. The numbers increment by one for each pitch but are not consecutive between at bats.
- type: a one-letter abbreviation for the result of the pitch: B, ball; S, strike (including fouls); X, in play.
- x, y: the horizontal and vertical location of the pitch as it crossed home plate as input by the Gameday stringer using the old Gameday coordinate system. I’m not sure what units are used or where the origin is located. Note that the y dimension in the old coordinate system is now called the z dimension in the new PITCHf/x coordinate system detailed below.
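As a rough illustration of how these five attributes might be read, here is a minimal Python sketch using the standard library's XML parser. The sample element and its values are invented for illustration, the `atbat` wrapper element is an assumption about how pitches are nested, and `read_pitches` is a hypothetical helper name rather than anything provided by MLB.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute values are invented for illustration only.
SAMPLE = (
    '<atbat>'
    '<pitch des="Called Strike" id="14" type="S" x="101.29" y="142.47"/>'
    '</atbat>'
)

def read_pitches(xml_text):
    """Return one dict per pitch element, converting id, x, and y to numbers."""
    root = ET.fromstring(xml_text)
    pitches = []
    for pitch in root.iter("pitch"):
        row = dict(pitch.attrib)
        row["id"] = int(row["id"])
        for key in ("x", "y"):
            # Assumption: x or y could be missing or empty, so keep None
            # rather than guessing a location.
            value = row.get(key)
            row[key] = float(value) if value else None
        pitches.append(row)
    return pitches

pitches = read_pitches(SAMPLE)
print(pitches[0]["des"], pitches[0]["type"], pitches[0]["x"], pitches[0]["y"])
```

Because `dict(pitch.attrib)` copies every attribute it finds, the same function also picks up the PITCHf/x attributes when they are present, although it leaves them as strings.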
Stadiums with the PITCHf/x camera system have an additional twenty attributes recorded in the pitch element:
- start_speed: the pitch speed, in miles per hour and in three dimensions, measured at the initial point, y0. Of the two speeds, this one is closer to the speed measured by a radar gun and to what we think of as a pitcher’s “velocity”.
- end_speed: the pitch speed measured as it crossed the front of home plate.
- sz_top: the distance in feet from the ground to the top of the current batter’s rulebook strike zone as measured from the video by the PITCHf/x operator. The operator sets a line at the batter’s belt as he settles into the hitting position, and the PITCHf/x software adds four inches up for the top of the zone.
- sz_bot: the distance in feet from the ground to the bottom of the current batter’s rulebook strike zone. The PITCHf/x operator sets a line at the hollow of the knee for the bottom of the zone.
- pfx_x: the horizontal movement, in inches, of the pitch between the release point and home plate, as compared to a theoretical pitch thrown at the same speed with no spin-induced movement. This parameter is measured at y=40 feet regardless of the y0 value.
- pfx_z: the vertical movement, in inches, of the pitch between the release point and home plate, as compared to a theoretical pitch thrown at the same speed with no spin-induced movement. This parameter is measured at y=40 feet regardless of the y0 value.
- px: the left/right distance, in feet, of the pitch from the middle of the plate as it crossed home plate. The PITCHf/x coordinate system is oriented to the catcher’s/umpire’s perspective, with distances to the right being positive and to the left being negative.
- pz: the height of the pitch in feet as it crossed the front of home plate.
- x0: the left/right distance, in feet, of the pitch, measured at the initial point.
- y0: the distance in feet from home plate where the PITCHf/x system is set to measure the initial parameters. This parameter was variously set at 40, 50, or 55 feet (and in a few instances 45 feet) from the plate at different times throughout the 2007 season as Sportvision experimented with optimal settings for the PITCHf/x measurements. Sportvision settled on 50 feet in the second half of 2007, and this value of y0=50 feet has been used since. Changes in this parameter affect the values of all other parameters measured at the initial point, such as start_speed.
- z0: the height, in feet, of the pitch, measured at the initial point.
- vx0, vy0, vz0: the velocity of the pitch, in feet per second, in three dimensions, measured at the initial point.
- ax, ay, az: the acceleration of the pitch, in feet per second per second, in three dimensions, measured at the initial point.
- break_y: the distance in feet from home plate to the point in the pitch trajectory where the pitch achieved its greatest deviation from the straight line path between the release point and the front of home plate.
- break_angle: the angle, in degrees, from vertical to the straight line path from the release point to where the pitch crossed the front of home plate, as seen from the catcher’s/umpire’s perspective.
- break_length: the greatest distance, in inches, between the trajectory of the pitch at any point between the release point and the front of home plate, and the straight line path from the release point to the front of home plate, per the MLB Gameday team. John Walsh’s article “In Search of the Sinker” has a good illustration of this parameter.
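To show how the initial position, velocity, and acceleration attributes relate to start_speed, end_speed, px, and pz, here is a minimal Python sketch. It assumes the acceleration is constant over the flight of the pitch, that the front of home plate sits at y = 17 inches (1.417 feet), and that vy0 is negative because y is the distance from the plate. The input numbers are invented for illustration, and the function names are hypothetical.

```python
import math

FT_PER_S_TO_MPH = 3600.0 / 5280.0  # 1 ft/s is about 0.68 mph
PLATE_FRONT_Y = 17.0 / 12.0        # assumption: front of plate, in feet

def speed_mph(vx, vy, vz):
    """Three-dimensional speed in mph from velocity components in ft/s."""
    return math.sqrt(vx**2 + vy**2 + vz**2) * FT_PER_S_TO_MPH

def time_to_y(y0, vy0, ay, y_target=PLATE_FRONT_Y):
    """Solve y0 + vy0*t + 0.5*ay*t**2 = y_target for the first crossing time."""
    if abs(ay) < 1e-12:
        return (y_target - y0) / vy0
    disc = vy0**2 - 2.0 * ay * (y0 - y_target)
    return (-vy0 - math.sqrt(disc)) / ay

def coordinate_at(p0, v0, a, t):
    """Position along one axis after t seconds of constant acceleration."""
    return p0 + v0 * t + 0.5 * a * t**2

# Hypothetical pitch parameters (feet, ft/s, ft/s^2), not real data.
x0, y0, z0 = -1.5, 50.0, 6.0
vx0, vy0, vz0 = 5.0, -130.0, -5.0
ax, ay, az = -10.0, 28.0, -15.0

t_plate = time_to_y(y0, vy0, ay)
px = coordinate_at(x0, vx0, ax, t_plate)
pz = coordinate_at(z0, vz0, az, t_plate)
start_speed = speed_mph(vx0, vy0, vz0)
end_speed = speed_mph(vx0 + ax * t_plate, vy0 + ay * t_plate, vz0 + az * t_plate)

print(f"flight time {t_plate:.3f} s, px {px:.2f} ft, pz {pz:.2f} ft")
print(f"start_speed {start_speed:.1f} mph, end_speed {end_speed:.1f} mph")
```

The same `coordinate_at` helper can be evaluated at any intermediate time to trace the full trajectory, which is the starting point for quantities such as break_length.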
Three new fields were added to the pitch element for 2008:
- sv_id: a date/time stamp of when the PITCHf/x tracking system first detected the pitch in the air, in the format YYMMDD_hhmmss.
- pitch_type: the most probable pitch type according to a neural net classification algorithm developed by Ross Paul of MLBAM.
- type_confidence: the value of the weight at the classification algorithm’s output node corresponding to the most probable pitch type. This value is multiplied by a factor of 1.5 if the pitch is known by MLBAM to be part of the pitcher’s repertoire.
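The sv_id format described above maps directly onto a standard date/time pattern. The following Python sketch parses it; the sample value is made up, `parse_sv_id` is a hypothetical helper name, and the result is a naive datetime because the time zone of the stamp is not stated here.

```python
from datetime import datetime

def parse_sv_id(sv_id):
    """Parse a YYMMDD_hhmmss stamp into a naive datetime (time zone unknown)."""
    return datetime.strptime(sv_id, "%y%m%d_%H%M%S")

# Made-up example value in the documented format.
stamp = parse_sv_id("080401_193045")
print(stamp.isoformat())
```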
Resources for this glossary included the following:
- Comments, including those by MLB.com’s Director of Stats Cory Schwartz at Tom Tango’s blog THE BOOK.
- The MLB Gameday blog.
- The article “How to Use MLB Gameday Data” by Anthony at Friar Watch.
- Comments to Joe P. Sheehan’s article “Enhanced Gameday”.
Alan Nathan’s freshman physics lectures on the Physics of Baseball at the University of Illinois are also an excellent primer for understanding the calculations surrounding baseball trajectories.
If you need to convert MLB.com’s player IDs into names or Lahman database player IDs, you can consult my list of player IDs.
Another common question is about the meaning of the BRK and PFX numbers reported in the Gameday application. Here’s what I wrote about the subject on another website:
There are three main forces acting on a spinning baseball: gravity, drag, and the spin force (also called the Magnus or lift force).
The drag force mainly acts to slow a pitch down. It doesn’t have much effect on the movement/break of a pitch, except for very, very slowly spinning pitches (i.e., knuckleballs).
The force of gravity is the same on all pitches, but it has a greater effect on the movement of slow pitches because it has longer to act on them before they reach the plate. Curveballs and changeups drop more due to gravity than fastballs do because they are slower pitches.
Finally, the spin force acts differently on fastballs and curveballs, as the Gameday folks described. Because a fastball is thrown with backspin, the spin force pushes the ball up, counteracting to some extent the force of gravity that is pulling the ball down. This makes the fastball trajectory straighter. Because a curveball is thrown with topspin, the spin force pushes the ball down, reinforcing gravity, which is also pulling it down. This makes the curveball drop even more.
Thus, the curveball trajectory has a big bend and the fastball trajectory is relatively straight. The amount of bend in the trajectory is what is being measured by the BRK parameter on Gameday.
The amount of deflection by the spin force is what is being measured by the PFX parameter on Gameday. This PFX deflection is mostly upward for a fastball, meaning that it counteracts roughly 10 or so inches of the drop due to gravity, and the PFX deflection is mostly downward for a curveball, meaning that it adds 6 or so inches of drop on top of that from gravity.
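To put rough numbers on the point that gravity has longer to act on slower pitches, here is a small Python sketch. It ignores drag and spin entirely, assumes a flight distance of 55 feet, and uses illustrative speeds rather than measured ones, so the results are only ballpark figures.

```python
G_FT_PER_S2 = 32.174  # standard gravitational acceleration in ft/s^2

def gravity_drop_inches(speed_mph, distance_ft=55.0):
    """Drop due to gravity alone over the flight, assuming constant speed."""
    speed_fps = speed_mph * 5280.0 / 3600.0
    flight_time = distance_ft / speed_fps  # simplification: no drag slowdown
    return 0.5 * G_FT_PER_S2 * flight_time**2 * 12.0

# Illustrative speeds only.
for label, mph in (("fastball", 92.0), ("curveball", 76.0)):
    print(f"{label} at {mph:.0f} mph: {gravity_drop_inches(mph):.1f} in of gravity drop")
```

Under these assumptions the slower pitch drops well over a foot more from gravity alone, before any spin deflection is added or subtracted.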
Additionally, here is a diagram, adapted from John Walsh, that illustrates the break parameters: