sabermetrics


As he does every year, Tom Tango is compiling the Fans’ Scouting Report. He is seeking help from baseball fans to rate the defensive abilities of the players they have watched this season.

Baseball’s fans are very perceptive. Take a large group of them, and they can pick out the final standings with the best of them. They can forecast the performance of players as well as those guys with rather sophisticated forecasting engines. Bill James, in one of his later Abstracts, had the fans rank the players at each position from best to worst. And they did a darn good job.

There is an enormous amount of untapped knowledge here. There are 70 million fans at MLB parks every year, and a whole lot more watching the games on television. When I was a teenager, I had no problem picking out Tim Wallach as a great fielding 3B, a few years before MLB coaches did so. And, judging by the quantity of non-stop standing ovations Wallach received, I wasn’t the only one in Montreal whose eyes did not deceive him. Rondel White, Marquis Grissom, Larry Walker, Andre Dawson, Hubie Brooks, Ellis Valentine. We don’t need stats to tell us which of these does not belong.

What I would like to do now is tap that pool of talent. I want you to tell me what your eyes see. I want you to tell me how good or bad a fielder is. Go down, and start selecting the team(s) that you watch all the time. For any player that you’ve seen play in at least 10 games in 2009, I want you to judge his performance in 7 specific fielding categories.

If you’ve watched a lot of baseball in 2009, or at least enough to meet the guidelines, please participate in compiling this valuable resource.


My article at Hardball Times on Danny Herrera’s screwball includes views of his pitch trajectories as seen from the right-handed and left-handed batter’s boxes.

I mentioned in the References section that I did some trigonometry to transform the coordinate system from plate view to batter’s box view.

Here is what I did.

The pitch trajectory is shown as the dotted black line. Any point on the trajectory can be calculated using the initial position, velocity, and acceleration provided in the PITCHf/x data, along with the equations of motion. Only the x-y plane is shown above since no transformation was done to the z axis. The coordinates in the PITCHf/x coordinate space are x and y, shown in black.
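As a rough illustration of that calculation, here is a minimal Python sketch of the constant-acceleration equations of motion. The dictionary keys (x0, y0, z0, vx0, vy0, vz0, ax, ay, az) follow the usual PITCHf/x parameter names, but the function names and the sample numbers are made up for illustration and are not taken from my actual code or from a real pitch.

```python
import math

def position_at(pitch, t):
    """Return (x, y, z) in feet at time t (seconds), assuming constant acceleration."""
    return tuple(
        pitch[p] + pitch[v] * t + 0.5 * pitch[a] * t * t
        for p, v, a in (("x0", "vx0", "ax"), ("y0", "vy0", "ay"), ("z0", "vz0", "az"))
    )

def time_at_y(pitch, y):
    """Return the time at which the ball is y feet from the plate.

    Assumes vy0 < 0 (the ball is moving toward the plate). A negative
    result means the point lies before the initial position, e.g. when
    extrapolating back from y0 = 50 ft to y = 55 ft.
    """
    y0, vy0, ay = pitch["y0"], pitch["vy0"], pitch["ay"]
    if ay == 0:
        return (y - y0) / vy0
    # Solve 0.5*ay*t^2 + vy0*t + (y0 - y) = 0 for the root on the pitch's path.
    disc = vy0 * vy0 - 2.0 * ay * (y0 - y)
    return (-vy0 - math.sqrt(disc)) / ay

# Hypothetical sample pitch (feet, feet/second, feet/second^2).
sample_pitch = {
    "x0": -2.0, "y0": 50.0, "z0": 6.0,
    "vx0": 5.0, "vy0": -130.0, "vz0": -5.0,
    "ax": -10.0, "ay": 30.0, "az": -20.0,
}

t_front = time_at_y(sample_pitch, 1.417)   # front of home plate
x, y, z = position_at(sample_pitch, t_front)
```

Stepping y from 55 feet down to the plate and calling these two functions at each step gives the points along the dotted trajectory.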

The coordinates in the batter’s box view are x’ and y’, shown in red. The y-axis in the batter’s box view runs along a line from the batter’s head to the pitcher’s approximate release point (the average x value of his pitches at y = 55 feet). The x-axis in the batter’s box view is set perpendicular to this new y-axis.

The origin of the batter’s box view is offset 2.8 feet in the x direction from the origin in PITCHf/x coordinate space. I calculated 2.8 feet from the center of the plate as the approximate location of the batter’s head, based on a video frame capture in Marv White’s presentation at the PITCHf/x Summit. I chose not to offset the origin in the y direction for simplicity, although I also believe this does not introduce any significant inaccuracy. The batter’s head is typically within a foot or so of y=0.

First, I calculated the quantity m, the distance to the baseball, shown by the blue line. This distance m = sqrt ( y^2 + ( x + 2.8 ft)^2 ).

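Next, I found the value of the angle alpha, the angle between the PITCHf/x x-axis and the line from the batter’s head to the release point. The angle alpha = arctan ( 55 ft / ( x0 + 2.8 ft) ), where x0 is the pitcher’s approximate release point in x (the average x value of his pitches at y = 55 feet).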

The angle (alpha – theta) = arctan ( y / ( x + 2.8 ft) ), which allows us to calculate the angle theta.

The angle theta = arctan ( 55 ft / ( x0 + 2.8 ft) ) – arctan ( y / ( x + 2.8 ft) ).

The batter’s box coordinates x’ and y’ can be found from the angle theta and the distance m. The new y’ = m * cos (theta), and the new x’ = m * sin (theta).
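For anyone who wants to try this, the steps above can be sketched in a few lines of Python. The function and parameter names are invented for this illustration. One deliberate difference from the formulas as written: the sketch uses atan2 in place of arctan of a ratio, which gives the same angle whenever x + 2.8 ft is positive and avoids a division by zero or a quadrant flip when it is not.

```python
import math

HEAD_OFFSET_FT = 2.8   # batter's head, offset in x from the center of the plate
RELEASE_Y_FT = 55.0    # distance at which the release point x0 is averaged

def to_batters_box_view(x, y, release_x,
                        head_offset=HEAD_OFFSET_FT, release_y=RELEASE_Y_FT):
    """Convert a PITCHf/x (x, y) point to batter's box view (x', y').

    release_x is x0, the pitcher's average x value at y = 55 feet.
    """
    dx = x + head_offset
    # Distance m from the batter's head to the baseball.
    m = math.hypot(dx, y)
    # alpha: angle of the line from the batter's head to the release point.
    alpha = math.atan2(release_y, release_x + head_offset)
    # (alpha - theta): angle of the line from the batter's head to the baseball.
    alpha_minus_theta = math.atan2(y, dx)
    theta = alpha - alpha_minus_theta
    # New coordinates: x' = m * sin(theta), y' = m * cos(theta).
    return m * math.sin(theta), m * math.cos(theta)

# A ball sitting exactly at the release point lies on the new y'-axis.
x_prime, y_prime = to_batters_box_view(-1.5, 55.0, release_x=-1.5)
```

The head_offset parameter is exposed so that a different head location can be tried. Which sign or value to use for the other batter’s box is left as an assumption for the reader, since only the 2.8 feet figure is given above.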

I am happy for you to use my method for batter’s view transformation if you provide attribution in the form of my name and/or a link to this website.

I attended the First Annual PITCHf/x Summit hosted by Sportvision and MLBAM in San Francisco May 10-11. My report on the summit is up at The Hardball Times. You can also get a couple of other viewpoints from Ike Hall and Harry Pavlidis, as well as the discussion thread at the Book blog.

I apologize to Harry for inadvertently leaving mention of his presentation out of my report at THT.  Harry had a good report on all the amazing work that is being done by amateur PITCHf/x analysts like us.

I have two new articles up at the Hardball Times.

The first is a short article on THT Live breaking down Francisco Liriano’s April 13 start against the Kansas City Royals.

The second is an article examining the ways in which ball tracking technologies like PITCHf/x are changing the game and what kinds of analysis are possible with this new data. It’s an expansion on my opening day laundry list of ideas that I posted here.

I just finished an analysis of Oakland A’s outfielder/DH Jack Cust, and it was published at The Hardball Times this morning.

I thought it might be helpful for me to let you know what else I’m working on at the current time, what things I’m considering working on in the future, and to mention problems that I consider generally important for PITCHf/x research to tackle in the near future, whether or not I plan to work on them personally.

I would appreciate input, both on what you consider important for baseball research in general and on what you’d most like to see me do. Maybe out of a discussion, we can jointly develop a set of PITCHf/x Hilbert problems.

What’s on my plate now

  • Updating the catalog of PITCHf/x-related articles. I’ve fallen seriously behind on this in 2008, mainly because it’s becoming too big for me to handle in its current format. Which leads to the second item…
  • Transferring the catalog of PITCHf/x-related articles into a database. This should improve searchability, portability, and timeliness.
  • Implementing an improved Gameday data spider using Wget for the 2008 season.
  • Developing and/or consolidating data parsers that I developed last season into a more integrated and efficient whole.
  • Tinkering, as always, with the data from various players trying to learn something about the player or understand something about the PITCHf/x data set. This occasionally turns into an in-depth article on a player.

Things I might do in the future

  • Adding baserunner state information to my PITCHf/x database.
  • Investigating why some pitchers are home-run prone and others are not. I’m particularly interested in why some flyball pitchers are less home-run prone than others.
  • Investigating what we can learn from release point data for pitchers.
  • Attempting to identify hanging breaking balls.
  • Systematizing pitcher information on a broader scale.

Other questions for PITCHf/x research

Evaluating pitching:

  • Integrating information about mechanics and delivery with information from PITCHf/x about release point and trajectory of pitches and consistency of same.
  • Determining the run value of a fastball as a function of speed.
  • Measuring fastball speed, curveball spin, etc., as a function of fatigue. We would need a good measure of fatigue.
  • Improving pitch classification methods and terminology.

Evaluating hitting:

  • Collecting speed-off-bat information, a.k.a. Hit f/x. This is an area that Sportvision is investigating. It could potentially also have very valuable applications for fielding.

Improving data integrity:

  • Data correction for park and weather variations and camera distortion.
  • Identifying and eliminating spurious pitch data.

Evaluating catcher defense:

  • Pitch blocking: Dan Turkenkopf has made a great start at evaluating catchers’ ability to block pitches in the dirt. There’s much more that could be done.
  • Game calling: do some catchers have patterns and preferences for the pitches they call that we can distinguish from the pitcher’s patterns and preferences? Can we use a Tango WOWY approach to determine this?
  • Stolen bases: Do pitch speeds and types affect stolen base success rates? How does this affect particular catchers and pitchers? Does the speed or location of a pitchout affect its chance of success?

At this point, this is not a comprehensive list, but it would be nice if we could develop something like that.