I just finished an analysis of Oakland A’s outfielder/DH Jack Cust, and it was published at The Hardball Times this morning.
I thought it might be helpful to share what I’m working on now, what I’m considering working on in the future, and which problems I consider important for PITCHf/x research to tackle soon, whether or not I plan to work on them personally.
I would appreciate input, both on what you consider important for baseball research in general and on what you’d most like to see me do. Maybe out of a discussion, we can jointly develop a set of PITCHf/x Hilbert problems.
What’s on my plate now
- Updating the catalog of PITCHf/x-related articles. I’ve fallen seriously behind on this in 2008, mainly because it’s becoming too big for me to handle in its current format. Which leads to the second item…
- Transferring the catalog of PITCHf/x-related articles into a database. This should improve searchability, portability, and timeliness.
- Implementing an improved Gameday data spider using Wget for the 2008 season.
- Developing new data parsers and/or consolidating the ones I wrote last season into a more integrated and efficient whole.
- Tinkering, as always, with the data from various players, trying to learn something about the player or understand something about the PITCHf/x data set. This occasionally turns into an in-depth article on a player.
Things I might do in the future
- Adding baserunner state information to my PITCHf/x database.
- Investigating why some pitchers are home-run prone and others are not. I’m particularly interested in why some flyball pitchers are less home-run prone than others.
- Investigating what we can learn from release point data for pitchers.
- Attempting to identify hanging breaking balls.
- Systematizing pitcher information on a broader scale.
Other questions for PITCHf/x research
- Integrating information about mechanics and delivery with PITCHf/x information about the release point and trajectory of pitches, and the consistency of both.
- Determining the run value of a fastball as a function of speed.
- Measuring fastball speed, curveball spin, etc., as a function of fatigue. We would need a good measure of fatigue.
- Improving pitch classification methods and terminology.
- Collecting speed-off-bat information, a.k.a. Hit f/x. This is an area that Sportvision is investigating. It could potentially also have very valuable applications for fielding.
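To make the fastball run-value question above a little more concrete, here is a minimal sketch of one way it could be approached: group fastballs into speed bins and average a per-pitch run value within each bin. The function name, the `(speed_mph, run_value)` input format, and the 2 mph bin width are all assumptions for illustration, and the run value for each pitch would have to come from a separate run-expectancy calculation that is not shown here.

```python
from collections import defaultdict


def run_value_by_speed(pitches, bin_width=2.0):
    """Average run value per speed bin.

    pitches: iterable of (speed_mph, run_value) pairs. Both the input
    format and the run_value field are hypothetical; run_value is
    assumed to be computed elsewhere.
    Returns a dict mapping each bin's lower edge (mph) to the mean
    run value of the pitches in that bin.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bin start -> [sum, count]
    for speed, run_value in pitches:
        # Floor the speed to the lower edge of its bin.
        bin_start = (speed // bin_width) * bin_width
        totals[bin_start][0] += run_value
        totals[bin_start][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}


# Made-up numbers, purely to show the call shape.
sample = [(91.2, 0.02), (90.5, -0.04), (95.1, -0.06)]
averages = run_value_by_speed(sample)
```

A real study would also need to control for pitcher, count, and location, since simple binning cannot separate speed from everything that tends to come with it.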
Improving data integrity:
- Data correction for park and weather variations and camera distortion.
- Identifying and eliminating spurious pitch data.
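As one possible starting point for spotting spurious pitch data, here is a minimal sketch of a range check on a single pitch record. It assumes each pitch has already been parsed into a dict of floats keyed by the Gameday field names `start_speed`, `px`, and `pz`. The function name and the threshold values are illustrative guesses, not validated cutoffs.

```python
def is_plausible(pitch,
                 speed_range=(50.0, 105.0),  # mph, assumed bounds
                 px_range=(-5.0, 5.0),       # feet from plate center, assumed
                 pz_range=(-2.0, 10.0)):     # feet above ground, assumed
    """Return True if every checked field is present and within range.

    pitch: dict of already-parsed floats. A missing field or an
    out-of-range value marks the pitch as suspect.
    """
    checks = (("start_speed", speed_range),
              ("px", px_range),
              ("pz", pz_range))
    for field, (low, high) in checks:
        value = pitch.get(field)
        if value is None or not (low <= value <= high):
            return False
    return True


# Made-up records, purely to show the call shape.
good = {"start_speed": 92.3, "px": 0.4, "pz": 2.5}
bad = {"start_speed": 0.0, "px": 0.4, "pz": 2.5}
flags = [is_plausible(good), is_plausible(bad)]
```

A check like this only catches gross errors. Subtler problems, such as a slightly miscalibrated camera, would need comparison against a pitcher's own typical values.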
Evaluating catcher defense:
- Pitch blocking: Dan Turkenkopf has made a great start at evaluating catchers’ ability to block pitches in the dirt. There’s much more that could be done.
- Game calling: do some catchers have patterns and preferences for the pitches they call that we can distinguish from the pitcher’s patterns and preferences? Can we use a Tango WOWY approach to determine this?
- Stolen bases: Do pitch speeds and types affect stolen base success rates? How does this affect particular catchers and pitchers? Does the speed or location of a pitchout affect its chance of success?
This is not yet a comprehensive list, but it would be nice if we could develop one.