In March I did an in-depth study of Jack Cust’s surprising 2007 season. Recently I’ve been wondering why he has struggled so mightily in 2008, so I updated the study and published the results at The Hardball Times.

Edit: I posted to THT Live about Cust’s performance over the last couple of days. I don’t intend to imply that I can divine the end of a player’s slump or the beginning of a hot streak. It’s more a case of me musing out loud and trying to learn how the PITCHf/x tools fit into the scouting/performance picture.

Ike Hall has a really good post on data corrections at his new blog.

I’m still reading through it, but he now has concrete data confirming what I have believed all along from the pitcher data I’ve seen: a uniform correction factor for each park for the whole 2007 season does not adequately address the real source(s) of error in the data.

I was interviewed about PITCHf/x by Will Carroll of Baseball Prospectus Radio.  You can listen to the interview on the BPR site.

I posted an analysis of Jose Valverde’s early-season troubles on THT Live yesterday.

Also, Dan Brooks (Jnai) has been doing some very good work with PITCHf/x over at the Sons of Sam Horn discussion board. His PITCHf/x wiki is well worth checking out.

As part of the discussion of Dan’s work at the Book blog, I made a chart of pitch speed vs. spin deflection angle for a typical right-handed pitcher.

Typical RHP speed vs. spin deflection angle
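
For readers who want to build a similar chart from their own data, here is a minimal sketch of one way to compute a spin deflection angle. It assumes the Gameday `pfx_x` and `pfx_z` fields (in inches) and an angle convention of my choosing for this example (0 degrees along the +x axis, 90 degrees straight up). The chart above may use a different convention, and the sample pitches are hypothetical.

```python
import math

def spin_deflection_angle(pfx_x, pfx_z):
    """Direction of the spin deflection in degrees, normalized to 0-360.

    Assumed convention (not taken from the chart): 0 degrees points along
    the +x axis and 90 degrees is straight up (pure "rise").
    """
    return math.degrees(math.atan2(pfx_z, pfx_x)) % 360.0

# Hypothetical pitches: (start_speed in mph, pfx_x in inches, pfx_z in inches).
pitches = [
    (92.5, -6.0, 9.5),
    (84.0, 3.0, 1.5),
    (77.5, 5.5, -6.0),
]

# Print one (speed, angle) pair per pitch; these are the two chart axes.
for speed, pfx_x, pfx_z in pitches:
    angle = spin_deflection_angle(pfx_x, pfx_z)
    print(f"{speed:5.1f} mph  {angle:6.1f} deg")
```

Plotting speed against this angle for each pitch would give a scatter plot of the same general kind.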

We are like dwarfs sitting on the shoulders of giants. We see more, and things that are more distant, than they did, not because our sight is superior or because we are taller than they, but because they raise us up, and by their great stature add to ours. –John of Salisbury

The two men who inspired me to take up PITCHf/x analysis have both now moved on to work for major league clubs. My hearty congratulations go out to Dan Fox, the new Director of Baseball Systems Development for the Pittsburgh Pirates. The Pirates are getting a good man.

Previously, Joe P. Sheehan took an internship with an undisclosed major league club. I am excited for him and fully expect to see his name in bigger roles in the not-too-distant future.

I quote John of Salisbury not to claim superiority for my work over Dan’s or Joe’s. Rather, to the extent that I have been successful in uncovering new territory, a great deal of credit goes to Joe and Dan for inspiring me and for laying the groundwork in the nascent PITCHf/x field. I appreciate both of them, and I’m encouraged to see their work being recognized by major league clubs.

I have two new articles up at the Hardball Times.

The first is a short article on THT Live breaking down Francisco Liriano’s April 13 start against the Kansas City Royals.

The second is an article examining the ways in which ball tracking technologies like PITCHf/x are changing the game and what kinds of analysis are possible with this new data. It’s an expansion on my opening day laundry list of ideas that I posted here.

I posted a brief evaluation of the MLBAM pitch classification algorithm on the THT Live blog. So far I am not impressed with the system, but maybe there is hope for some improvement.

Update 4/11: Dan Fox reports that some improvements have been instituted for the MLB classification system this week. I’m in the process of taking a look at some data for a few other pitchers. This new data set includes a few starts from Thursday, April 10, which I believe should be covered under the improved algorithm that incorporates information about the pitches in a pitcher’s repertoire. I’ll report back if and when I learn something from this study.

I posted a short article at the Hardball Times examining Johnny Cueto’s pitching repertoire from his outstanding debut game on April 3.

I just finished an analysis of Oakland A’s outfielder/DH Jack Cust, and it was published at The Hardball Times this morning.

I thought it might be helpful to let you know what I’m working on now, what I’m considering working on in the future, and which problems I consider important for PITCHf/x research to tackle soon, whether or not I plan to work on them personally.

I would appreciate input, both on what you consider important for baseball research in general and on what you’d most like to see me do. Maybe out of a discussion, we can jointly develop a set of PITCHf/x Hilbert problems.

What’s on my plate now

  • Updating the catalog of PITCHf/x-related articles. I’ve fallen seriously behind on this in 2008, mainly because it’s becoming too big for me to handle in its current format. Which leads to the second item…
  • Transferring the catalog of PITCHf/x-related articles into a database. This should improve searchability, portability, and timeliness.
  • Implementing an improved Gameday data spider using Wget for the 2008 season.
  • Consolidating the data parsers I developed last season, and developing new ones where needed, into a more integrated and efficient whole.
  • Tinkering, as always, with the data from various players, trying to learn something about the player or understand something about the PITCHf/x data set. This occasionally turns into an in-depth article on a player.

Things I might do in the future

  • Adding baserunner state information to my PITCHf/x database.
  • Investigating why some pitchers are home-run prone and others are not. I’m particularly interested in why some flyball pitchers are less home-run prone than others.
  • Investigating what we can learn from release point data for pitchers.
  • Attempting to identify hanging breaking balls.
  • Systematizing pitcher information on a broader scale.

Other questions for PITCHf/x research

Evaluating pitching:

  • Integrating information about mechanics and delivery with information from PITCHf/x about release point and trajectory of pitches and consistency of same.
  • Determining the run value of a fastball as a function of speed.
  • Measuring fastball speed, curveball spin, etc., as a function of fatigue. We would need a good measure of fatigue.
  • Improving pitch classification methods and terminology.
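
As a rough illustration of the run-value item above, the sketch below bins fastballs by speed and averages a per-pitch run value within each bin. It assumes a run value has already been assigned to every pitch by some other method, which is the hard part and is not shown. The function name, the bin width, and the sample numbers are all hypothetical.

```python
import math
from collections import defaultdict

def mean_run_value_by_speed(pitches, bin_width=1.0):
    """Average run value per speed bin.

    `pitches` is an iterable of (speed_mph, run_value) pairs. The run
    values are an assumed input; this sketch does not derive them.
    Returns {bin_start_mph: mean_run_value}, ordered by speed.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for speed, run_value in pitches:
        # Lower edge of the bin that contains this pitch speed.
        bin_start = math.floor(speed / bin_width) * bin_width
        sums[bin_start] += run_value
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Hypothetical fastballs: (speed in mph, run value of the pitch).
sample = [(90.2, 0.10), (90.8, -0.10), (95.5, -0.20)]
for bin_start, mean_rv in mean_run_value_by_speed(sample).items():
    print(f"{bin_start:.0f} mph bin: {mean_rv:+.3f} runs per pitch")
```

A real study would also need to control for pitcher quality, count, and location, since raw bin averages mix all of those together.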

Evaluating hitting:

  • Collecting speed-off-bat information, a.k.a. Hit f/x. This is an area that Sportvision is investigating. It could potentially also have very valuable applications for fielding.

Improving data integrity:

  • Data correction for park and weather variations and camera distortion.
  • Identifying and eliminating spurious pitch data.
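
A first pass at flagging spurious pitch data could be a simple plausibility filter, sketched below. It assumes each pitch is a dict of Gameday attributes (`start_speed` in mph, `px` and `pz` in feet) stored as strings, as they appear in the XML. The thresholds are guesses chosen for illustration, not validated cutoffs, and the function name is hypothetical.

```python
def is_plausible(pitch, speed_range=(50.0, 105.0), px_limit=5.0,
                 pz_range=(-2.0, 8.0)):
    """Return True if the pitch passes basic sanity checks.

    All thresholds are illustrative assumptions. A pitch with a missing
    or non-numeric field is treated as implausible.
    """
    try:
        speed = float(pitch["start_speed"])
        px = float(pitch["px"])
        pz = float(pitch["pz"])
    except (KeyError, TypeError, ValueError):
        return False
    return (speed_range[0] <= speed <= speed_range[1]
            and abs(px) <= px_limit
            and pz_range[0] <= pz <= pz_range[1])

# Hypothetical records: one ordinary pitch, one with a bad speed reading,
# and one that is missing its vertical location.
records = [
    {"start_speed": "91.3", "px": "0.4", "pz": "2.5"},
    {"start_speed": "12.0", "px": "0.4", "pz": "2.5"},
    {"start_speed": "88.1", "px": "-0.9"},
]
clean = [p for p in records if is_plausible(p)]
print(f"kept {len(clean)} of {len(records)} pitches")
```

Range checks like these only catch gross errors. Subtler problems, such as a camera drifting out of calibration, would need the kind of park-by-park correction work mentioned above.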

Evaluating catcher defense:

  • Pitch blocking: Dan Turkenkopf has made a great start at evaluating catchers’ ability to block pitches in the dirt. There’s much more that could be done.
  • Game calling: do some catchers have patterns and preferences for the pitches they call that we can distinguish from the pitcher’s patterns and preferences? Can we use a Tango WOWY approach to determine this?
  • Stolen bases: Do pitch speeds and types affect stolen base success rates? How does this affect particular catchers and pitchers? Does the speed or location of a pitchout affect its chance of success?

This is not yet a comprehensive list, but it would be nice if we could develop one.

I’m sad to say that I am leaving the Statistically Speaking blog at MVN and excited to let you know that I will be joining the team of writers at The Hardball Times.  I haven’t written any articles yet for THT, but I’ll let you know when I do.
