New repository: Day-by-days and splits from Retrosheet

Some of you may already make use of our git repository containing all of the files Retrosheet releases (along with any announced patches in between official releases), at http://github.com/chadwickbureau/retrosheet.

Today we have added a new repository which provides another view on this data: retrosplits (http://github.com/chadwickbureau/retrosplits).

This repository contains a number of reports aggregating performance at various levels. For many uses, there is no need to dig down into the play-level data itself if all one wants are summary totals.

Currently the repository has two directories:

  • daybyday, which has game-by-game statistics for players and teams. This incorporates all games Retrosheet has published, in either full play-by-play format or in the boxscore file format.
  • splits, which, as the name suggests, has compilations of splits that can be obtained only from play-level data. These include:
    • Batter and pitcher platoon splits (vsL/vsR)
    • Batter and pitcher performance by baserunner situation
    • Batter performance by defensive position
    • Head-to-head performance by batter-pitcher matchup
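
As a rough illustration of the kind of summary total one might build from game-by-game files like those in daybyday, here is a minimal Python sketch. The column names (player_id, game_date, at_bats, hits) and the sample rows are invented placeholders, not the actual layout or contents of the repository's files, so check the real file headers before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a day-by-day file: the column names and rows
# below are invented for illustration and are not taken from retrosplits.
SAMPLE = """player_id,game_date,at_bats,hits
playerA,2015-04-06,4,2
playerA,2015-04-07,3,0
playerB,2015-04-06,5,1
"""

def season_totals(lines, key="player_id", stats=("at_bats", "hits")):
    """Sum the given numeric columns over all game rows for each key."""
    totals = defaultdict(lambda: dict.fromkeys(stats, 0))
    for row in csv.DictReader(lines):
        for stat in stats:
            totals[row[key]][stat] += int(row[stat])
    return dict(totals)

totals = season_totals(io.StringIO(SAMPLE))
for player, line in sorted(totals.items()):
    print(player, line["at_bats"], line["hits"])
```

To run the same function over a file on disk, pass an open file handle in place of the io.StringIO object, and adjust key and stats to match whatever columns the file really has.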

We are making this available under the Open Database License in the hope it will be useful. Unless one is doing very detailed work with data at the play or pitch level, these files may save the time and faff of installing and running the processing tools (whether DiamondWare or Chadwick).

Of course, we cannot close without once again thanking Retrosheet for making the raw data available, without which these files would not be possible.