Category Archives: data

Baseball Databank version 2017.1 released

We have just tagged the first 2017 release (version 2017.1) of Baseball Databank.

You can download the CSV files (in either ZIP or .tar.gz format) at

We’d like to encourage downstream users of the data to let us know about their projects, so we can include links on the webpages and social media!


Chadwick 0.6.5 and Retrosheet repository updates

We have several bits of news on releases to share.

Chadwick 0.6.5 released.  We have updated the Chadwick processing tools to support the December Retrosheet release.  This is a minor release primarily to accommodate a few new modifier flags.  Visit the Chadwick site for download links.

Retrosheet file repository updated.  We have updated our git repository of all Retrosheet files with the new December release.  You can use this repository to fetch all of the Retrosheet files in one swoop.  It also allows you to navigate through the history of the last few years’ worth of Retrosheet releases, so you can see the changes made from release to release. (You’ll see just how busy the Retrosheet team are between each release!)

Retrosheet splits repository updated.  This summer we launched a repository of data derived from Retrosheet files.  It contains day-by-day records for players and teams, drawn from both the boxscore files and the play-by-play files.  From 1974 forward, it also includes various situational splits and batter-pitcher head-to-head records, which can only be obtained from the play-by-play files (and can’t be derived from the day-by-days).

Thanks as always to the Retrosheet crew for making all of this possible.



New repository: Day-by-days and splits from Retrosheet

Some of you may already make use of our git repository containing all of the files Retrosheet releases (along with any announced patches in between official releases), at

Today we have added a new repository which provides another view on this data: retrosplits (

This repository contains a number of reports aggregating performance at various levels. For many uses there is no need to dig down into the play-level data itself, if all one wants are various summary totals.

Currently the repository has two directories:

  • daybyday, which has game-by-game statistics for players and teams. This incorporates all games Retrosheet has published, in either full play-by-play format or in the boxscore file format.
  • splits, which has, as the name suggests, compilations of splits which can only be obtained from play-level data.  These include:
    • Batter and pitcher platoon splits (vsL/vsR)
    • Batter and pitcher performance by baserunner situation
    • Batter performance by defensive position
    • Head-to-head performance by batter-pitcher matchup.
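As a rough illustration of the kind of summary the daybyday files make easy, here is a minimal Python sketch that rolls game-by-game rows up into season totals. The column names (player_id, date, ab, h) and the sample rows are made up for the example and are not the repository's actual schema; check the file headers before adapting it.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows standing in for a day-by-day file. The column names
# (player_id, date, ab, h) are hypothetical, not the repository's real schema.
SAMPLE_DAYBYDAY = """\
player_id,date,ab,h
playa001,2014-04-01,4,2
playa001,2014-04-02,3,0
playb002,2014-04-01,5,1
playa001,2015-04-06,4,1
"""


def season_totals(csv_text):
    """Sum at-bats and hits per (player, season) from day-by-day rows."""
    totals = defaultdict(lambda: {"ab": 0, "h": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        season = row["date"][:4]  # year prefix of an ISO-formatted date
        key = (row["player_id"], season)
        totals[key]["ab"] += int(row["ab"])
        totals[key]["h"] += int(row["h"])
    return dict(totals)


totals = season_totals(SAMPLE_DAYBYDAY)
for (player, season), line in sorted(totals.items()):
    print(player, season, line["ab"], line["h"])
```

To run it on a real file, read the file's text and pass it in place of the sample string, after swapping in the real column names.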

We are making this available under the Open Database License in the hope it will be useful. Certainly, unless one is doing very detailed work with data at the play or pitch level, these files may save the time and faff of installing and running the processing tools (whether DiamondWare or Chadwick).

Of course, we cannot close without once again thanking Retrosheet for making the raw data available, without which these files would not be possible.

Chadwick Persons Register, release 2015-04-05

A new release of the Chadwick Persons Register is now available, just in time for MLB’s Opening Day.

Not too many major new goodies this time around; just the steady march of revised and expanded data.

Chadwick Persons Register, release 2015-01-30

A new release of the Chadwick Persons Register is now available.

In addition to the usual additions and demographic updates for historical players, this release confirms the Retrosheet IDs issued to players who debuted in 2014.

We expect there will likely be one more iteration of the public register before or right around Opening Day.

Chadwick Persons Register updated 2014-11-03

With the World Series closing out the (North American summer) seasons, we have just posted a new version of the Chadwick Persons Register.

This includes all players to appear in North American affiliated leagues, North American independent leagues, NPB, and the KBO in 2014, as well as identifier cross-references where known.

It also includes provisional baseball-databank IDs for players who made their MLB debuts in 2014. We will post an update over the off-season once Retrosheet identifiers have been confirmed for the debutants.
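For anyone wanting to use the identifier cross-references, a minimal Python sketch of building a lookup from one ID system to another might look like the following. The column names (key_person, key_retro, key_mlbam) and all sample rows are assumptions for illustration only; confirm the actual header of the register file before relying on them. Rows with a missing identifier on either side are skipped, since not every cross-reference is known.

```python
import csv
import io

# Made-up sample rows standing in for the register. Column names and IDs
# are assumptions for illustration, not copied from the real file.
SAMPLE_REGISTER = """\
key_person,key_retro,key_mlbam,name_last,name_first
a1b2c3d4,doexj001,100001,Doe,John
e5f6a7b8,,100002,Roe,Rick
c9d0e1f2,smitj002,,Smith,Jane
"""


def build_xref(csv_text, from_col, to_col):
    """Map IDs in from_col to IDs in to_col, skipping rows missing either."""
    xref = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        src, dst = row[from_col], row[to_col]
        if src and dst:  # blank cell means the cross-reference is unknown
            xref[src] = dst
    return xref


retro_to_mlbam = build_xref(SAMPLE_REGISTER, "key_retro", "key_mlbam")
print(retro_to_mlbam)
```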

As always, enjoy!