We have just pushed up the source code for version 0.7.0 of the Chadwick tools. You can get it from https://github.com/chadwickbureau/chadwick/releases/tag/v0.7.0.
The 0.7.x series is a development series of releases that adds a few new features. In this release, we debut a new command-line tool, cwdaily. This tool generates game-by-game statistics for players using the same comma-separated conventions as the traditional tools like cwevent and cwgame. This fills a gap in the toolset: although it’s possible to extract most of this information using the cwbox tool, doing so requires post-processing the XML files output by cwbox. That means extra coding, and it also makes the process much slower.
The cwdaily tool is fast: on my MacBook Pro, generating the day-by-days for the 2017 season from Retrosheet takes just under 2 seconds. That speed also makes it well suited to generating stat totals for a season.
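As a rough illustration of turning game-by-game rows into season totals, here is a minimal Python sketch. It assumes comma-separated input with a header row. The column names (player_id, game_date, AB, H, HR) and the sample rows are invented for the example and are not cwdaily's actual field names.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the spirit of game-by-game output. The column
# names here are hypothetical, not the actual cwdaily field names.
SAMPLE = """player_id,game_date,AB,H,HR
playa001,2017-04-03,4,2,1
playa001,2017-04-04,3,0,0
playb002,2017-04-03,5,3,0
"""


def season_totals(lines, key="player_id", stats=("AB", "H", "HR")):
    """Sum the named counting stats over all rows sharing the same key."""
    totals = defaultdict(lambda: dict.fromkeys(stats, 0))
    for row in csv.DictReader(lines):
        for stat in stats:
            totals[row[key]][stat] += int(row[stat])
    return dict(totals)


totals = season_totals(io.StringIO(SAMPLE))
print(totals["playa001"])
```

To run this on a real file, pass an open file handle instead of the in-memory sample and adjust the column names to match the header of the file.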
As with all the tools, cwdaily works on both the Retrosheet event files and the boxscore files, so it is also a useful tool for extracting the information held in the boxscore files.
As of this writing this is a source-code-only release, as we don’t at the moment have a cross-compilation environment set up for building Windows binaries. (We hope to deal with this soon!)
We have just tagged the first 2017 release (version 2017.1) of Baseball Databank.
You can download the CSV files (in either ZIP or .tar.gz format) at https://github.com/chadwickbureau/baseballdatabank/releases/tag/v2017.1
We’d like to encourage downstream users of the data to let us know about their projects, so we can include links on the webpages and social media!
We have several bits of news on releases to share.
Chadwick 0.6.5 released. We have updated the Chadwick processing tools to support the December Retrosheet release. This is a minor release primarily to accommodate a few new modifier flags. Visit the Chadwick site for download links.
Retrosheet file repository updated. We have updated our git repository of all Retrosheet files with the new December release. You can use this repository to fetch all of the Retrosheet files in one swoop. It also allows you to navigate through the history of the last few years’ worth of Retrosheet releases, so you can see the changes made from release to release. (You’ll see just how busy the Retrosheet team are between each release!)
Retrosheet splits repository updated. This summer we launched a repository of data derived from Retrosheet files. This contains day-by-day records for players and teams, drawn from both the boxscore files and the play-by-play files. It also includes, from 1974 forward, various situational splits and batter-pitcher head-to-head records that can only be obtained from the play-by-play files (and can’t be derived from the day-by-days).
Thanks as always to the Retrosheet crew for making all of this possible.
Some of you may already make use of our git repository containing all of the files Retrosheet releases (along with any announced patches in between official releases), at http://github.com/chadwickbureau/retrosheet.
Today we have added a new repository which provides another view on this data: retrosplits (http://github.com/chadwickbureau/retrosplits).
This repository contains a number of reports aggregating performance at various levels. For many uses, there is no need to dig down to the play-level data itself if all one wants are various summary totals.
Currently the repository has two directories:
- daybyday, which has game-by-game statistics for players and teams. This incorporates all games Retrosheet has published, in either full play-by-play format or in the boxscore file format.
- splits, which has, as the name suggests, compilations of splits which can only be obtained from play-level data. These include:
  - Batter and pitcher platoon splits (vsL/vsR)
  - Batter and pitcher performance by baserunner situation
  - Batter performance by defensive position
  - Head-to-head performance by batter-pitcher matchup.
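For a sense of how a precomputed split file might be used without touching play-level data, here is a small Python sketch that looks up one batter's platoon averages. The layout is an assumption for illustration only: the column names (batter_id, split, AB, H) and the sample rows are invented and do not describe the actual files in the repository.

```python
import csv
import io

# Invented rows for illustration. The column names are hypothetical and
# are not taken from the actual retrosplits files.
SAMPLE = """batter_id,split,AB,H
playa001,vsL,120,36
playa001,vsR,380,95
playb002,vsL,90,18
"""


def batting_average_by_split(lines, batter_id):
    """Return {split label: batting average} for a single batter."""
    result = {}
    for row in csv.DictReader(lines):
        if row["batter_id"] != batter_id:
            continue
        at_bats = int(row["AB"])
        # Avoid dividing by zero for a split with no at-bats.
        result[row["split"]] = int(row["H"]) / at_bats if at_bats else None
    return result


averages = batting_average_by_split(io.StringIO(SAMPLE), "playa001")
print(averages)
```

The same pattern would apply to the other split types by changing the column that labels the split.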
We are making this available under the Open Database License in the hope it will be useful. Certainly, unless one is doing very detailed work with data at the play or pitch level, these files may save the time and faff of installing and running the processing tools (whether DiamondWare or Chadwick).
Of course, we cannot close without once again thanking Retrosheet for making the raw data available, without which these files would not be possible.
A new release of the Chadwick Persons Register is now available, just in time for MLB’s Opening Day.
Not too many major new goodies this time around; just the steady march of revised and expanded data.
A new release of the Chadwick Persons Register is now available.
In addition to the usual additions and demographic updates for historical players, this release confirms the Retrosheet IDs issued for players who debuted in 2014.
We expect there will likely be one more iteration of the public register before or right around Opening Day.
We have released version 0.6.4 of the Chadwick tools for manipulating play-by-play and game-level data.
This release adds support for the new umpire and manager review flags (/UREV and /MREV) which appear in the 2014 Retrosheet release, as well as improving support for courtesy runners and player re-entry.
As usual, full source code is available as well as pre-built binaries for Windows users.
With the World Series closing out the (North American summer) seasons, we have just posted a new version of the Chadwick Persons Register.
This includes all players who appeared in North American affiliated leagues, North American independent leagues, NPB, and the KBO in 2014, as well as identifier cross-references where known.
It also includes provisional baseball-databank IDs for players who made their MLB debuts in 2014. We will post an update over the off-season once Retrosheet identifiers have been confirmed for the debutants.
As always, enjoy!