Shortly after each NHL season a ZIP file of all stats from the previous season plus all other NHL seasons shows up on the Hockey Databank Yahoo group. The ZIP file contains CSV stats files with just about any hockey stat you can think of. These files are a great gift to those of us who are stats junkies but there isn't an easy way to query them. With that in mind I decided to create a database to store the stats. I wrote a quick Clojure app to load the CSV files into a PostgreSQL database. When I was about halfway through the loading process I realized it wouldn't any additional Clojure code (other than a connection information) to create and load MySQL and Sqlite databases, so I did just that.
The Database Schema
The database was created so that each table maps to a CSV file. The tables have the same columns as the CSV files except for the master table. I added an id to the tableas a primary key. The create_db.sql files are available for all three database flavors here: MySQL, PostgreSQL, and SQLite. These files will give you an empty database with all the necessary tables. If you want the pre-loaded database you can download one of the database dumps below.
The Database Dumps
You can download the database backup files here:
If you aren't familiar with how to restore the backups for either the MySQL or PostgreSQL databases I will give you a quick 'how to' restore from the command line. The example steps work on my mac and should also work from a linux command line. The SQLite file is the actual database file so after 'un-tarring' or unzipping the file you are ready to go.
Restoring the MySQL Dump
1. Create the database by running:
mysql -u <username> -p<password>
At the mysql prompt enter:
create database hockeydb;
Exit out of mysql and return to the prompt.
2. 'Un-tar' the file by running:
tar zxvf hockeydb-6-23-12-mysql.tgz
3. Restore the database by running the following command:
mysql -u <username> -p<password> hockeydb < hockeydb.sql
Thats all there is to it. You should now have a MySQL database loaded with the dat from the Hockey Data Bank files.
Restoring the PostgreSQL Dump
1. Create the database by running:
createdb hockeydb
2. 'Un-tar' the file by running:
tar zxvf hockeydb-6-23-12-postgres.tgz
3. Restore the database by running the following command:
psql hockeydb < hockeydb.backup
Thats all there is to it. You should now have a PostgreSQL database loaded with the dat from the Hockey Data Bank files.
Summary
Hopefully these databases will help others work with the Hockey Databank data. I would like to thank the Hockey Databank Group member dsreyn and everyone else who helped to put this data together, without their hard work these databases wouldn't be possible.
My future plans for this data is to create data access libraries for Clojure, C#, and Ruby. When that will happen I am not sure. When the libraries do come available I will notify everyone through this blog.
If you have any questions, concerns or suggestions please feel free to leave a comment!