Saturday, April 9, 2011

Programming Note

You may have noticed that I haven't posted since April 1. That's partly laziness, but also partly studying for AP exams, doing other assorted schoolwork, working on future posts, and building a shot database (plus reasons I'd prefer to keep private).

At any rate, the issue I ran into in parsing the NHL play-by-play (PxP) is that the HTML tables aren't in a format that's easy to work with. I don't know how to code in Excel, so I've been trying to do it in Java: downloading the PxP as .txt and then looking at what I get...not pretty. Luckily, I've recently made some headway and I should be done with that part soon (I haven't yet attempted to tackle shifts and time on ice, though...planning on that after exams). I'm planning to get raw Corsi numbers and WOWY (though without time on ice numbers).
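
For anyone wondering what "raw Corsi" boils down to once the PxP is parsed, here's a rough sketch in Java. It is not my actual code: the `CorsiSketch` class, the `Event` structure, and the event labels (`GOAL`, `SHOT`, `MISS`, `BLOCK`) are all made up for illustration, and it assumes each event has already been tagged with the team that took the shot attempt. Raw Corsi for a team is just shot attempts for minus shot attempts against.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CorsiSketch {
    // Event types counted as shot attempts. These labels are assumptions
    // for this sketch, not necessarily what the NHL files use.
    static final Set<String> SHOT_ATTEMPTS =
            new HashSet<>(Arrays.asList("GOAL", "SHOT", "MISS", "BLOCK"));

    /** One already-parsed play-by-play event (hypothetical structure). */
    static final class Event {
        final String type;          // e.g. "SHOT", "FAC", "HIT"
        final String shootingTeam;  // team credited with the event (assumed to be the shooter for attempts)

        Event(String type, String shootingTeam) {
            this.type = type;
            this.shootingTeam = shootingTeam;
        }
    }

    /** Raw Corsi for one team: shot attempts for minus shot attempts against. */
    static int rawCorsi(List<Event> events, String team) {
        int diff = 0;
        for (Event e : events) {
            if (!SHOT_ATTEMPTS.contains(e.type)) {
                continue; // faceoffs, hits, penalties, etc. don't count
            }
            diff += e.shootingTeam.equals(team) ? 1 : -1;
        }
        return diff;
    }
}
```

Calling `CorsiSketch.rawCorsi(events, "TOR")` on one game's events would give that team's raw Corsi for the game. The hard part, of course, is getting from the HTML to that clean list of events in the first place.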

I'm perfectly willing to share, so just send me an email if you want the code, or if you have any suggestions for what methods I should implement. Do note that the PxP files are about 1MB each (a full season is over 1GB).

