the gregorians celebrate the new year in january, the chinese in february, but the federal financial institutions examination council (ffiec) drops its ball in times square with a data release every september. próspero año everybody, because the latest hmda (pronounced hum-duh) microdata have arrived. clocking in at between ten and thirty-five million records per year, these files look like a job for monetdb. it's sexy, it's free, it's the perfect companion for big public data. make learning a new language your resolution.
this new github repository contains two scripts:
download all microdata.R
- initiate a monetdb server on your local machine to house every table and every year of hmda
- download and, without taking a breath, import every file into monetdb
- merge the loan application record table with the institutional records table, for future easy access
- construct some race and ethnicity variables to match those published by ffiec
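to make the merge step concrete, here's a minimal sketch of the idea, not the actual script. the real thing is written in r and talks to monetdb. this one uses python's built-in sqlite3 module as a stand-in so it runs anywhere. the table names, column names, and rows below are all made up for illustration.

```python
import sqlite3

# in-memory sqlite database, standing in for the monetdb server (assumption)
con = sqlite3.connect(":memory:")

# hypothetical, heavily trimmed versions of the two tables
con.execute(
    "CREATE TABLE lar (respondent_id TEXT, agency_code INTEGER, loan_amount INTEGER)"
)
con.execute(
    "CREATE TABLE institutions (respondent_id TEXT, agency_code INTEGER, respondent_name TEXT)"
)

# made-up rows, only here so the join has something to chew on
con.executemany(
    "INSERT INTO lar VALUES (?, ?, ?)",
    [("0001", 1, 120), ("0001", 1, 85), ("0002", 2, 300)],
)
con.executemany(
    "INSERT INTO institutions VALUES (?, ?, ?)",
    [("0001", 1, "made-up bank a"), ("0002", 2, "made-up bank b")],
)

# merge once and store the result, so later queries can skip the join
con.execute(
    """
    CREATE TABLE lar_merged AS
    SELECT l.respondent_id, l.agency_code, l.loan_amount, i.respondent_name
    FROM lar AS l
    LEFT JOIN institutions AS i
        ON l.respondent_id = i.respondent_id
       AND l.agency_code = i.agency_code
    """
)

merged_rows = con.execute(
    "SELECT respondent_name, loan_amount FROM lar_merged ORDER BY loan_amount"
).fetchall()
print(merged_rows)
```

the left join keeps every loan application record even if its institution record is missing, which is usually what you want when the application table is the one you plan to analyze.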
replicate ffiec publications.R
- open up and then connect to a monetdb server instance, like a champ
- present a few simple sql queries so you can take it from here
- reproduce a few sets of numbers published by the united states government
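for a taste of what "a few simple sql queries" might look like, here's a small sketch. same caveats as always: the actual script is r against monetdb, while this one uses python's built-in sqlite3 module as a stand-in. the table, the columns, and the rows are hypothetical.

```python
import sqlite3

# in-memory sqlite database, standing in for the monetdb server (assumption)
con = sqlite3.connect(":memory:")

# hypothetical two-column slice of a merged hmda table, with made-up rows
con.execute("CREATE TABLE hmda (action_type INTEGER, loan_amount INTEGER)")
con.executemany(
    "INSERT INTO hmda VALUES (?, ?)",
    [(1, 100), (1, 200), (3, 50)],
)

# how many records are in the table?
total_records = con.execute("SELECT COUNT(*) FROM hmda").fetchone()[0]

# record count and average loan amount within each category
by_action = con.execute(
    """
    SELECT action_type, COUNT(*), AVG(loan_amount)
    FROM hmda
    GROUP BY action_type
    ORDER BY action_type
    """
).fetchall()

print(total_records)
print(by_action)
```

counts and grouped averages like these are computed inside the database, so only the small summary table ever comes back to your session.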
click here to view these two scripts
for more detail about hmda, visit:
- the frequently asked questions page, which might help you decide which variables to use
- the left sidebar of the hmda homepage, for a few online query tools
- the main codebook. pure and simple.
- http://en.wikipedia.org/wiki/Home_Mortgage_Disclosure_Act
notes:
if your research requires anything prior to 2006, you might need to order the older data sets from the national archives. i believe they'll mail them on some 8-trax.
and thanx to max over at furman for both technical and moral support.
confidential to sas, spss, stata, sudaan users: get ready for your semicolonoscopy. time to transition to r. :D