this is event-level american criminal activity microdata, compiled and published by the fbi and then curated by the university of michigan. it's for you. download it. study it. hold it upside-down and sideways and run analyses on it until you pass out. if you spot anything newsworthy, tell the world. it is your data to do whatever you like with. that is remarkable, isn't it? i've consistently been astounded by the dedication of federal agencies in the united states to publishing their microdata for scrappy outside researchers like you and me. but there's one hitch: the public use files do not match what the fbi publishes. cjis_comm@leo.gov at the fbi told me..
The data may be different because the first link is from the FBI UCR Program’s NIBRS publication which is a snapshot in time. For example, the 2012 deadline for data to be included in the CIUS publication would have been in March 2013. The states/agencies had until the end of 2013 to submit additional data and make adjustments before the master closed early in 2014.
..and tomz@umich.edu at the national archive of criminal justice data said..
One possibility for the numbers not tying out exactly is whether the FBI counts all the agencies in the data. For UCR data tables the FBI sometimes only counts agencies that reported for the entire 12 months. I would look to see if your counts are larger than the FBI's, and I'd see if the number of agencies you are using is different from the FBI. Another possibility is that the FBI can update their data at any time, and we are not always made aware of that.
..so when you run a query, you will not reproduce fbi counts precisely. results are close, but not exact: the reproduction syntax is an imperfect replication. oh, and once you've run the download automation syntax, the monetdb analysis speeds will outrun even the fastest of imaginary crime-fighting superheroes. this new github repository contains two scripts:
download all microdata.R
- create the batch (.bat) file needed to initiate the monet database in the future
- log into the university of michigan's website with the free login info you'll have to obtain beforehand
- download every data file from this study to the local disk
- loop through each .dat file in the current working directory and import each one into monetdb with read.SAScii.monetdb (see the sketch just after this list)
- create a well-documented block of code to re-initiate the monetdb server in the future
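here's roughly what that importation loop looks like. this is only a sketch, not the published script: the connection details, the sas setup file name, and the arguments passed to read.SAScii.monetdb are all stand-ins, so check the repository for the real calls.

```r
library(MonetDB.R)

# connect to the monetdb server started by the batch file above
# (connection details here are placeholders, not the script's real values)
db <- dbConnect( MonetDB.R() , host = "localhost" , dbname = "nibrs" , user = "monetdb" , password = "monetdb" )

# hypothetical sas importation script distributed with the study files
sas_ri <- "nibrs_2012_setup.sas"

# loop through every .dat file in the current working directory
for ( fn in list.files( pattern = "\\.dat$" ) ){

	# derive a legal monetdb table name from the file name
	tablename <- gsub( "\\.dat$" , "" , tolower( fn ) )

	# import the fixed-width file into monetdb
	# (argument names are a guess at read.SAScii.monetdb's interface, not its documented signature)
	read.SAScii.monetdb( fn , sas_ri , tablename = tablename , connection = db )
}
```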
reproduce fbi tables.R
- initiate the same monetdb server instance, using the same well-documented block of code as above
- re-create three fbi-published data tables from the actual microdata: close, but not exact (see the sketch after this list)
- be amazed. that was dozens of queries, each on millions of records. and it worked on your laptop. wow.
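to give a flavor of those queries, here is a hedged sketch of one aggregation run with dbGetQuery. the table and column names (offense_segment, offense_code) are made-up placeholders rather than the study's actual variable names, and the connection details are placeholders too.

```r
library(MonetDB.R)

# re-connect to the monetdb server (placeholder connection details)
db <- dbConnect( MonetDB.R() , host = "localhost" , dbname = "nibrs" , user = "monetdb" , password = "monetdb" )

# tally offenses by offense code across every record in a single pass
# (`offense_segment` and `offense_code` stand in for the study's real names)
offense_counts <-
	dbGetQuery(
		db ,
		"SELECT offense_code , COUNT(*) AS offenses
			FROM offense_segment
			GROUP BY offense_code
			ORDER BY offenses DESC"
	)

head( offense_counts )
```

the real reproduction syntax strings together dozens of queries like this one, each hitting millions of records.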
click here to view these two scripts
for more detail about the national incident-based reporting system (nibrs) microdata, visit:
- both the fbi's faq page and the nacjd's resource guide.
- the related publications page to see what's already been done.
notes:
the preliminary 2013 crime statistics show a major expansion in the united states population covered by departments participating in nibrs (in table one, compared to 2012 and 2011), so before you trend anything, examine which police agencies in the locality you're interested in actually contributed their data to the program. in other words, don't confuse a new municipality reporting crime statistics to the fbi with a spike or dip in the crime rate. right? right.
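one hedged way to check: count the distinct reporting agencies by year before you compare anything across years. the table and column names below (batch_header, ori, data_year) are guesses at where the originating agency identifier lives in your imported tables, so adjust them to match your database.

```r
library(MonetDB.R)

# re-connect to the monetdb server (placeholder connection details)
db <- dbConnect( MonetDB.R() , host = "localhost" , dbname = "nibrs" , user = "monetdb" , password = "monetdb" )

# count the distinct reporting agencies (originating agency identifiers) per year
# (`batch_header` , `ori` , and `data_year` are hypothetical names)
dbGetQuery(
	db ,
	"SELECT data_year , COUNT( DISTINCT ori ) AS reporting_agencies
		FROM batch_header
		GROUP BY data_year
		ORDER BY data_year"
)
```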
this is not survey data, so use normal statistical tests (not survey-adjusted ones) like these commands in your monetdb sql code to compute measures of variation like a confidence interval. and remember: for more sql query construction help, try the w3schools tutorial or just search for specific commands in my archive.
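for example, a plain (non-survey-adjusted) confidence interval can be computed straight from sql aggregates and finished off in r. this sketch assumes monetdb's stddev_samp aggregate and uses made-up table and column names (victim_segment, age_of_victim), so treat it as an illustration of the arithmetic rather than a ready-to-run query.

```r
library(MonetDB.R)

# re-connect to the monetdb server (placeholder connection details)
db <- dbConnect( MonetDB.R() , host = "localhost" , dbname = "nibrs" , user = "monetdb" , password = "monetdb" )

# pull the mean, sample standard deviation, and record count in one query
# (`victim_segment` and `age_of_victim` are hypothetical names used only for illustration)
stats <-
	dbGetQuery(
		db ,
		"SELECT AVG( age_of_victim ) AS mean_age ,
			STDDEV_SAMP( age_of_victim ) AS sd_age ,
			COUNT(*) AS n
			FROM victim_segment"
	)

# standard error and a 95% confidence interval, using the normal approximation
se <- stats$sd_age / sqrt( stats$n )
stats$mean_age + c( -1.96 , 1.96 ) * se
```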
confidential to sas, spss, stata, and sudaan users: these languages will vanish, like d. b. cooper. time to transition to r. :D