these public use files are the first in my experience to admit to possessing, yet fail to release, a proper data dictionary. the steps to learn about their contents: (1) read the full faers homepage; it's not too long. (2) download and unzip one of the recent quarterly files by hand, for example 2012 quarter four. (3) read, yes read, the faqs.doc and readme.doc files included in that zipped microdata file.
once you're convinced these have what you need, let the download and import automation do the rest. this new github repository contains two scripts:
download and import.R
- figure out all zipped files containing quarterly microdata for both laers (legacy) and faers
- loop through each available quarter, download and unzip onto the local disk
- import each dollar-sign-delimited text file into an r data.frame object, cleaning up as you go
- save each object as a fresh yet familiar rda file in a convenient pattern within the working directory
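the import-and-save steps above can be sketched in a few lines of base r. this is a minimal illustration, not the code in the repository: the sample rows are invented, `read_dollar_file` is a hypothetical helper name, and the `drug12q4.rda` file name is only an assumed naming pattern.

```r
# made-up sample rows standing in for one unzipped quarterly text file
tf <- tempfile( fileext = ".txt" )
writeLines(
	c(
		"PRIMARYID$CASEID$DRUGNAME" ,
		"1001$501$ASPIRIN" ,
		"1002$502$IBUPROFEN #2" ,
		"1003$503$O'BRIEN'S TONIC"
	) ,
	tf
)

# hypothetical helper (not from the repository):
# read one dollar-sign-delimited file into a data.frame
read_dollar_file <-
	function( fn ){
		x <-
			read.table(
				fn ,
				sep = "$" ,
				header = TRUE ,
				quote = "" ,            # keep quote characters as literal text
				comment.char = "" ,     # keep # as literal text
				fill = TRUE ,           # pad short rows with NA
				stringsAsFactors = FALSE
			)
		# light cleanup: lowercase column names
		names( x ) <- tolower( names( x ) )
		x
	}

drug <- read_dollar_file( tf )

# save the object as an rda file (assumed naming pattern)
rda_fn <- file.path( tempdir() , "drug12q4.rda" )
save( drug , file = rda_fn )
```

turning off `quote` and `comment.char` matters for free-text fields like drug names, where apostrophes and pound signs would otherwise swallow or truncate rows.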
year stacks.R
- find each quarterly data file for both laers (legacy) and faers on the local disk and sort them by year
- stack all similar-system files into single-year files that nearly, but not exactly, match the fda-published annual statistics, even though the individual quarterly files do match their control counts. can't win 'em all.
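a rough sketch of that find, sort, and stack logic follows. everything here is an assumption made for illustration: the `demo2012q3.rda`-style file names, the toy columns, and the `load_one` helper are invented, and the sketch assumes every quarter within a year shares the same columns, which the real files may not.

```r
# build a throwaway folder of made-up quarterly rda files
d <- file.path( tempdir() , "faers_demo" )
dir.create( d , showWarnings = FALSE )

for ( q in c( "2012q3" , "2012q4" , "2013q1" ) ){
	x <- data.frame( primaryid = 1:2 , quarter = q , stringsAsFactors = FALSE )
	save( x , file = file.path( d , paste0( "demo" , q , ".rda" ) ) )
}

# find the quarterly files on disk, sorted by name
fns <-
	sort(
		list.files( d , pattern = "^demo[0-9]{4}q[1-4]\\.rda$" , full.names = TRUE )
	)

# pull the four-digit year out of each file name
yrs <- sub( "^demo([0-9]{4})q[1-4]\\.rda$" , "\\1" , basename( fns ) )

# hypothetical helper: load one rda file and return the object inside it
load_one <-
	function( fn ){
		e <- new.env()
		load( fn , envir = e )
		get( ls( e )[ 1 ] , envir = e )
	}

# stack each year's quarters into a single data.frame
year_stacks <-
	lapply(
		split( fns , yrs ) ,
		function( f ) do.call( rbind , lapply( f , load_one ) )
	)
```

`year_stacks` ends up as a list with one data.frame per year, named by year, so `year_stacks[[ "2012" ]]` holds both 2012 quarters. comparing `nrow()` on each element against the published counts is one way to spot where the annual totals drift.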
for more detail about the fda adverse event reporting system (faers), visit:
- the structured product labeling website, the basis of the data dictionary that they won't share, whatever that means
- the national drug code directory, just in case you need to merge things with other things
- a stackexchange link with the openfda tag for your questions. they respond fast.
notes:
in pursuit of what's hip and stylish, the fda has set up an api where users might query this database for up-to-the-minute case reports. but unless you're setting up a bot to tweet adverse events as they happen or researching something that cannot wait for the quarterly file to be released - like google flu trends - the api seems too sexy for anyone other than right said fred. you probably ought to load the entire data set onto your computer and explore it on your own first.
confidential to sas, spss, stata, and sudaan users: heavy doses of those programs may cause statococcal infection. time to transition to r. :D