to quickly understand exactly what's possible with this microdata, look at hrsa's nationwide statistics table. alright, you got me, this isn't a complex sample survey. but hey, merge it onto other survey data sets. what's that? other data sets don't have census tract- or minor civil division-level identifiers? perhaps not in their public use files, but visit the data centers and you might find the geocodes you seek.
this new github repository contains three scripts:
download current hpsa table.R
- download the zipped health professional shortage areas file directly onto your computer
- load the primary care physician, dental, and mental health shortage area tables into r
- save each file as an r data file (.rda) for fast loading later
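the save-now, load-fast-later step can be sketched roughly like this. it's a minimal illustration, not the script itself: the toy data frame, its column names, and the file name `pc.rda` are all made up, and the real script works on the tables read from hrsa's zipped file.

```r
# toy stand-in for one of the imported shortage area tables
# (hypothetical columns and values, for illustration only)
pc <- data.frame( hpsa_id = c( "a" , "b" ) , score = c( 12 , 18 ) )

# write the object to an r data file in a temporary folder
rda_path <- file.path( tempdir() , "pc.rda" )
save( pc , file = rda_path )

# later on: remove the object, then restore it by name from the .rda
rm( pc )
load( rda_path )
```

`load()` puts the object back under the same name it was saved with, so nothing needs to be re-parsed the next time around.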
identify point-in-time geographic hpsas.R
- limit the primary care physician hpsa data to only records actively designated on the date specified
- further limit the data to only geographic areas (and possibly certain population groups)
- if included: create flags for each of the major population groups
- extract and save county-, minor civil division-, and census tract-level records from this original file
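here's a rough sketch of what a point-in-time, geographic-only filter might look like. everything in it is an assumption for illustration: the toy records, the column names (`hpsa_type`, `designation_date`, `withdrawn_date`), and the rule that a record counts as active once it has been designated and has not yet been withdrawn. the actual script's fields and logic may differ.

```r
# toy records; every column name and value here is made up
hpsa <- data.frame(
	hpsa_id = c( "a" , "b" , "c" , "d" ) ,
	hpsa_type = c( "geographic" , "geographic" , "population group" , "geographic" ) ,
	designation_date = as.Date( c( "2005-03-01" , "2011-06-15" , "2008-01-10" , "2009-09-30" ) ) ,
	withdrawn_date = as.Date( c( NA , NA , NA , "2010-12-31" ) ) ,
	stringsAsFactors = FALSE
)

# the date of interest
as_of <- as.Date( "2010-01-01" )

# active = designated on or before the date, and either never withdrawn
# or withdrawn only after the date
active <-
	hpsa$designation_date <= as_of &
	( is.na( hpsa$withdrawn_date ) | hpsa$withdrawn_date > as_of )

point_in_time <- hpsa[ active , ]

# further limit to geographic designations
geographic_only <- point_in_time[ point_in_time$hpsa_type == "geographic" , ]
```

in this toy example, record `b` drops out because it was designated after the date of interest, and record `c` drops out at the second step because it is a population group designation.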
replicate hrsa nationwide statistics.R
- excel_round, floor, and subset as specified during conversations with the friendly folks at hrsa
- precisely match all five numbers in the top row of hrsa's current designation statistics table
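why would anyone need an excel-style rounding function? r's `round()` sends exact halves to the nearest even digit, while excel's ROUND sends them away from zero, so the two can disagree by one in the last digit. the sketch below is one possible way to mimic the excel behavior. it is an assumption about what a function called `excel_round` might do, not a copy of the one in the script.

```r
# round halves away from zero, the way excel's ROUND does
# (a hypothetical re-implementation, not the script's own function)
excel_round <-
	function( x , digits = 0 ){
		scale <- 10^digits
		sign( x ) * floor( abs( x ) * scale + 0.5 ) / scale
	}

round( 2.5 )         # 2 (r rounds the half to the even digit)
excel_round( 2.5 )   # 3
excel_round( -2.5 )  # -3
```

a single off-by-one like that is enough to keep a replication from matching a published table, which is presumably why the rounding rule had to be pinned down.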
click here to view these three scripts
for more detail about health professional shortage areas, visit:
notes:
depending on your goals and motivations, the cartography tools at hrsa's data n statistics pages might have all you really want. loading the microdata into r is for the tinkerers among us.
for a mapping between minor civil divisions and other geographies, check out the missouri census data center's geocorr12. depending on what geographies you have available in the data set you're merging hpsa data onto, you'll have to make some decisions about what to do when one minor civil division overlaps two or more, say, zip code tabulation areas. geocorr12 provides an `afact` column, which is just the share of the population of the geography you're mapping from that lives in the geography you're mapping to. woo hoo.
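one of those decisions could be "send each minor civil division to whichever target geography holds most of its population." here's a minimal sketch of that rule. the crosswalk rows, the `mcd` and `zcta` column names, and the 0-to-1 scale of the toy `afact` values are assumptions for illustration, and other rules (like weighting by `afact` instead of picking a winner) are just as defensible.

```r
# toy crosswalk: one row per (minor civil division, zcta) overlap
# (made-up identifiers and allocation factors)
crosswalk <- data.frame(
	mcd = c( "m1" , "m1" , "m2" ) ,
	zcta = c( "z1" , "z2" , "z2" ) ,
	afact = c( 0.7 , 0.3 , 1.0 ) ,
	stringsAsFactors = FALSE
)

# sort so the largest allocation factor comes first within each mcd
ordered <- crosswalk[ order( crosswalk$mcd , -crosswalk$afact ) , ]

# keep only the first (dominant) row for each mcd
dominant <- ordered[ !duplicated( ordered$mcd ) , ]
```

here `m1` lands in `z1`, since seventy percent of its population lives there, and `m2` lands in `z2`.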
confidential to sas, spss, stata, and sudaan users: why run a three-legged race when you can sprint like an olympian? time to transition to r. :D