before you start crosstabbing and svymeaning, it'd be smart to spend ten minutes reading exhibit 6.2 of the user's guide so you understand how all the data tables (the ones the download automation script imports for you) fit together. simpler analyses might only require the respondent and activity summary files, but once you want to determine who was with the respondent at soccer practice, you had better merge like a champ. before any of that, of course, you'll need to decide which activity codes you actually want to capture. time spent calf-roping or cattle-riding? code 130121. commuting to the vet? code 180807. pumping gas? 070102. tired of me guessing for you? check out the activity coding lexicons. this new github repository contains four scripts:
download all microdata.R
- decipher the bls ftp site to download each year-specific (or multi-year) table
- unzip whatcha need, then import the microdata in a jiffy with read.csv
- save each file as an r data file (.rda) into neatly-sorted atus directories (one pass through this loop is sketched below)
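if you'd like a taste of what that first script automates, here's a minimal sketch of a single pass. the zip url is my assumption based on the bls data files page, so confirm it before you run anything.

```r
# one pass through the download-unzip-import-save loop.
# note: the url below is an assumption, check the bls atus data files page
tf <- tempfile() ; td <- tempdir()

# download one year-specific table (the 2012 respondent file here)
download.file( "https://www.bls.gov/tus/special.requests/atusresp_2012.zip" , tf , mode = 'wb' )

# unzip whatcha need into a temporary directory
fns <- unzip( tf , exdir = td )

# import the comma-separated microdata in a jiffy with read.csv
atusresp <- read.csv( fns[ grep( "\\.dat$" , fns ) ] )

# save the data.frame as an r data file (.rda) for instant re-loading
save( atusresp , file = "atusresp_2012.rda" )
```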
2012 single-year - analysis examples.R
- load the activity, respondent, roster, and replicate weights files into working memory
- aggregate activity events to the respondent at the top-tier activity level, then reshape into one record per person
- convert minutes to hours, merge all files into one data.frame, recode a smidgen
- create a replicate-weighted survey design object, with the bls-specified fay's adjustment
- perform one fine slew of analysis examples, including quite a few of these bls statistics (the design-construction step is sketched below)
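the heart of that analysis script is the design-construction step. here's a hedged sketch: the tufinlwgt and finlwgt* column names are my assumptions about the 2012 files, and hrs.sleeping is a hypothetical top-tier variable, so match everything against your own merged data.frame before trusting a number.

```r
library(survey)

# x contains one record per respondent, with minutes already converted to hours

# construct the replicate-weighted survey design object,
# with the bls-specified fay's adjustment of 0.5
atus.design <-
    svrepdesign(
        weights = ~ tufinlwgt ,           # final person-day weight (assumed name)
        repweights = "finlwgt[0-9]+" ,    # regular expression matching the replicate weight columns (assumed)
        type = "Fay" ,
        rho = ( 1 - 1 / sqrt( 4 ) ) ,     # fay's adjustment of 0.5
        data = x
    )

# hypothetical example: average hours per day spent in one top-tier category
svymean( ~ hrs.sleeping , atus.design )
```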
replicate bls standard error - 2007.R
- load the activity, activity-summary, respondent, and replicate weights files into working memory
- subset the activity summary table to only the television-related events
- aggregate the activity table to the respondent-level as an example of an alternative to the previous method
- merge the minutes-spent-watching-television table with the respondent and replicate weights tables
- create a replicate-weighted survey design object, with the bls-specified fay's adjustment
- precisely replicate the bureau of labor statistics' standard error of hours per day spent watching the teevee (the subset-and-aggregate steps are sketched below)
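the subset-and-aggregate route looks roughly like this. 120303 and 120304 are the two television codes from the lexicon; tucaseid, trcodep, and tuactdur24 are my assumptions about the activity file's column names, so verify them against the data dictionary.

```r
# subset the activity file to only television-related events.
# column names here (tucaseid, trcodep, tuactdur24) are assumptions
tv <- act[ act$trcodep %in% c( 120303 , 120304 ) , ]

# aggregate to the respondent level: total television minutes per person
tv.per.person <- aggregate( tuactdur24 ~ tucaseid , data = tv , sum )

# merge onto the respondent file, treating non-watchers as zero minutes
y <- merge( resp , tv.per.person , all.x = TRUE )
y[ is.na( y$tuactdur24 ) , 'tuactdur24' ] <- 0

# convert minutes to hours before computing the standard error
y$tv.hours <- y$tuactdur24 / 60
```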
replicate bls example one - 2006.R
- load the activity and respondent data tables into working memory
- subset the activity table to only care of household children events (as prescribed by the 2006 lexicon)
- aggregate that activity table to the respondent level, then merge those minutes onto the respondent data
- just run a weighted.mean that skips any variance calculation but hits the bls example one on the nose (sketched below)
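that final calculation really is a one-liner. in this sketch, z stands for the merged data.frame, kidcare.minutes is a hypothetical name for the aggregated childcare column, and tufinlwgt is assumed to be the final weight.

```r
# point estimate only, no variance: weighted mean hours per day
# spent caring for household children (column names are hypothetical)
weighted.mean( z$kidcare.minutes / 60 , w = z$tufinlwgt )
```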
click here to view these four scripts
for more detail about the american time use survey, visit:
- the questionnaire, transmogrified for public dissemination
- summary charts and tables provided by the bureau of labor statistics
notes:
just like the medical expenditure panel survey draws its sample from the national health interview survey, the american time use survey is a subsample of current population survey (cps) respondents. in fact, the microdata include a handy atus-cps mergefile. unlike the cps, it's not a household survey: only one individual at least 15 years of age gets selected from each sampled household. another important difference from the cps: the atus should not be used to draw state-level conclusions. atus generalizes to the united states non-institutionalized, non-active-duty military population aged fifteen or older, but don't zoom in on geographies smaller than census regions.
when you see the svytotal function used in the analysis example script, you'll notice overall sums around ninety billion. that's because the survey weights in this data set generalize to person-days rather than persons. divide those ninety billion person-days by the 365 days in a year and you'll almost precisely hit the `sixteen and older` row of the `2010 column` of table 1 on this census bureau age by sex table. so at ease, everybody. at ease.
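here's that back-of-the-envelope check, assuming the atus.design object from the analysis example script:

```r
# tack a column of ones onto the replicate-weighted design
atus.design <- update( atus.design , one = 1 )

# total weighted person-days: right around ninety billion
svytotal( ~ one , atus.design )

# divide by the days in a year to land near the population count
# shown on that census bureau table
coef( svytotal( ~ one , atus.design ) ) / 365
```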
confidential to sas, spss, stata, and sudaan users: if you want to impress people at parties with an antiquated skill, learn morse code. at least it's rhythmic. time to transition to r. :D