this data set began life as the nationwide personal transportation survey (npts), so if you see that title somewhere, just think of it as nhts classic. the latest main data files provide cross-sectional, nationally representative data on persons and households including their vehicles, all trips made in one assigned travel day, and their neighborhoods. (think of a trip as one-way travel between an origin - like home - and a destination - like work.) in addition to the national sample, many state departments of transportation and regional transportation planning agencies fund add-on samples so that descriptive statistics can be calculated at finer geographies. and since the person-level data contain detailed demographics, it's feasible to analyze travel behavior of the young, the elderly, people of color, and low-income folks, etc. etc. good luck trying to do that with smaller-scale transit surveys. that said, be cautious when generating estimates at the sub-national level; check out the weighting reports to get a sense of which geographies have sufficient sample size.
before you start editing our code and writing your own, take some time to familiarize yourself with the user guide and other relevant documents (such as their glossary of terms or how they create constructed variables) on their comprehensive publications table. each survey year's release comprises four files: person-level (age, sex, internet shopping behavior), household-level (size, number of licensed drivers), vehicle-level (make, model, fuel type), and travel day-level (trip distance, time starting/ending, means of transportation). the download automation script merges each file with its appropriate replicate-weight file, so if you wish to attach household-level variables onto the person-level file, ctrl+f search through that script for examples of how to create additional _m_ (merged) files. this new github repository contains three scripts:
download and import.R
- initiate the monet database with new monetdblite
- download, unzip, and import each year specified by the user
- merge on the weights wherever the weights need to be merged on
- create and save the replicate-weighted complex sample designs
- create a well-documented block of code to re-initiate the monetdb server in the future
analysis examples.R
- re-initiate the monetdb server
- load the r data file (.rda) containing the replicate-weighted design for the person-level 2009 file
- perform the standard repertoire of analysis examples
replicate ornl.R
- re-initiate the monetdb server
- load the r data file (.rda) containing the replicate-weighted design for the 2009 person-level file
- replicate statistics from "table 1" of oak ridge national laboratory's example output document
click here to view these three scripts
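attaching household-level variables onto the person-level file boils down to a many-to-one join on the household identifier. here's a minimal base r sketch on toy data frames. it is not copied from the download script, and every column name in it ( `houseid` , `personid` , `hhsize` , `drvrcnt` , `r_age` ) is an assumption made for illustration, so check the codebook before borrowing any of them.

```r
# toy stand-ins for the household-level and person-level tables.
# all column names and values here are invented for illustration
hh <- data.frame(
	houseid = c( 1 , 2 , 3 ) ,
	hhsize = c( 2 , 1 , 3 ) ,
	drvrcnt = c( 2 , 1 , 1 )
)

per <- data.frame(
	houseid = c( 1 , 1 , 2 , 3 , 3 , 3 ) ,
	personid = c( 1 , 2 , 1 , 1 , 2 , 3 ) ,
	r_age = c( 34 , 36 , 71 , 40 , 12 , 9 )
)

# the household identifier must be unique on the household table,
# otherwise the join would duplicate person records
stopifnot( !any( duplicated( hh$houseid ) ) )

# left join: keep every person record, attach that person's household columns
per_m <- merge( per , hh , by = "houseid" , all.x = TRUE )

# the merged table should have exactly one row per original person record
stopifnot( nrow( per_m ) == nrow( per ) )

per_m
```

the two `stopifnot` checks are the important part: a merge that silently drops or duplicates person records will throw off every weighted estimate computed afterward.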
notes:
data from the 1969 and 1977 nationwide personal transportation survey (the nhts predecessor) are not available online. replicate weights were added beginning with the 2001 release. the 1983, 1990 and 1995 survey years contain only the overall weight and no way to accurately estimate the variance, so if you'd like to create a survey object that will give you incorrect standard errors, you might copy the taylor-series linearization object creation at the very bottom of the us decennial census public use microdata sample's helper functions, but don't for a second trust the confidence intervals that produces. if you'd like either of those things to change, it can't hurt to ask.
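to see why the replicate weights matter, here's a rough base r sketch of how a replicate-weight standard error gets computed: estimate the statistic once with the full-sample weight, once with each replicate weight, then sum the squared deviations. everything below is toy data, and both the replicate count ( 100 ) and the jackknife scaling factor ( 99 / 100 ) are assumptions for illustration. confirm the correct values for your survey year in the weighting reports before relying on them.

```r
set.seed( 1 )

n <- 200          # number of toy records
n_reps <- 100     # assumed number of replicate weights

# invented outcome variable and full-sample weight
y <- rnorm( n , mean = 30 , sd = 10 )
w <- runif( n , 500 , 1500 )

# toy delete-a-group jackknife replicates: replicate r zeroes out the
# records in group r and scales every other record's weight up
group <- rep( seq_len( n_reps ) , length.out = n )

rep_w <- sapply(
	seq_len( n_reps ) ,
	function( r ) ifelse( group == r , 0 , w * n_reps / ( n_reps - 1 ) )
)

# the statistic of interest: a weighted mean of y
weighted_mean <- function( wt ) sum( wt * y ) / sum( wt )

# one estimate from the full-sample weight, one from each replicate column
theta_full <- weighted_mean( w )
theta_reps <- apply( rep_w , 2 , weighted_mean )

# assumed jackknife scaling factor of ( n_reps - 1 ) / n_reps
jk_scale <- ( n_reps - 1 ) / n_reps

# standard error: scaled sum of squared deviations from the full-sample estimate
se <- sqrt( jk_scale * sum( ( theta_reps - theta_full )^2 ) )

c( estimate = theta_full , se = se )
```

the survey years that ship only an overall weight give you `theta_full` and nothing else, which is why there's no honest way to get the standard error for them.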
confidential to sas, spss, stata, sudaan users: honk if you love r :D