
Generates point estimates and standard errors of percentiles/means for variables in the Survey of Consumer Finances (SCF).


crafkin/scfses

Overview

The Stata program scfses obtains accurate point estimates and standard errors of an arbitrary percentile (or the mean) of a variable in the Survey of Consumer Finances (SCF). For example, scfses can help you easily obtain the median and standard error on the median. It incorporates weights and accounts for both imputation variability and sampling variability.

Updates

Update March 8, 2018: Fixed an error in computing standard errors on means.

Update March 13, 2018: Option changed from nodofcorr to nodfcorr for naming consistency. Help file updated with more explanation.

Update November 6, 2018: Added more guidance regarding the degrees of freedom correction.

Installation and Set Up

To install the program, run

net install scfses, from(https://raw.github.com/crafkin/scfses/master/) replace

in Stata.

Alternatively, download scfses.ado and scfses.sthlp and place them in your PLUS folder. (To find your PLUS folder, run sysdir list in Stata.)

To download the SCF dataset, go to the SCF website. Download both the "Main Survey Data" and the "Replicate Weight File."

For example, you can download the 2016 SCF data, the 2016 replicate weights file, and the 2016 summary extract (a version of useful SCF variables cleaned from the microdata). Once you merge the replicate weights file with the main dataset and generate a variable indexing the implicates, you can use scfses to analyze variables' distributions.
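The merge-and-index step described above can be illustrated in a language-agnostic way. The following is a minimal Python sketch on toy records, not the Stata workflow itself. It assumes the SCF naming in which `yy1` is the household ID and `y1` equals `10 * yy1` plus the implicate number, and that the replicate weight file holds one row per household with columns such as `wt1b1` and `mm1`. Check these assumptions against the codebook for your survey year; the function name and toy values are hypothetical.

```python
# Toy stand-in for the main survey data: one row per household per implicate.
# Assumed naming: yy1 = household ID, y1 = 10 * yy1 + implicate number.
main_rows = [
    {"yy1": 1, "y1": 11, "networth": 50.0},
    {"yy1": 1, "y1": 12, "networth": 55.0},
    {"yy1": 2, "y1": 21, "networth": 900.0},
    {"yy1": 2, "y1": 22, "networth": 880.0},
]

# Toy stand-in for the replicate weight file (assumed: one row per household).
replicate_rows = [
    {"yy1": 1, "wt1b1": 3200.0, "mm1": 1},
    {"yy1": 2, "wt1b1": 0.0, "mm1": 0},
]


def merge_and_index(main, replicates):
    """Attach each household's replicate columns to all of its implicate rows
    and add an implicate index named imp (a hypothetical variable name)."""
    by_household = {row["yy1"]: row for row in replicates}
    merged = []
    for row in main:
        out = dict(row)
        out.update(by_household[row["yy1"]])       # many-to-one merge on yy1
        out["imp"] = row["y1"] - 10 * row["yy1"]   # implicate number
        merged.append(out)
    return merged


merged = merge_and_index(main_rows, replicate_rows)
for row in merged:
    print(row["y1"], row["imp"], row["wt1b1"])
```

In Stata the same idea would be a many-to-one merge on the household ID followed by generating the implicate index from the two ID variables.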

Usage

Usage notes are documented in detail in the Stata help file.

  • It is not straightforward to generate standard errors on the mean or a specified percentile of the unconditional distribution of an SCF variable. If you are not careful, your standard errors and confidence intervals may be too small. scfses follows SCF guidance on combining imputation variability and sampling variability to obtain standard errors.

  • scfses estimates imputation variability by computing the sample variance of the within-implicate point estimates. It estimates sampling variability by using the replicate draw variables to re-estimate the statistic on the first implicate, once per replicate; the sample variance of those replicate estimates represents sampling variability. The program then combines imputation and sampling variability following the SCF guidance.

  • Confidence intervals incorporate a degrees-of-freedom correction (Barnard and Rubin 1999) to account for imputation variance.

  • scfses stores point estimates and standard errors for post-estimation analysis.

  • scfses requires a vector of replicate sampling variables and replicate weight variables — one for each replicate used to compute sampling variance.

  • SCF recommends the command scfcombo (written by Jane Brittingham) for generating means and their standard errors. scfcombo may be useful for other applications (and some ideas in scfses were inspired by scfcombo). But scfses has the following advantages for summarizing the data:

  1. scfses makes it easy to generate point estimates and standard errors on an arbitrary percentile (which, to my knowledge, scfcombo cannot do without some modification).
  2. scfses incorporates a degrees-of-freedom correction for confidence intervals.
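To make the combination of imputation and sampling variability concrete, here is a minimal Python sketch on made-up numbers. It assumes the commonly cited SCF rule that total variance equals sampling variance plus (1 + 1/M) times imputation variance, where M is the number of implicates (5 in the SCF, giving the factor 6/5). It illustrates the idea only and is not a transcription of the scfses source; the function name and inputs are hypothetical.

```python
import math
import statistics


def combined_standard_error(implicate_estimates, replicate_estimates):
    """Combine imputation and sampling variability into one standard error.

    implicate_estimates: the statistic (e.g. a weighted median) computed
        separately within each implicate.
    replicate_estimates: the same statistic recomputed on the first
        implicate once per replicate draw.
    """
    m = len(implicate_estimates)
    # Sample variance (divides by n - 1) of the within-implicate estimates.
    imputation_var = statistics.variance(implicate_estimates)
    # Sample variance of the replicate estimates.
    sampling_var = statistics.variance(replicate_estimates)
    # Assumed combination rule: sampling + (1 + 1/M) * imputation.
    total_var = sampling_var + (1 + 1 / m) * imputation_var
    return math.sqrt(total_var)


# Made-up estimates of a median, for illustration only.
implicate_medians = [97.0, 98.0, 99.0, 100.0, 101.0]
replicate_medians = [96.0, 98.0, 100.0, 102.0, 104.0]

se = combined_standard_error(implicate_medians, replicate_medians)
sampling_only = math.sqrt(statistics.variance(replicate_medians))
print(f"combined SE: {se:.4f}, sampling-only SE: {sampling_only:.4f}")
```

The sampling-only figure is smaller than the combined one, which is the sense in which ignoring imputation variability yields standard errors that are too small.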

Additional Notes Re: Degrees-of-Freedom Correction

  • By default, scfses applies a degrees-of-freedom correction when constructing the confidence interval around your point estimate. Stata conventionally reports 95% confidence intervals (e.g., for regression coefficients) based on the t distribution. Such an interval is exact only if the variable is normally distributed, but it is conservative otherwise. scfses likewise builds its default 95% confidence interval from the t distribution, and the user can opt to use the normal distribution instead.

  • Many variables in the SCF may not be normally distributed, and hence the user may wish to turn off the degrees-of-freedom correction using the option nodfcorr.

  • In general, the degrees-of-freedom correction is likely to make very little difference, given how quickly the t distribution with sufficient degrees of freedom approaches the normal distribution.
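For readers who want to see what a degrees-of-freedom correction of this kind looks like numerically, the sketch below implements the adjusted degrees of freedom from Barnard and Rubin (1999) as published. Whether scfses uses exactly these inputs is not stated here, so treat the function name, the choice of complete-data degrees of freedom (998 below), and the toy variances as assumptions for illustration.

```python
def barnard_rubin_df(m, between_var, total_var, complete_df):
    """Adjusted degrees of freedom following Barnard and Rubin (1999).

    m: number of imputations (implicates).
    between_var: variance of the point estimate across imputations.
    total_var: total variance of the point estimate (all sources combined).
    complete_df: degrees of freedom if there were no missing data
        (hypothetical input; pick it to suit your application).
    """
    # Share of total variance attributable to imputation.
    gamma = (1 + 1 / m) * between_var / total_var
    # Degrees of freedom that would apply with complete data, shrunk by gamma.
    df_obs = (complete_df + 1) / (complete_df + 3) * complete_df * (1 - gamma)
    if gamma == 0:
        # No imputation variance: the large-sample term drops out.
        return df_obs
    # Classic large-sample (Rubin 1987) degrees of freedom.
    df_old = (m - 1) / gamma ** 2
    # Harmonic-style combination of the two.
    return 1 / (1 / df_old + 1 / df_obs)


# Toy inputs: 5 implicates, imputation variance 2.5, total variance 13.
df = barnard_rubin_df(m=5, between_var=2.5, total_var=13.0, complete_df=998)
df_less_imputation = barnard_rubin_df(m=5, between_var=0.5, total_var=13.0,
                                      complete_df=998)
print(f"adjusted df: {df:.1f}; with less imputation variance: "
      f"{df_less_imputation:.1f}")
```

The adjusted degrees of freedom grow as the imputation share of the variance shrinks, and a t distribution with many degrees of freedom is close to the normal, which is why the correction often changes the interval only slightly.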

Author

Charlie Rafkin
National Bureau of Economic Research
crafkin@nber.org

Program developed to obtain estimates in:
Beshears, John, James Choi, David Laibson, and Brigitte C. Madrian. "Household Finance." In Handbook of Behavioral Economics, edited by B. Douglas Bernheim, Stefano DellaVigna, and David Laibson. Elsevier: 2018.
