ed-dehaan/sumhdfe

sumhdfe

Sumhdfe is a Stata package that produces summary and diagnostic information for linear fixed-effects models.

You can use sumhdfe to:

  • Check the frequency of fixed effects
  • Check the number of groups that have no variation within fixed effects
  • Check the residual within-fixed-effect variation of the regression variables
  • Generate publication-ready Word and LaTeX tables for all fixed-effects diagnostics

Sumhdfe is currently in beta, and we welcome comments and suggestions in the issues tab!

For a discussion on the issues that sumhdfe addresses, see deHaan (2021).
Similarly, if you find these diagnostics to be useful, please cite:

deHaan, Ed. (2021). Using and Interpreting Fixed Effects Models.
Available at SSRN: https://ssrn.com/abstract=3699777.


Authors


Installing sumhdfe

Sumhdfe is an extension to reghdfe and requires version 6+ of reghdfe and ftools to work. In order to generate .rtf files you also need to have rtfutil installed.

To install sumhdfe and its dependencies follow the steps below:

* Uninstall any old versions of ftools, reghdfe, sumhdfe
cap ado uninstall ftools     
cap ado uninstall reghdfe     
cap ado uninstall sumhdfe

* Install the most recent version of ftools, reghdfe, and sumhdfe
net install ftools, from("https://raw.githubusercontent.com/sergiocorreia/ftools/master/src/")
net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/")
net install sumhdfe, from("https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/src/")

* To generate rtf files you also need to install rtfutil
ssc install rtfutil

Note: sumhdfe does not work with reghdfe version 5, which is the version installed by ssc install reghdfe.
Make sure to use the commands above to install reghdfe version 6.
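
To confirm which versions ended up on the ado-path after installation, one option is Stata's built-in which command. This is a minimal sketch that assumes each package ships an ado-file named after the package with a version header line:

```stata
* Print the location and version header of each installed ado-file
which ftools
which reghdfe
which sumhdfe
```

If the line reported for reghdfe shows a version below 6, rerun the uninstall and net install steps above.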


Usage & Features

Example usage

Sumhdfe can be used in one of two ways:

  1. As a postestimation command following reghdfe
  2. As a standalone command

Post-estimation version

First run reghdfe and then run sumhdfe. A simple example is shown below; see the Stata help file for additional examples.

use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2, a(firm year)
sumhdfe

Standalone version

Run sumhdfe directly.

use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
sumhdfe y x1 x2, a(firm year)

Features

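The sumhdfe command will provide four panels by default:

  • Panel A - Summary statistics
  • Panel B - Summary statistics for fixed effects
  • Panel C - Groups without any within fixed effect variation
  • Panel D - Variation lost (absorbed) due to fixed effects

Additionally, sumhdfe can provide:

  • Histogram
  • Publication-ready tables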


Panel A - Summary statistics

Summary statistics for the sample used in reghdfe.

Example:

Notes:

  • It can be customized similar to estat summarize
  • N includes singletons, so it differs from N shown in the reghdfe output
  • When using the panels(str) option, this panel can be selected using the sum keyword: panels(sum)
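
As a short illustration of the panels(str) option, the sketch below requests only Panel A after a regression. It assumes the option is accepted in the post-estimation form exactly as written in the note above; check help sumhdfe for the authoritative syntax.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2, a(firm year)

* Show only the summary-statistics panel (Panel A)
sumhdfe, panels(sum)
```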

Panel B - Summary statistics for fixed effects

Summary statistics for the fixed effects themselves.

Example:

Notes:

  • Interpretation of the above example:
    • There are 189 unique firms within the firm fixed effects, 28 of which are singletons (i.e., appear just once). An individual firm has between 1 and 8 observations.
    • There are 39 unique years within the year fixed effects, 8 of which are singletons.
    • Iterating across both firm and year eliminates 2 more "joint singletons," for a total of 38 singletons eliminated from the reghdfe output.
  • When using the panels(str) option, this panel can be selected using the fe keyword: panels(fe)
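
The group and singleton counts can be cross-checked by hand. The following is a rough sketch, not sumhdfe's own implementation: it counts groups and one-observation groups separately for firm and year, and it does not perform the iterative removal that finds "joint singletons". The variable names n_firm, n_year, tag_firm, and tag_year are made up for illustration.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear

* Keep only rows that could enter the regression (assumption: no other sample restrictions)
keep if !missing(y, x1, x2)

* Number of observations in each firm and in each year
bysort firm: gen long n_firm = _N
bysort year: gen long n_year = _N

* Tag one observation per group so that each group is counted once
egen byte tag_firm = tag(firm)
egen byte tag_year = tag(year)

count if tag_firm                   // unique firms
count if tag_firm & n_firm == 1     // firms that appear only once
count if tag_year                   // unique years
count if tag_year & n_year == 1     // years that appear only once
summarize n_firm if tag_firm        // min and max observations per firm
```

The counts should be comparable to Panel B, though small differences are possible if sumhdfe defines its sample differently.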

Panel C - Groups without any within fixed effect variation

Panel C quantifies how often each variable is constant within a given fixed effect group (such as within a given firm). These observations can have unexpected effects on regression coefficients and, if numerous, should be carefully evaluated.

Example:

Notes:

  • Interpretation of the above example:
    • Variable x1 has (623-38=) 585 observations excluding singletons.
    • Within the non-singleton data, 58 firms have no variation in x1; i.e., each firm has the same x1 in all years. Those 58 firms relate to 217 observations.
    • Variable x1 is constant within 4 years, relating to 28 observations.
  • When using the panels(str) option, this panel can be selected using the zero keyword: panels(zero)
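
A manual version of this check for one variable and one fixed effect might look like the sketch below. It is an illustration under stated assumptions rather than sumhdfe's internal logic: singletons are dropped by keeping the reghdfe estimation sample, and a firm is flagged when its smallest and largest x1 coincide. The names const_x1 and tag_firm are hypothetical.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear

* Restrict to the estimation sample, which excludes singletons
quietly reghdfe y x1 x2, a(firm year)
keep if e(sample)

* Sort x1 within firm; if the first and last values match, x1 is constant in that firm
bysort firm (x1): gen byte const_x1 = (x1[1] == x1[_N])

* Tag one observation per firm so that firms are counted once
egen byte tag_firm = tag(firm)

count if tag_firm & const_x1    // firms with no within-firm variation in x1
count if const_x1               // observations belonging to those firms
```

Replacing firm with year gives the corresponding check for the year fixed effects.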

Panel D - Variation lost (absorbed) due to fixed effects

Panel D shows how much variation in each variable is lost (or absorbed) due to the fixed effects, in terms of both standard deviations and r-squared.

Example:

Notes:

  • Interpretation of the above example:
    • The standard deviation of x1 is 79.7 in the pooled sample (as also shown in Panel A), but the within-fixed-effect standard deviation of x1 is 22.7. Thus, the within-fixed-effect variation of x1 is roughly 28.4% of the pooled sample.
    • In terms of r-squared, the firm fixed effects explain roughly 87% of the variation in x1 while the year fixed effects explain roughly 13%. Combined, the fixed effects explain 92.4% of the variation in x1.
      • Technical note: the r-squared is relative to the sample including singletons, for which the r-squared is mechanically equal to 100%.
  • When using the panels(str) option, this panel can be selected using the rss keyword: panels(rss)
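
The within-fixed-effect standard deviation can be approximated by hand by residualizing the variable on the fixed effects. The sketch below follows the same residualization approach used elsewhere in this README; it is not guaranteed to match Panel D exactly, because sumhdfe may treat singletons and degrees of freedom differently. The local macro names are illustrative.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
quietly reghdfe y x1 x2, a(firm year)

* Residualize x1 on the fixed effects within the estimation sample
capture drop _reghdfe_resid
quietly reghdfe x1 if e(sample), a(firm year) resid

* Pooled versus within-fixed-effect standard deviation
quietly summarize x1 if e(sample)
local sd_pooled = r(sd)
quietly summarize _reghdfe_resid
local sd_within = r(sd)
display "within/pooled SD ratio for x1: " %5.3f (`sd_within' / `sd_pooled')

* Share of the variation in x1 explained by the firm fixed effects alone
quietly reghdfe x1, a(firm)
display "R-squared of x1 on firm FE: " %5.3f e(r2)
```

Note that reghdfe drops singletons before computing e(r2), whereas the technical note above says the panel's r-squared is computed relative to the sample including singletons, so the two figures need not agree.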

Histogram

The histogram(#) option tabulates the frequencies of observations within a fixed effect grouping.

Example:

For example, sumhdfe, histogram(1) shows the frequencies of observations for the first fixed effect grouping listed within a(firm year), i.e., firm. You can also specify the fixed effect name; for example sumhdfe, histogram(year).
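
Put together with the demo data, the two forms mentioned above can be run as follows. This only combines commands already shown in this README; the output is not reproduced here.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2, a(firm year)

* Frequencies for the first fixed effect listed in a(), i.e., firm
sumhdfe, histogram(1)

* Same idea, selecting the fixed effect by name
sumhdfe, histogram(year)
```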


Publication-ready tables

All panels can be exported to a publication-ready RTF or LaTeX table. The RTF table can be used in Word or Excel (by copying the contents to an Excel sheet).

To export the tables:

  • First run sumhdfe
  • Run the sumhdfe_export command
    • You can optionally specify the panels you want using "panels(a b c d)"
    • For the export help file run help sumhdfe_export
    • The extension of the filename you pass to sumhdfe_export determines the output format: use .rtf or .tex

Example 1: RTF

reghdfe y x1 x2, a(firm year)
sumhdfe
sumhdfe_export using table.rtf, panels(a b c d)

You can open the .rtf file using Word and you can copy the table to Excel as well.

Example 2: LaTeX

reghdfe y x1 x2, a(firm year)
sumhdfe
sumhdfe_export using table.tex, panels(a b c d) standalone

You can render the .tex file using your preferred LaTeX editor (e.g., Overleaf).


Additional options

For additional examples and options, see the Stata help files with help sumhdfe and help sumhdfe_export.


Pending Items

  1. Allow for easy export of each table to Word/Excel/LaTeX
  2. Full walkthrough with real-world example
  3. Add an option to visually compare the pooled- and within-fixed-effect variation in a variable. In the meantime, it can be manually done as follows:
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
qui: reghdfe y x1 x2, a(firm year)
qui: reghdfe x1 if e(sample), a(firm year) resid
twoway (histogram x1, fcolor(green%75) lcolor(none)) (histogram _reghdfe_resid, ///
fcolor(navy%70) lcolor(none)), legend(on order(1 "x1" 2 "within-FE x1"))

Questions and bug reports

If you have questions or experience problems, please use the issues tab of this repository.

Known bugs:

  • The RTF file might have blank pages at the beginning or end if only a selection of panels is exported.
