DOC: Add explanations for Durbin-Watson and Kurtosis. Include DW-test and Breusch-Godfrey test for autocorrelation. Reorder some parts for better readability
luke396 committed Apr 23, 2024
1 parent c22837f commit adc9553
Showing 1 changed file with 75 additions and 31 deletions.
106 changes: 75 additions & 31 deletions examples/notebooks/regression_diagnostics.ipynb
@@ -4,23 +4,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Regression diagnostics"
"# Regression diagnostics\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example file shows how to use a few of the ``statsmodels`` regression diagnostic tests in a real-life context. You can learn about more tests and find more information about them on the [Regression Diagnostics page](https://www.statsmodels.org/stable/diagnostic.html).\n",
"This example file shows how to use a few of the `statsmodels` regression diagnostic tests in a real-life context. You can learn about more tests and find more information about them on the [Regression Diagnostics page](https://www.statsmodels.org/stable/diagnostic.html).\n",
"\n",
"Note that most of the tests described here only return a tuple of numbers, without any annotation. A full description of outputs is always included in the docstring and in the online ``statsmodels`` documentation. For presentation purposes, we use the ``lzip(name, test)`` construct to pretty-print short descriptions in the examples below."
"Note that most of the tests described here only return a tuple of numbers, without any annotation. A full description of outputs is always included in the docstring and in the online `statsmodels` documentation. For presentation purposes, we use the `lzip(name, test)` construct to pretty-print short descriptions in the examples below.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimate a regression model"
"## Estimate a regression model\n"
]
},
{
@@ -61,14 +61,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Normality of the residuals"
"## Normality of the residuals\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Jarque-Bera test:"
"Omnibus test:\n"
]
},
{
@@ -77,16 +77,18 @@
"metadata": {},
"outputs": [],
"source": [
"name = [\"Jarque-Bera\", \"Chi^2 two-tail prob.\", \"Skew\", \"Kurtosis\"]\n",
"test = sms.jarque_bera(results.resid)\n",
"name = [\"Chi^2\", \"Two-tail probability\"]\n",
"test = sms.omni_normtest(results.resid)\n",
"lzip(name, test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Omni test:"
"Jarque-Bera test:\n",
"\n",
"The kurtosis reported below is the sample kurtosis, not the excess kurtosis. A normal distribution has a kurtosis of 3, so residuals drawn from one should have a sample kurtosis close to 3.\n"
]
},
{
@@ -95,18 +97,18 @@
"metadata": {},
"outputs": [],
"source": [
"name = [\"Chi^2\", \"Two-tail probability\"]\n",
"test = sms.omni_normtest(results.resid)\n",
"name = [\"Jarque-Bera test\", \"Chi^2 two-tail prob.\", \"Skew\", \"Kurtosis\"]\n",
"test = sms.jarque_bera(results.resid)\n",
"lzip(name, test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Influence tests\n",
"## Multicollinearity\n",
"\n",
"Once created, an object of class ``OLSInfluence`` holds attributes and methods that allow users to assess the influence of each observation. For example, we can compute and extract the first few rows of DFbetas by:"
"Condition number:\n"
]
},
{
@@ -115,19 +117,20 @@
"metadata": {},
"outputs": [],
"source": [
"from statsmodels.stats.outliers_influence import OLSInfluence\n",
"\n",
"test_class = OLSInfluence(results)\n",
"test_class.dfbetas[:5, :]"
"name = [\"Condition Number\"]\n",
"test = [np.linalg.cond(results.model.exog)]\n",
"lzip(name, test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Explore other options by typing ``dir(test_class)``\n",
"## Autocorrelation\n",
"\n",
"Durbin-Watson test:\n",
"\n",
"Useful information on leverage can also be plotted:"
"The DW statistic always ranges from 0 to 4. The closer it is to 2, the less autocorrelation there is in the sample.\n"
]
},
{
@@ -136,26 +139,57 @@
"metadata": {},
"outputs": [],
"source": [
"from statsmodels.graphics.regressionplots import plot_leverage_resid2\n",
"\n",
"fig, ax = plt.subplots(figsize=(8, 6))\n",
"fig = plot_leverage_resid2(results, ax=ax)"
"name = [\"Durbin-Watson statistic\"]\n",
"test = [sms.durbin_watson(results.resid)]\n",
"lzip(name, test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other plotting options can be found on the [Graphics page.](https://www.statsmodels.org/stable/graphics.html)"
"Breusch–Godfrey test for serial correlation:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"name = [\"Breusch-Godfrey Lagrange multiplier test statistic\", \"p-value\", \"f-value\", \"f p-value\"]\n",
"test = sms.acorr_breusch_godfrey(results)\n",
"lzip(name, test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multicollinearity\n",
"## Influence tests\n",
"\n",
"Once created, an object of class `OLSInfluence` holds attributes and methods that allow users to assess the influence of each observation. For example, we can compute and extract the first few rows of DFbetas by:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from statsmodels.stats.outliers_influence import OLSInfluence\n",
"\n",
"test_class = OLSInfluence(results)\n",
"test_class.dfbetas[:5, :]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Explore other options by typing `dir(test_class)`\n",
"\n",
"Condition number:"
"Useful information on leverage can also be plotted:\n"
]
},
{
@@ -164,7 +198,17 @@
"metadata": {},
"outputs": [],
"source": [
"np.linalg.cond(results.model.exog)"
"from statsmodels.graphics.regressionplots import plot_leverage_resid2\n",
"\n",
"fig, ax = plt.subplots(figsize=(8, 6))\n",
"fig = plot_leverage_resid2(results, ax=ax)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other plotting options can be found on the [Graphics page.](https://www.statsmodels.org/stable/graphics.html)\n"
]
},
{
@@ -173,7 +217,7 @@
"source": [
"## Heteroskedasticity tests\n",
"\n",
"Breusch-Pagan test:"
"Breusch-Pagan test:\n"
]
},
{
@@ -191,7 +235,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Goldfeld-Quandt test:"
"Goldfeld-Quandt test:\n"
]
},
{
@@ -211,7 +255,7 @@
"source": [
"## Linearity\n",
"\n",
"Harvey-Collier multiplier test for the null hypothesis that the linear specification is correct:"
"Harvey-Collier multiplier test for the null hypothesis that the linear specification is correct:\n"
]
},
{
@@ -242,7 +286,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.13"
}
},
"nbformat": 4,
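
The Durbin-Watson note in this diff (the statistic lies between 0 and 4, and values near 2 indicate little autocorrelation) can be illustrated with a minimal pure-Python sketch. The helper `durbin_watson_stat` and the toy residual series are hypothetical and are not part of the notebook. The formula is the textbook definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals, which `sms.durbin_watson` is assumed to follow.

```python
# Sketch of the Durbin-Watson statistic:
#   DW = sum((e_t - e_{t-1})**2) / sum(e_t**2)
# The helper name and the toy residuals are illustrative only.

def durbin_watson_stat(resid):
    """Return the Durbin-Watson statistic for a sequence of residuals."""
    numerator = sum((b - a) ** 2 for a, b in zip(resid, resid[1:]))
    denominator = sum(e ** 2 for e in resid)
    return numerator / denominator

# Residuals that flip sign at every step: strong negative autocorrelation.
alternating = [1.0, -1.0] * 50
# A long positive run followed by a long negative run: strong positive autocorrelation.
persistent = [1.0] * 50 + [-1.0] * 50

dw_alternating = durbin_watson_stat(alternating)  # close to 4
dw_persistent = durbin_watson_stat(persistent)    # close to 0
print(dw_alternating, dw_persistent)
```

For long series the statistic is approximately 2(1 - r), where r is the lag-1 autocorrelation of the residuals, which is why uncorrelated residuals land near 2.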

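
The kurtosis note in the Jarque-Bera cell distinguishes sample kurtosis from excess kurtosis. The sketch below shows the difference, assuming the reported value is the moment-based ratio m4 / m2**2 (fourth central moment over the squared second central moment). The function `sample_kurtosis` and both data sets are hypothetical and serve only as an illustration.

```python
import random

def sample_kurtosis(values):
    """Moment-based kurtosis m4 / m2**2; a normal distribution has the value 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2

rng = random.Random(0)  # fixed seed so the run is repeatable
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]
two_point = [-1.0, 1.0] * 50  # symmetric two-point data, much flatter than normal

k_normal = sample_kurtosis(normal_draws)
k_two_point = sample_kurtosis(two_point)

# Excess kurtosis is the same quantity shifted so that the normal case is 0.
print("normal draws:", k_normal, "excess:", k_normal - 3)
print("two-point data:", k_two_point, "excess:", k_two_point - 3)
```

Reading the Jarque-Bera output against 3 rather than 0 avoids mistaking normal-looking residuals for heavy-tailed ones.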