Bias and Variance Tradeoff

The following equation represents the expected out-of-sample error in terms of ḡ, the 'average function':

E_D[ E_out(g^(D)) ] = E_x[ (ḡ(x) − f(x))² + E_D[ (g^(D)(x) − ḡ(x))² ] ]

The average function ḡ can be interpreted as follows: generate many data sets D_1, …, D_K and apply the learning algorithm to each data set to produce final hypotheses g_1, …, g_K. We can then estimate the average function for any x by ḡ(x) ≈ (1/K) Σ_k g_k(x). Essentially, we are viewing g^(D)(x) as a random variable, with the randomness coming from the randomness in the data set; ḡ(x) = E_D[ g^(D)(x) ] is the expected value of this random variable (for a particular x), and ḡ is a function, the average function, composed of these expected values.

The term (ḡ(x) − f(x))² measures how much the average function that we would learn using different data sets deviates from the target function f that generated these data sets. This term is called bias,

bias(x) = (ḡ(x) − f(x))²,

as it measures how much our learning model is biased away from the target function. This is because ḡ has the benefit of learning from an unlimited number of data sets, so it is limited only in its ability to approximate f by the limitations of the learning model itself.

The term E_D[ (g^(D)(x) − ḡ(x))² ] is the variance of the random variable g^(D)(x),

var(x) = E_D[ (g^(D)(x) − ḡ(x))² ].

The variance measures the variation in the final hypothesis, depending on the data set. We thus arrive at the bias-variance decomposition of out-of-sample error:

E_D[ E_out(g^(D)) ] = E_x[ bias(x) + var(x) ] = bias + var
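This decomposition can be estimated numerically exactly as the interpretation of ḡ suggests: learn on many data sets, average the hypotheses, and measure bias and variance on a test grid. The sketch below is not the repository's code; it is a minimal Monte Carlo estimator, assuming the target f(x) = sin(πx) used in the experiments below, with a hypothetical helper `estimate_bias_variance`:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Assumed target function for the experiments below: f(x) = sin(pi x).
    return np.sin(np.pi * x)

def estimate_bias_variance(learn, K=10_000, N=2, n_test=1_000):
    """Monte Carlo estimate of bias and variance.

    learn(xs, ys) must return a callable hypothesis g.
    We draw K data sets of size N, learn g_k on each, and evaluate every
    hypothesis on a common test grid to build the average function g_bar.
    """
    x_test = np.linspace(-1.0, 1.0, n_test)
    G = np.empty((K, n_test))                 # row k holds g_k on the grid
    for k in range(K):
        xs = rng.uniform(-1.0, 1.0, size=N)   # data set D_k
        g = learn(xs, f(xs))
        G[k] = g(x_test)
    g_bar = G.mean(axis=0)                    # average function g_bar(x)
    bias = np.mean((g_bar - f(x_test)) ** 2)  # E_x[(g_bar(x) - f(x))^2]
    var = np.mean(G.var(axis=0))              # E_x[E_D[(g(x) - g_bar(x))^2]]
    return bias, var
```

Passing a learner for the constant model H0 should recover roughly bias ≈ 0.5 and var ≈ 0.25, the values reported in the Learning from Data text for this example.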

Bias_variance_fx_b.py

Consider the target function f(x) = sin(πx) and a data set of size N = 2. We sample x uniformly in [-1, 1] to generate a data set (x1, y1), (x2, y2), with yi = f(xi).

Fit the model using:

H0: set of all lines of the form h(x) = b

For H0, we choose the constant hypothesis that best fits the data (the horizontal line at the midpoint, b = (y1 + y2)/2).
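The repository's script is not reproduced here, but the constant fit can be sketched in a few lines, assuming the target f(x) = sin(πx) and a hypothetical helper `learn_constant`:

```python
import numpy as np

rng = np.random.default_rng(1)

def learn_constant(xs, ys):
    # Least-squares constant fit: the horizontal line at the midpoint
    # of the two y-values, b = (y1 + y2) / 2.
    b = ys.mean()
    return lambda x: np.full_like(x, b)

# One data set of size N = 2 drawn from f(x) = sin(pi x).
xs = rng.uniform(-1.0, 1.0, size=2)
ys = np.sin(np.pi * xs)
g = learn_constant(xs, ys)
```

The returned hypothesis is the same height b everywhere, so it changes only moderately from data set to data set.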

Bias_variance_fx_ax_b.py

Consider the target function f(x) = sin(πx) and a data set of size N = 2. We sample x uniformly in [-1, 1] to generate a data set (x1, y1), (x2, y2), with yi = f(xi).

Fit the model using:

H1: set of all lines of the form h(x) = ax + b

With H1, the learned hypothesis is wilder and varies extensively depending on the data set: fitting a line through only two points yields a lower bias but a much larger variance than the constant model.
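Again, a minimal sketch rather than the repository's actual script, assuming the target f(x) = sin(πx) and a hypothetical helper `learn_line`: with N = 2 distinct points, the least-squares member of H1 is simply the line through both points.

```python
import numpy as np

rng = np.random.default_rng(2)

def learn_line(xs, ys):
    # With two distinct points, the least-squares line h(x) = a x + b
    # is the line that passes through both of them.
    a, b = np.polyfit(xs, ys, deg=1)
    return lambda x: a * x + b

# One data set of size N = 2 drawn from f(x) = sin(pi x).
xs = rng.uniform(-1.0, 1.0, size=2)
ys = np.sin(np.pi * xs)
g = learn_line(xs, ys)
```

Because the line interpolates the two sampled points exactly, its slope and intercept swing wildly as the points change, which is the large variance described above.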

  • Abu-Mostafa, Y. S., Magdon-Ismail, M., & Lin, H. T. (2012). Learning from Data. AMLBook.
