move remaining chapter notebooks into subdirectories
fix broken link to zmp
a little progress on stochastic control notes
RussTedrake committed Apr 25, 2024
1 parent b1f3fb8 commit ffded1c
Showing 14 changed files with 116 additions and 766 deletions.
18 changes: 0 additions & 18 deletions book/BUILD.bazel
@@ -53,11 +53,6 @@ rt_html_test(
rt_html_test(
srcs = ["stochastic.html"],
)
rt_ipynb_test(
name = "stochastic",
srcs = ["stochastic.ipynb"],
deps = ["//underactuated"],
)

rt_html_test(
srcs = ["dp.html"],
@@ -110,11 +105,6 @@ rt_html_test(
rt_html_test(
srcs = ["limit_cycles.html"],
)
rt_ipynb_test(
name = "limit_cycles",
srcs = ["limit_cycles.ipynb"],
deps = ["//underactuated"],
)

rt_html_test(
srcs = ["contact.html"],
@@ -140,14 +130,6 @@ rt_html_test(
srcs = ["multibody.html"],
)

rt_ipynb_test(
name = "multibody",
srcs = ["multibody.ipynb"],
deps = [
requirement("drake"),
],
)

rt_html_test(
srcs = ["optimization.html"],
)
2 changes: 1 addition & 1 deletion book/humanoids.html
@@ -396,7 +396,7 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a

<p>The implementation that we used in the DARPA Robotics Challenge is described in
<elib>Tedrake15</elib> and is available in Drake as <a
href="https://drake.mit.edu/doxygen_cxx/classdrake_1_1systems_1_1controllers_1_1_zmp_planner.html#afbd73e15fe7be53f20440245d9b562ad"><code>ZmpPlanner</code></a>.</p>
href="https://drake.mit.edu/doxygen_cxx/classdrake_1_1planning_1_1_zmp_planner.html"><code>ZmpPlanner</code></a>.</p>

<example><h1>The ZMP Planner</h1>

2 changes: 1 addition & 1 deletion book/humanoids/BUILD.bazel
@@ -23,7 +23,7 @@ rt_ipynb_test(
# deps = ["//underactuated"],
#)

# TODO(russt): Waiting for drake PR #21305 to merge.
# TODO(russt): Waiting for next drake version bump.
#rt_ipynb_test(
# name = "zmp_planner",
# srcs = ["zmp_planner.ipynb"],
5 changes: 4 additions & 1 deletion book/index.html
@@ -251,7 +251,7 @@ <h1>Table of Contents</h1>
<li><a href="stochastic.html">Chapter 6: Model Systems
with Stochasticity</a></li>
<ul>
<li><a href=stochastic.html#section1>The Master Equation</a></li>
<li><a href=stochastic.html#master>The Master Equation</a></li>
<li><a href=stochastic.html#section2>Stationary Distributions</a></li>
<li><a href=stochastic.html#section3>Extended Example: The Rimless Wheel on Rough
Terrain</a></li>
@@ -504,6 +504,9 @@ <h1>Table of Contents</h1>
<li><a href=robust.html#section1>Stochastic models</a></li>
<li><a href=robust.html#section2>Costs and constraints for stochastic systems</a></li>
<li><a href=robust.html#section3>Finite Markov Decision Processes</a></li>
<ul>
<li>Dynamics of a Markov chain</li>
</ul>
<li><a href=robust.html#section4>Linear optimal control</a></li>
<ul>
<li>Stochastic LQR</li>
14 changes: 14 additions & 0 deletions book/limit_cycles/BUILD.bazel
@@ -0,0 +1,14 @@
# -*- mode: python -*-
# vi: set ft=python :

# Copyright 2020-2022 Massachusetts Institute of Technology.
# Licensed under the BSD 3-Clause License. See LICENSE.TXT for details.

load("@pip_deps//:requirements.bzl", "requirement")
load("//book/htmlbook/tools/jupyter:defs.bzl", "rt_ipynb_test")

rt_ipynb_test(
name = "limit_cycles",
srcs = ["limit_cycles.ipynb"],
deps = ["//underactuated"],
)
File renamed without changes.
16 changes: 16 additions & 0 deletions book/multibody/BUILD.bazel
@@ -0,0 +1,16 @@
# -*- mode: python -*-
# vi: set ft=python :

# Copyright 2020-2022 Massachusetts Institute of Technology.
# Licensed under the BSD 3-Clause License. See LICENSE.TXT for details.

load("@pip_deps//:requirements.bzl", "requirement")
load("//book/htmlbook/tools/jupyter:defs.bzl", "rt_ipynb_test")

rt_ipynb_test(
name = "multibody",
srcs = ["multibody.ipynb"],
deps = [
requirement("drake"),
],
)
File renamed without changes.
65 changes: 57 additions & 8 deletions book/robust.html
@@ -213,19 +213,68 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a

<section><h1>Finite Markov Decision Processes</h1>

<p>The Bellman equation.</p>

<p>Discounted cost. Infinite-horizon average cost.</p>
<todo>Finalize if I want to use p_n(s) or \Pr_n(s) here, and use it throughout the
notes.</todo>

<p>We already had a quick preview of stochastic optimal control in one of the cases
where it is particularly easy: <a href="dp.html#mdp">finite Markov Decision
Processes (MDPs)</a>.</p>

<p>Recall that in this setting, where the state space and action spaces are finite,
we write the dynamics completely in terms of the transition probabilities
\[p(s[n+1] = s' | s[n] = s, a[n] = a).\] Conveniently, we can represent the
entire state probability distribution as a vector, $p_n(s[n] = s),$ and write the
dynamics as a transition matrix for each action: $$T_{i,j}(a) = p(s_i | s_j, a).$$
The stochastic dynamics of the <a href="stochastic.html#master">master equation</a>
are then given by $$p_{n+1}(s') = {\bf T}(a[n]) p_{n}(s).$$ Take a moment to
appreciate this -- if we make a discrete-state / discrete-action approximation of
even the most complicated and rich nonlinear dynamical system, then the dynamics of
the state probability distribution are defined by an (action-dependent) linear map.
Note that ${\bf T}$ is not an arbitrary matrix of real numbers but rather a (left)
"<a href="https://en.wikipedia.org/wiki/Stochastic_matrix">stochastic matrix</a>":
we always have $T_{ij} \in [0, 1]$ and $\sum_{i}T_{ij} = 1.$ </p>
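
<p>To make the linear-map view concrete, here is a minimal sketch (the states,
actions, and transition probabilities below are toy placeholders, not from the
notes) of storing one left-stochastic matrix per action and propagating the state
distribution with the master equation:</p>

<pre><code class="language-python">
import numpy as np

# A toy 3-state, 2-action MDP (the numbers here are placeholders).
# T[a][i, j] = p(s' = s_i | s = s_j, a), so each T[a] is left stochastic:
# every column sums to one.
T = {
    0: np.array([[1.0, 0.5, 0.0],
                 [0.0, 0.5, 0.5],
                 [0.0, 0.0, 0.5]]),
    1: np.array([[0.2, 0.0, 0.0],
                 [0.8, 0.2, 0.0],
                 [0.0, 0.8, 1.0]]),
}

p = np.array([0.0, 1.0, 0.0])  # all probability mass starts on s_1
for a in [0, 1, 0]:            # any action sequence
    p = T[a] @ p               # master equation: p_{n+1} = T(a[n]) p_n
    print(p, p.sum())          # the total probability mass stays 1
</code></pre>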

<subsection><h1>Dynamics of a Markov chain</h1>

<!-- should this be moved into stochastic.html? -->
<p>For the dynamics of a closed-loop system (e.g. with no action inputs, or with a
fixed policy, $\pi$), the MDP equations reduce back to the simpler form of a Markov
chain: $$p_{n+1}(s') = {\bf T} p_{n}(s).$$ To evaluate the long-term dynamics of a
Markov chain, we have $$p_{n}(s') = {\bf T}^n p_{0}(s).$$ The eigenvalues of a
stochastic matrix are also bounded: $\forall i, |\lambda_i| \le 1.$ Moreover,
it can be shown that every stochastic matrix has at least one eigenvalue of $1$;
any eigenvector corresponding to an eigenvalue of 1 is a <i>stationary
distribution</i> (a fixed point of the master equation) of the Markov chain. If
the Markov chain is "irreducible" and "aperiodic", then the stationary
distribution is unique and the Markov chain converges to it from any initial
condition.</p>
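
<p>As a sketch of how one might find a stationary distribution numerically (again
with a toy matrix that is not from the notes), take any eigenvector of ${\bf T}$
with eigenvalue $1$ and normalize it so it sums to one:</p>

<pre><code class="language-python">
import numpy as np

# A toy left-stochastic matrix (columns sum to one).
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])

evals, evecs = np.linalg.eig(T)
i = np.argmin(np.abs(evals - 1.0))  # eigenvalue (numerically) closest to 1
pi = np.real(evecs[:, i])
pi /= pi.sum()                      # normalize into a probability vector

assert np.allclose(T @ pi, pi)      # a fixed point of the master equation
print(pi)                           # -> [0.667, 0.333]
</code></pre>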

<example><h1>Discretized cubic polynomial w/ noise</h1>

<p>I used the discrete-time approximation of the <a
href="stochastic.html#cubic">cubic polynomial with Gaussian noise</a> as an
example when we were first building our intuition about stochastic dynamics.
Let's now make a finite-state Markov chain approximation of those dynamics, by
discretizing the state space into 100 bins over the domain $x \in [-2, 2].$</p>

<p>Let's examine the eigenvalues of this stochastic transition matrix...</p>
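
<p>As a sketch of how this construction might look (assuming the bistable form
$x[n+1] = x[n] + h\,(x[n] - x[n]^3) + w[n]$ with $w[n] \sim \mathcal{N}(0,
\sigma^2)$; the dynamics and parameters here are placeholders and may differ from
the eventual implementation in the notes):</p>

<pre><code class="language-python">
import numpy as np

n_bins, h, sigma = 100, 0.1, 0.2   # placeholder parameters
x = np.linspace(-2, 2, n_bins)     # bin centers over the domain [-2, 2]
x_next = x + h * (x - x**3)        # deterministic part of the update, per bin

# T[i, j] = Pr(x' in bin i | x in bin j): evaluate the Gaussian transition
# density at the bin centers and normalize each column, so T is left stochastic.
T = np.exp(-((x[:, None] - x_next[None, :]) ** 2) / (2 * sigma**2))
T /= T.sum(axis=0)

eigenvalues = np.linalg.eigvals(T)
print(np.sort(np.abs(eigenvalues))[::-1][:5])  # the leading eigenvalue is 1
</code></pre>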

<todo>Finish coding up the example...</todo>

</example>
<todo>example: discretized cubic polynomial (bistable version) + additive Gaussian noise</todo>

<todo>Metastability</todo>

<todo>example: discretized cubic polynomial negated. Metastability. Rimless wheel on rough terrain.</todo>

</subsection>

<todo>Discounted cost. Infinite-horizon average cost.</todo>

<todo>Perhaps an example of something other than expected-value cost
(worst case, e.g. Katie's metastability?)</todo>

</section>

<section><h1>Linear optimal control</h1>
