
Commit db56be6

docs(usage): Finish parametric derivatives user guide
1 parent 4457c0c commit db56be6

File tree

4 files changed: 63 additions, 19 deletions


docs/images/ug_derivative_fd.png

22.9 KB

docs/usage/derivatives.md

Lines changed: 60 additions & 7 deletions
@@ -10,15 +10,15 @@ under the constraint that $\mathbf{M}$ is pointwise s.p.d.
1010
In the following, we consider the scenario of having some loss functional
1111
$l: \mathbb{R}^{N_V} \to \mathbb{R},\ l = l(\mathbf{u})$, depending on the solution $\mathbf{u}(\mathbf{M})$
1212
of the eikonal equation for a specific tensor field $\mathbf{M}(\mathbf{m})$. In various problem settings,
13-
such as minimization of the loss, it is essential to be able to obtain gradients w.r.t. the input parameter
13+
such as minimization of the loss, it is essential to be able to obtain the gradient $\mathbf{g}$ w.r.t. the input parameter
1414
vector,
1515

1616
$$
17-
g(\mathbf{m}) = \frac{d l(\mathbf{u})}{d\mathbf{u}}\frac{d\mathbf{u}(\mathbf{M})}{d\mathbf{M}}\frac{d\mathbf{M}(\mathbf{m})}{d\mathbf{m}}.
17+
\mathbf{g}(\mathbf{m}) = \frac{d l(\mathbf{u})}{d\mathbf{u}}\frac{d\mathbf{u}(\mathbf{M})}{d\mathbf{M}}\frac{d\mathbf{M}(\mathbf{m})}{d\mathbf{m}}.
1818
$$
1919

2020
This is the scenario we cover in this tutorial. Eikonax follows a *discretize-then-optimize* approach
21-
to computing the gradient. Moreover, it efficiently computes discrete adjoints by exploitsing the causality in the
21+
to computing the gradient. Moreover, it efficiently computes discrete adjoints by exploiting the causality in the
2222
forward solution of the eikonal equation. A detailed description of this procedure is given
2323
[here][eikonax.derivator.DerivativeSolver].
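As a concrete, purely illustrative example of such a loss (not part of this guide's code), one may think of a least-squares misfit to observed arrival times $\mathbf{u}^{\mathrm{obs}}$,

$$
l(\mathbf{u}) = \frac{1}{2}\sum_{i=1}^{N_V} \big(u_i - u_i^{\mathrm{obs}}\big)^2, \qquad
\frac{dl}{d\mathbf{u}} = \big(\mathbf{u} - \mathbf{u}^{\mathrm{obs}}\big)^T.
$$

The remaining factors in the chain rule above are assembled by Eikonax, as described in the following.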
2424

@@ -101,7 +101,18 @@ solution = eikonax_solver.run(parameter_field)
101101

102102
## Partial Derivatives
103103

104-
Evaluating the gradient $g(\mathbf{m})$ is a two-step procedure in Eikonax.
104+
Evaluating the gradient $\mathbf{g}(\mathbf{m})$ is a two-step procedure in Eikonax. First, for a
105+
given parameter vector $\mathbf{m}$ and associated solution $\mathbf{u}$, we evaluate the
106+
partial derivatives $\mathbf{G}_u$ and $\mathbf{G}_M$ of the global update operator in the
107+
[iterative solver][eikonax.solver.Solver]. This can be done with the
108+
[`PartialDerivator`][eikonax.derivator.PartialDerivator] object. Its configuration via
109+
[`PartialDerivatorData`][eikonax.derivator.PartialDerivatorData] is analogous to that of the
110+
[forward solver](./solve.md).
111+
112+
!!! note
113+
First derivatives do not require a differentiable transformation for the update parameter $\lambda$.
114+
Second derivatives do, however.
115+
105116
```py
106117
from eikonax import derivator
107118

@@ -113,27 +124,69 @@ derivator_data = derivator.PartialDerivatorData(
113124
eikonax_derivator = derivator.PartialDerivator(mesh_data, derivator_data, initial_sites)
114125
```
115126

127+
We can obtain sparse representations (exploiting local dependencies) of the partial derivatives via
128+
the [`compute_partial_derivatives`][eikonax.derivator.PartialDerivator.compute_partial_derivatives]
129+
method,
116130
```py
117131
sparse_partial_solution, sparse_partial_tensor = \
118132
eikonax_derivator.compute_partial_derivatives(solution.values, parameter_field)
119133
```
120-
134+
$\mathbf{G}_u$ and $\mathbf{G}_M$ are both returned as intermediate representations in the form of
135+
tuples of vectors.
136+
Lastly, we compute $\mathbf{G}_m$ via the chain rule, using the tensor field derivative
137+
$\frac{d\mathbf{M}}{d\mathbf{m}}$. This is handled by the tensor field component through the
138+
[`assemble_jacobian`][eikonax.tensorfield.TensorField.assemble_jacobian] method,
121139
```py
122140
sparse_partial_parameter = \
123141
tensor_field.assemble_jacobian(solution.values.size, sparse_partial_tensor, parameter_vector)
124142
```
143+
This operation returns a scipy [`coo_matrix`](https://scipy.github.io/devdocs/reference/generated/scipy.sparse.coo_matrix.html)
144+
which is readily usable for further algebraic operations.
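As a minimal illustration of such downstream use (not part of the user guide's code, and only assuming the `sparse_partial_parameter` object from the snippet above), the matrix can be converted to other SciPy sparse formats and inspected:

```py
# Convert the COO representation to CSR for fast products with vectors,
# then inspect its dimensions and sparsity.
jacobian_wrt_parameters = sparse_partial_parameter.tocsr()
print(jacobian_wrt_parameters.shape)  # rows: solution values, columns: parameters
print(jacobian_wrt_parameters.nnz)    # number of stored nonzero entries
```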
125145

126146

127147
## Derivative Solver
128-
148+
With the partial derivatives, we can now set up a sparse, triangular equation system for computing discrete adjoints,
149+
and subsequently the gradient $\mathbf{g}$. The rationale behind this procedure is explained in more detail
150+
[here][eikonax.derivator.DerivativeSolver]. We set up the solver for the equation system with the
151+
[`DerivativeSolver`][eikonax.derivator.DerivativeSolver] component,
129152
```py
130153
derivative_solver = derivator.DerivativeSolver(solution.values, sparse_partial_solution)
131154
```
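Schematically, writing the converged solution as a fixed point $\mathbf{u} = \mathbf{G}(\mathbf{u}, \mathbf{M})$ of the global update operator (so that $\mathbf{G}_u$ and $\mathbf{G}_M$ are its partial derivatives), the standard discrete-adjoint argument leads to an adjoint system of the form

$$
(\mathbf{I} - \mathbf{G}_u)^T \boldsymbol{\lambda} = \Big(\frac{dl}{d\mathbf{u}}\Big)^T,
$$

where the causality of the eikonal solution renders the system matrix triangular. This is only a schematic sketch; the exact formulation employed by Eikonax is documented [here][eikonax.derivator.DerivativeSolver].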
132155

156+
We can now evaluate the discrete adjoint from a given loss gradient $\frac{dl}{d\mathbf{u}}$,
133157
```py
134158
loss_grad = np.ones(solution.values.size)
135159
adjoint = derivative_solver.solve(loss_grad)
160+
```
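Note that the all-ones vector used here corresponds to the purely illustrative loss $l(\mathbf{u}) = \sum_i u_i$; in an application, the gradient of the actual loss functional enters at this point.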
161+
162+
We then obtain the gradient by multiplying the adjoint with the transpose of $\mathbf{G}_m$,
163+
164+
```py
136165
total_grad = sparse_partial_parameter.T @ adjoint
137166
```
138167

139-
## Comparison to Finite Differences
168+
!!! tip "Reusability of the gradient computation"
169+
Together, the derivative solver and $\mathbf{G}_m$ constitute a sparse representation
170+
of the entire parametric Jacobian $\mathbf{J}$ at a point $\mathbf{m}$. Once assembled, arbitrary portions of
171+
the Jacobian can be constructed with negligible cost, even for large systems.
172+
173+
## Comparison to Finite Differences
174+
175+
As a proof of concept, we compare the Jacobian $\mathbf{J}$ at the point $\mathbf{m}$, as computed by Eikonax,
176+
to forward finite differences. With Eikonax, we simply evaluate the gradient with each unit vector
177+
$\mathbf{e}_i,\ i=1,\ldots,N_V$, as the "loss gradient" $\frac{dl}{d\mathbf{u}}$. Each such computation
178+
yields a row $\mathbf{J}_i$ of the Jacobian. With finite differences, we evaluate a column $\mathbf{J}_j$
179+
of the Jacobian as
180+
181+
$$
182+
\mathbf{J}_j = \frac{\partial\mathbf{u}}{\partial m_j} \approx \frac{\mathbf{u}(\mathbf{m} + h\mathbf{e}_j) - \mathbf{u}(\mathbf{m})}{h},
183+
$$
184+
185+
where $j=1,\ldots,N_S$ and $h$ denotes the step width of the finite difference scheme.
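The Eikonax side of this comparison can be scripted directly from the objects assembled above. The following sketch is not part of the user guide's code; it only assumes that `solution`, `derivative_solver`, and `sparse_partial_parameter` from the previous snippets are in scope. The finite difference columns additionally require one forward solve per perturbed parameter vector, which is omitted here.

```py
import numpy as np

# Assemble the full Jacobian row by row: each unit vector e_i acts as the
# "loss gradient" dl/du, so a single adjoint solve yields the row J_i.
num_vertices = solution.values.size
jacobian_rows = []
for i in range(num_vertices):
    unit_vector = np.zeros(num_vertices)
    unit_vector[i] = 1.0
    adjoint = derivative_solver.solve(unit_vector)
    jacobian_rows.append(sparse_partial_parameter.T @ adjoint)
eikonax_jacobian = np.vstack(jacobian_rows)
```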
186+
187+
The figure below shows the difference of the Jacobian matrices in Frobenius norm for $h\in[1\mathrm{e}{-5}, 1\mathrm{e}{-1}]$.
188+
As expected, the error decreases linearly with decreasing $h$ until $h$ reaches roughly the square root of the machine
189+
precision (single precision here, i.e. $\sqrt{\epsilon_{\mathrm{mach}}} \approx 3.5\mathrm{e}{-4}$). Below this threshold, round-off errors dominate and the error increases again.
190+
<figure markdown>
191+
![samples](../images/ug_derivative_fd.png){ width="500" style="display: inline-block" }
192+
</figure>

examples/parametric_derivatives.ipynb

Lines changed: 2 additions & 12 deletions
Large diffs are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ keywords = ["eikonal equation", "automatic differentiation", "GPU acceleration"]
1111
classifiers = [
1212
"Development Status :: 3 - Alpha",
1313
"Operating System :: POSIX :: Linux",
14+
"Programming Language :: Python :: 3",
1415
"Environment :: GPU :: NVIDIA CUDA",
1516
"Intended Audience :: Science/Research",
1617
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
