@@ -10,15 +10,15 @@ under the constraint that $\mathbf{M}$ is pointwise s.p.d.
In the following, we consider the scenario of having some loss functional
$l: \mathbb{R}^{N_V} \to \mathbb{R},\ l = l(\mathbf{u})$, depending on the solution $\mathbf{u}(\mathbf{M})$
of the eikonal equation for a specific tensor field $\mathbf{M}(\mathbf{m})$. In various problem settings,
- such as minimization of the loss, it is essential to be able to obtain gradients w.r.t. the input parameter
+ such as minimization of the loss, it is essential to be able to obtain the gradient $\mathbf{g}$ w.r.t. the input parameter
vector,

$$
- g (\mathbf{m}) = \frac{d l(\mathbf{u})}{d\mathbf{u}}\frac{d\mathbf{u}(\mathbf{M})}{d\mathbf{M}}\frac{d\mathbf{M}(\mathbf{m})}{d\mathbf{m}}.
+ \mathbf{g} (\mathbf{m}) = \frac{d l(\mathbf{u})}{d\mathbf{u}}\frac{d\mathbf{u}(\mathbf{M})}{d\mathbf{M}}\frac{d\mathbf{M}(\mathbf{m})}{d\mathbf{m}}.
$$

This is the scenario we cover in this tutorial. Eikonax follows a *discretize-then-optimize* approach
- to computing the gradient. Moreover, it efficiently computes discrete adjoints by exploitsing the causality in the
+ to computing the gradient. Moreover, it efficiently computes discrete adjoints by exploiting the causality in the
forward solution of the eikonal equation. A detailed description of this procedure is given
[here][eikonax.derivator.DerivativeSolver].
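+
+ To make the role of the adjoint explicit, the same chain rule can also be read in transposed form,
+ evaluated from right to left starting from the loss gradient (a generic restatement of the formula
+ above, not an additional Eikonax-specific quantity),
+
+ $$
+ \mathbf{g}(\mathbf{m})^T = \left(\frac{d\mathbf{M}(\mathbf{m})}{d\mathbf{m}}\right)^T\left(\frac{d\mathbf{u}(\mathbf{M})}{d\mathbf{M}}\right)^T\left(\frac{d l(\mathbf{u})}{d\mathbf{u}}\right)^T.
+ $$
+
+ Evaluated in this order, only matrix-vector products with transposed Jacobians are required, which is
+ precisely what the discrete adjoint machinery below provides.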
@@ -101,7 +101,18 @@ solution = eikonax_solver.run(parameter_field)
## Partial Derivatives

- Evaluating the gradient $g(\mathbf{m})$ is a two-step procedure in Eikonax.
+ Evaluating the gradient $\mathbf{g}(\mathbf{m})$ is a two-step procedure in Eikonax. First, for a
+ given parameter vector $\mathbf{m}$ and the associated solution $\mathbf{u}$, we evaluate the
+ partial derivatives $\mathbf{G}_u$ and $\mathbf{G}_M$ of the global update operator in the
+ [iterative solver][eikonax.solver.Solver]. This can be done with the
+ [`PartialDerivator`][eikonax.derivator.PartialDerivator] object. Its configuration in
+ [`PartialDerivatorData`][eikonax.derivator.PartialDerivatorData] is analogous to that of the
+ [forward solver](./solve.md).
+
+ !!! note
+     First derivatives do not require a differentiable transformation for the update parameter $\lambda$.
+     Second derivatives do, however.
+
```py
from eikonax import derivator

@@ -113,27 +124,69 @@ derivator_data = derivator.PartialDerivatorData(
eikonax_derivator = derivator.PartialDerivator(mesh_data, derivator_data, initial_sites)
```
+ We can obtain sparse representations (through local dependencies) of the partial derivatives via
+ the [`compute_partial_derivatives`][eikonax.derivator.PartialDerivator.compute_partial_derivatives]
+ method,
```py
sparse_partial_solution, sparse_partial_tensor = \
    eikonax_derivator.compute_partial_derivatives(solution.values, parameter_field)
```
-
+ $\mathbf{G}_u$ and $\mathbf{G}_M$ are both returned as intermediate representations in the form of
+ tuples of vectors.
+ Lastly, we compute $\mathbf{G}_m$ via the chain rule, using the tensor field derivative
+ $\frac{d\mathbf{M}}{d\mathbf{m}}$. This is handled by the tensor field component through the
+ [`assemble_jacobian`][eikonax.tensorfield.TensorField.assemble_jacobian] method,
```py
sparse_partial_parameter = \
    tensor_field.assemble_jacobian(solution.values.size, sparse_partial_tensor, parameter_vector)
```
+ This operation returns a scipy [`coo_matrix`](https://scipy.github.io/devdocs/reference/generated/scipy.sparse.coo_matrix.html)
+ which is readily usable for further algebraic operations.
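+
+ As a minimal usage sketch (reusing the names from the snippets above; the row/column convention in the
+ comment is an assumption), the returned matrix behaves like any other scipy sparse matrix:
+
+ ```py
+ # G_m as assembled above; assumed shape: one row per vertex, one column per parameter
+ print(sparse_partial_parameter.shape, sparse_partial_parameter.nnz)
+ # convert to CSR if repeated matrix-vector products are required
+ sparse_partial_parameter_csr = sparse_partial_parameter.tocsr()
+ ```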
## Derivative Solver
-
+ With the partial derivatives, we can now set up a sparse, triangular equation system for computing discrete adjoints,
+ and subsequently the gradient $\mathbf{g}$. The rationale behind this procedure is explained in more detail
+ [here][eikonax.derivator.DerivativeSolver]. We set up the solver for the equation system with the
+ [`DerivativeSolver`][eikonax.derivator.DerivativeSolver] component,
```py
derivative_solver = derivator.DerivativeSolver(solution.values, sparse_partial_solution)
```
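+
+ Schematically, for a fixed point $\mathbf{u} = \mathbf{G}(\mathbf{u}, \mathbf{M})$ of a global update operator
+ $\mathbf{G}$, one common way to write such an adjoint construction is
+
+ $$
+ (\mathbf{I} - \mathbf{G}_u)^T \mathbf{w} = \left(\frac{dl}{d\mathbf{u}}\right)^T, \qquad \mathbf{g} = \mathbf{G}_m^T \mathbf{w},
+ $$
+
+ where the causality of the forward solve renders the system matrix triangular under a suitable ordering of the
+ vertices. This is only a sketch of the general principle; the exact system Eikonax assembles is documented in the
+ derivative solver description linked above.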
+ We can now evaluate the discrete adjoint from a given loss gradient $\frac{dl}{d\mathbf{u}}$,
```py
loss_grad = np.ones(solution.values.size)
adjoint = derivative_solver.solve(loss_grad)
+ ```
+
+ We then obtain the gradient by simply multiplying the adjoint with $\mathbf{G}_m$,
+
+ ```py
total_grad = sparse_partial_parameter.T @ adjoint
```
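+
+ For illustration only, the resulting gradient can be consumed like any other NumPy vector, e.g. in a plain
+ gradient descent step (a sketch; the step size is an arbitrary assumption, and `parameter_vector` is the
+ parameter array from the snippets above):
+
+ ```py
+ step_size = 1e-2  # assumed value, problem-dependent
+ parameter_vector = parameter_vector - step_size * total_grad
+ ```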
- ## Comparison to Finite Differences
+ !!! tip "Reusability of the gradient computation"
+     The setup of the derivative solver and $\mathbf{G}_m$ basically constitutes a sparse representation
+     of the entire parametric Jacobian $\mathbf{J}$ at a point $\mathbf{m}$. Once assembled, arbitrary portions of
+     the Jacobian can be constructed with negligible cost, even for large systems.
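+
+ As a sketch of this reusability (reusing the objects from the snippets above; the vertex index is an
+ arbitrary choice), a single row $\mathbf{J}_i$ of the Jacobian follows from one additional triangular solve:
+
+ ```py
+ import numpy as np
+
+ vertex_index = 0  # arbitrary vertex for illustration
+ unit_vector = np.zeros(solution.values.size)
+ unit_vector[vertex_index] = 1.0
+ adjoint_i = derivative_solver.solve(unit_vector)
+ jacobian_row = sparse_partial_parameter.T @ adjoint_i
+ ```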
+
+ ## Comparison to Finite Differences
+
+ As a proof-of-concept, we compare the Jacobian $\mathbf{J}$ at the point $\mathbf{m}$ produced by Eikonax
+ to forward finite differences. With Eikonax, we simply evaluate the gradient with each unit vector
+ $\mathbf{e}_i,\ i=1,\ldots,N_V$, as the "loss gradient" $\frac{dl}{d\mathbf{u}}$. Each such computation
+ yields a row $\mathbf{J}_i$ of the Jacobian. With finite differences, we evaluate a column $\mathbf{J}_j$
+ of the Jacobian as
+
+ $$
+ \mathbf{J}_j = \frac{\partial\mathbf{u}}{\partial m_j} \approx \frac{\mathbf{u}(\mathbf{m} + h\mathbf{e}_j) - \mathbf{u}(\mathbf{m})}{h},
+ $$
+
+ where $j=1,\ldots,N_S$ and $h$ denotes the step width of the finite difference scheme.
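+
+ A minimal sketch of this check (assuming a hypothetical helper `solve_eikonal` that maps a parameter
+ vector to the solution vector $\mathbf{u}(\mathbf{m})$, e.g. by assembling the tensor field and running
+ the forward solver as shown earlier):
+
+ ```py
+ import numpy as np
+
+ def finite_difference_jacobian(solve_eikonal, parameters, step_width):
+     """Assemble the forward finite difference Jacobian column by column."""
+     base_solution = solve_eikonal(parameters)
+     columns = []
+     for j in range(parameters.size):
+         perturbed = np.array(parameters, dtype=float)
+         perturbed[j] += step_width
+         columns.append((solve_eikonal(perturbed) - base_solution) / step_width)
+     return np.stack(columns, axis=1)  # column j approximates du/dm_j
+ ```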
+
+ The figure below shows the difference of the Jacobian matrices in the Frobenius norm for $h\in[1\mathrm{e}{-5}, 1\mathrm{e}{-1}]$.
+ As expected, the error decreases linearly with decreasing $h$, down to the square root of the floating point precision
+ (32 bit in this case). Below this threshold, the error increases again due to round-off errors.
+ <figure markdown>
+ ![samples](../images/ug_derivative_fd.png){ width="500" style="display: inline-block" }
+ </figure>