Implicit differentiation of root solvers
Implicit function theorem
Given an implicit variable $x$ defined through the root-finding problem $F(x(\theta),\theta)=0$, one can apply the implicit function theorem (IFT) to obtain the gradients of $x$ with respect to $\theta$:
\[\frac{d F}{d \theta} = \frac{\partial F}{\partial x} \frac{d x}{d \theta} + \frac{\partial F}{\partial \theta} = 0 \implies \frac{d x}{d \theta} = - \left(\frac{\partial F}{\partial x}\right)^{-1} \frac{\partial F}{\partial \theta}\]
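As a concrete check of the formula, the scalar sketch below (not part of Clapeyron's API) applies the IFT to $F(x,\theta) = x^2 - \theta$, whose root $x(\theta) = \sqrt{\theta}$ has a known analytical derivative.

```julia
using ForwardDiff

# F(x, θ) = x² - θ = 0 implicitly defines x(θ) = √θ.
F(x, θ) = x^2 - θ

θ = 2.0
x = sqrt(θ)                 # root obtained from any solver

# partial derivatives of F evaluated at the converged root
∂F∂x = ForwardDiff.derivative(x -> F(x, θ), x)   # = 2x
∂F∂θ = ForwardDiff.derivative(θ -> F(x, θ), θ)   # = -1

dxdθ = -∂F∂θ / ∂F∂x         # implicit function theorem
dxdθ ≈ 1 / (2*sqrt(θ))      # matches the analytical derivative of √θ
```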
To avoid unrolling the solver iterations (which would compute a derivative at every iteration and populate the AD stack unnecessarily), Clapeyron currently implements the IFT using ForwardDiff.jl through the manual manipulation of Duals. Only first-order AD is supported at the moment; higher-order derivatives through nested Duals, or Duals with different tags, will error. The current AD implementation is:
Clapeyron.__gradients_for_root_finders — Function
__gradients_for_root_finders(x::AbstractVector{T}, tups::Tuple, tups_primal::Tuple, f::Function) where T<:Real

Computes the gradients of x with respect to the relevant parameters in tups, under the condition that x is implicitly defined through the root-finding problem f(x,tups) = 0. The function uses the implicit function theorem to compute the gradients efficiently through the reconstruction of Duals.
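As an illustration of the approach the docstring describes, the following is a minimal sketch of IFT-based Dual reconstruction for a scalar root problem. The names `implicit_root` and `newton_root` are hypothetical, and the real routine operates on vectors and tuples of parameters; this only shows how the solver iterations can run on plain floats while the returned value carries IFT-derived partials.

```julia
using ForwardDiff
using ForwardDiff: Dual, value, partials

# Hypothetical sketch: run the solver on primal values only, then rebuild
# a Dual whose partials come from the implicit function theorem, so that
# no solver iteration ever enters the AD stack.
function implicit_root(f, x0, θ::Dual{T}) where {T}
    θp = value(θ)                             # strip the Dual down to its primal value
    x  = newton_root(x -> f(x, θp), x0)       # plain floating-point iterations
    ∂f∂x = ForwardDiff.derivative(x -> f(x, θp), x)
    ∂f∂θ = ForwardDiff.derivative(θ -> f(x, θ), θp)
    dxdθ = -∂f∂θ / ∂f∂x                       # implicit function theorem
    return Dual{T}(x, dxdθ * partials(θ))     # reconstruct the Dual with chain-ruled partials
end

# plain Newton iteration on floats (hypothetical helper)
function newton_root(g, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = g(x)
        abs(fx) < tol && return x
        x -= fx / ForwardDiff.derivative(g, x)
    end
    return x
end

# d√θ/dθ at θ = 2, without differentiating through the Newton iterations
ForwardDiff.derivative(θ -> implicit_root((x, t) -> x^2 - t, 1.0, θ), 2.0)  # ≈ 0.35355
```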
Clapeyron previously implemented a final Newton step to propagate first-order derivatives, which is likewise limited to first-order AD. Although that method does produce numeric values for nested Duals, the resulting higher-order derivatives are incorrect, because the final step neglects the indirect contributions of the lower-order gradients. For a more complete explanation see this discussion.
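For reference, the sketch below (again with hypothetical names, not Clapeyron's previous code) reproduces the final-Newton-step idea on the same scalar example: the first-order derivative comes out right, while a nested-Dual second derivative produces a numeric but incorrect value.

```julia
using ForwardDiff

# Final-Newton-step trick: one Newton update at the converged root,
# evaluated with derivative-carrying inputs, propagates first-order
# derivatives. ∂F/∂x = 2x is written out analytically for this F.
F(x, θ) = x^2 - θ
final_newton_step(x, θ) = x - F(x, θ) / (2x)

θ = 2.0
x = sqrt(θ)          # converged primal root, carries no Dual information

# first-order derivative is correct: ≈ 1/(2√θ) ≈ 0.35355
ForwardDiff.derivative(t -> final_newton_step(x, t), θ)

# nested Duals return a value (0.0 here), but the true second derivative
# is -1/(4θ^(3/2)) ≈ -0.0884: the indirect dependence of x on θ is lost.
ForwardDiff.derivative(t -> ForwardDiff.derivative(s -> final_newton_step(x, s), t), θ)
```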