Automatic Differentiation

MITgcm's tangent linear and adjoint model

Automatic differentiation (AD), also referred to as algorithmic (or, more loosely, computational) differentiation, is a technology for automatically augmenting computer programs, including arbitrarily complex simulations, with statements for the computation of derivatives, also known as sensitivities. AD tools in our context provide source-to-source transformation, according to a set of linguistic and mathematical rules, of a function (a nonlinear prognostic model), given as computer code, to generate efficient and accurate (truncation-free) readable code for computing derivatives of the given function. Thus, unlike a pure source-to-source translation, the output program represents a new algorithm, such as the evaluation of the Jacobian, the Hessian, or higher derivative operators. In principle, a variety of derived algorithms can be generated automatically in this way.

In the mid-1990s, groups in the CMI at MIT, SIO, JPL and GFDL began to apply AD tools to generate tangent linear and adjoint code for ocean circulation and climate studies. The tools used were the Tangent linear and Adjoint Model Compiler (TAMC) and its commercial successor, Transformation of Algorithms in Fortran (TAF), developed by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99}, \cite{gie:00a}), for which the MITgcm code has been adapted.

TAF exploits the chain rule to compute the first derivative of a function with respect to a set of input variables. Treating a given forward code as a composition of operations, with each line representing one compositional element, TAF applies the chain rule rigorously to the code, line by line. The resulting tangent linear or adjoint code may then be thought of as the composition, in forward or reverse order respectively, of the Jacobian matrices of the forward code's compositional elements.
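The line-by-line application of the chain rule can be illustrated with a minimal hand-written sketch. It is not TAF output and does not use MITgcm code: the three-statement function `forward` and the routine names `tangent_linear` and `adjoint` are hypothetical, chosen only to show how the tangent linear code follows the forward statement order while the adjoint code visits the same statements in reverse.

```python
import math

def forward(x):
    """Hypothetical forward code: three statements, each one compositional element."""
    a = x[0] * x[1]
    b = math.sin(a)
    c = b + x[0] ** 2
    return c

def tangent_linear(x, dx):
    """Propagate an input perturbation dx through the statements in forward order."""
    a = x[0] * x[1]
    da = dx[0] * x[1] + x[0] * dx[1]   # derivative of a = x0 * x1
    db = math.cos(a) * da              # derivative of b = sin(a)
    dc = db + 2.0 * x[0] * dx[0]       # derivative of c = b + x0**2
    return dc

def adjoint(x, adc):
    """Propagate an output sensitivity adc through the statements in reverse order."""
    a = x[0] * x[1]                    # recompute the intermediate needed below
    adx = [0.0, 0.0]
    # adjoint of c = b + x0**2
    adb = adc
    adx[0] += 2.0 * x[0] * adc
    # adjoint of b = sin(a)
    ada = math.cos(a) * adb
    # adjoint of a = x0 * x1
    adx[0] += x[1] * ada
    adx[1] += x[0] * ada
    return adx
```

For any perturbation `dx`, `tangent_linear(x, dx)` should agree with the inner product of `adjoint(x, 1.0)` and `dx`, which is the usual consistency check between the two derivative codes. One adjoint call returns the sensitivity to every input at once, whereas the tangent linear code needs one call per input direction.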

This system has been used, and continues to be applied, to study five broad classes of problems:
parameter sensitivity of the climate system
initial and boundary value sensitivity
global ocean state estimation
optimal observing design studies
singular vector / optimal perturbation studies

Required AD tool features

The generation of efficient tangent linear and adjoint code for the MITgcm has pushed the limits of AD. It makes extensive use of advanced AD tool features, and has driven, in the case of TAF, AD tool development and improvement. Among the required features are:
handling of nonlinearities, switches, etc.
balancing of storing vs. re-computation via user directives
n-level check-pointing (here, n=3)
flow directives for substituting/providing hand-written derivative routines into the AD-generated derivative code, in particular related to the WRAPPER
exploitation of self-adjointness
inclusion of the Message Passing Interface (MPI) library for code execution in parallel environments
adjoint dump & restart, requiring the definition of the full adjoint state
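The trade-off between storing and recomputing that underlies check-pointing can be sketched as follows. This is a simplified two-level illustration, not the three-level scheme mentioned above and not MITgcm or TAF code; the scalar model `step`, its hand-written adjoint `step_adjoint`, and the routine names are all hypothetical.

```python
def step(x, dt=0.1):
    """Hypothetical nonlinear time step (logistic growth)."""
    return x + dt * x * (1.0 - x)

def step_adjoint(x, adx_next, dt=0.1):
    """Adjoint of one step; needs the forward state x at the start of that step."""
    return (1.0 + dt * (1.0 - 2.0 * x)) * adx_next

def gradient_store_all(x0, nsteps):
    """Reference: store the whole trajectory, then sweep backwards."""
    traj = [x0]
    for _ in range(nsteps):
        traj.append(step(traj[-1]))
    adx = 1.0  # sensitivity of the final state to itself
    for n in reversed(range(nsteps)):
        adx = step_adjoint(traj[n], adx)
    return adx

def gradient_checkpointed(x0, n_outer, n_inner):
    """Two-level scheme: store one state per outer block, recompute inside blocks."""
    checkpoints = []
    x = x0
    for _ in range(n_outer):
        checkpoints.append(x)          # state at the start of each outer block
        for _ in range(n_inner):
            x = step(x)
    adx = 1.0
    for x_start in reversed(checkpoints):
        # Recompute the states of this block from its checkpoint.
        block = [x_start]
        for _ in range(n_inner - 1):
            block.append(step(block[-1]))
        # Adjoint sweep over the block in reverse order.
        for x_n in reversed(block):
            adx = step_adjoint(x_n, adx)
    return adx
```

In this sketch the checkpointed variant holds at most `n_outer + n_inner` states at a time instead of `n_outer * n_inner`, at the price of roughly one additional forward sweep. Adding further levels repeats the same idea recursively within each block.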

ADM and TLM maintenance and improvement

Some key advantages of AD for derivative code generation are:
up-to-date maintenance of derivative code along with the ongoing forward code development, as is the case for MITgcm; at the time of writing, the available ADM and TLM are based on code that lags the latest available MITgcm checkpoint by only 3 weeks.
incremental modifications of the forward code (rather than extensive rewriting) when new features are added (new 'physics' packages, modified/improved numerics and dynamics, etc.); as an example, the inclusion of the adjoint of the Gent/McWilliams parameterization scheme (GM/Redi) requires the adjoint state to propagate through vertical density gradients, thus modifying the flow of adjoint sensitivities compared to a configuration without GM/Redi.

The benefits of such a strategy are illustrated in the context of two directions of the ongoing model development:

Atmospheric code development

As described in \cite{mars-eta:03,adc-eta:03}, pressure vs. z-level coordinate isomorphisms are being exploited to enable the MITgcm to run as an atmospheric model. Because the atmospheric configuration relies on the same hydrodynamical kernel for which an adjoint model is already available, an adjoint model for the atmosphere follows readily. Thus, an adjoint model for a Held-Suarez type configuration, representing a basic atmospheric dynamics setup for the Atmospheric Model Intercomparison Project (AMIP), is available and is currently being used for sensitivity studies and entropy maximization studies.

Coupled ocean - sea ice model

A new sea ice model has recently been implemented \cite{men-zha:03}, closing an important gap for global ocean modeling and state estimation, enabling Arctic modeling studies, and allowing a large number of hitherto unexplored sea ice data to be assimilated into the MITgcm. The development of the adjoint of the sea ice model has greatly benefited from the availability of the adjoint of a bulk formula package, based on the \cite{lar-pon:82} implementation, that computes air-sea fluxes from atmospheric fields. The existing bulk formula implementation provided a natural link to the sea ice model and its adjoint.