Divergence Function, Duality, and Convex Analysis

Abstract

From a smooth, strictly convex function $\Phi: \mathbb{R}^n \to \mathbb{R}$, a parametric family of divergence functions $D_\Phi^{(\alpha)}$ may be introduced: $$D_\Phi^{(\alpha)}(x, y) = \frac{4}{1-\alpha^2}\left[\frac{1-\alpha}{2}\,\Phi(x) + \frac{1+\alpha}{2}\,\Phi(y) - \Phi\!\left(\frac{1-\alpha}{2}\,x + \frac{1+\alpha}{2}\,y\right)\right]$$ for $x, y \in \operatorname{int}\operatorname{dom}(\Phi)$ and for $\alpha \in \mathbb{R}$, with $D_\Phi^{(\pm 1)}$ defined by taking the limit $\alpha \to \pm 1$. Each member is shown to induce an $\alpha$-independent Riemannian metric, as well as a pair of dual $\alpha$-connections, which are generally nonflat, except for $\alpha = \pm 1$. In the latter case, $D_\Phi^{(\pm 1)}$ reduces to the (nonparametric) Bregman divergence, which is representable using $\Phi$ and its convex conjugate $\Phi^*$ and becomes the canonical divergence for dually flat spaces (Amari, 1982, 1985; Amari & Nagaoka, 2000). This formulation based on convex analysis naturally extends the information-geometric interpretation of divergence functions (Eguchi, 1983) to allow the distinction between two different kinds of duality: referential duality ($\alpha \leftrightarrow -\alpha$) and representational duality ($\Phi \leftrightarrow \Phi^*$). When applied to (not necessarily normalized) probability densities, the concept of conjugated representations of densities is introduced, so that $\pm\alpha$-connections defined on probability densities embody both referential and representational duality and are hence themselves bidual. When restricted to a finite-dimensional affine submanifold, the natural parameters of a certain representation of densities and the expectation parameters under its conjugate representation form biorthogonal coordinates. The alpha representation (indexed by $\beta$ now, $\beta \in [-1, 1]$) is shown to be the only measure-invariant representation. The resulting two-parameter family of divergence functionals $D^{(\alpha, \beta)}$, $(\alpha, \beta) \in [-1, 1] \times [-1, 1]$, induces identical Fisher information but bidual alpha-connection pairs; it reduces in form to Amari's alpha-divergence family when $\alpha = \pm 1$ or when $\beta = 1$, but to the family of Jensen difference (Rao, 1987) when $\beta = -1$.
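
As a rough numerical illustration of the abstract's first claims, the sketch below evaluates the family in the form D = 4/(1 − α²) [ (1 − α)/2 · Φ(x) + (1 + α)/2 · Φ(y) − Φ((1 − α)/2 · x + (1 + α)/2 · y) ] and compares it with the Bregman divergence as α approaches ±1. The particular choice of Φ (negative entropy on the positive orthant), the function names, and the sample points are assumptions made for illustration only; they are not taken from the paper.

```python
import math


def phi(x):
    """Illustrative convex function (an assumption, not from the paper):
    Phi(x) = sum_i (x_i * log(x_i) - x_i), strictly convex for x_i > 0."""
    return sum(xi * math.log(xi) - xi for xi in x)


def grad_phi(x):
    """Gradient of the illustrative Phi above: component i is log(x_i)."""
    return [math.log(xi) for xi in x]


def alpha_divergence(x, y, alpha):
    """Alpha-divergence induced by phi for alpha != +/-1.
    The cases alpha = +/-1 are defined only as limits, so they are rejected."""
    if abs(alpha) == 1:
        raise ValueError("alpha = +/-1 is defined as a limit; use bregman()")
    a = (1 - alpha) / 2  # weight on x
    b = (1 + alpha) / 2  # weight on y
    mix = [a * xi + b * yi for xi, yi in zip(x, y)]
    return 4 / (1 - alpha ** 2) * (a * phi(x) + b * phi(y) - phi(mix))


def bregman(x, y):
    """Bregman divergence Phi(x) - Phi(y) - <x - y, grad Phi(y)>."""
    g = grad_phi(y)
    return phi(x) - phi(y) - sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))


# Hypothetical sample points in the positive orthant.
x = [0.2, 0.5, 1.3]
y = [0.4, 0.4, 1.0]

# Referential duality: swapping the arguments flips the sign of alpha.
print(alpha_divergence(x, y, 0.3), alpha_divergence(y, x, -0.3))

# Near alpha = 1 the value approaches bregman(x, y);
# near alpha = -1 it approaches bregman(y, x).
print(alpha_divergence(x, y, 0.9999), bregman(x, y))
print(alpha_divergence(x, y, -0.9999), bregman(y, x))
```

With this choice of Φ the Bregman divergence is the generalized Kullback-Leibler divergence between unnormalized positive vectors, which is one reason it is a convenient test case; any other smooth, strictly convex Φ with a known gradient could be substituted.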

Cite

Text

Zhang. "Divergence Function, Duality, and Convex Analysis." Neural Computation, 2004. doi:10.1162/08997660460734047

Markdown

[Zhang. "Divergence Function, Duality, and Convex Analysis." Neural Computation, 2004.](https://mlanthology.org/neco/2004/zhang2004neco-divergence/) doi:10.1162/08997660460734047

BibTeX

@article{zhang2004neco-divergence,
  title     = {{Divergence Function, Duality, and Convex Analysis}},
  author    = {Zhang, Jun},
  journal   = {Neural Computation},
  year      = {2004},
  pages     = {159--195},
  doi       = {10.1162/08997660460734047},
  volume    = {16},
  url       = {https://mlanthology.org/neco/2004/zhang2004neco-divergence/}
}