Differential Geometrical Theory of Statistics
Page 297
Entropy 2016, 18, 442

See Appendix B for a closed-form formula when dealing with exponential family components.

4. Bounding the α-Divergence

The α-divergence [15,32–34] between m(x) = āˆ‘_{i=1}^{k} w_i p_i(x) and m′(x) = āˆ‘_{i=1}^{k′} w′_i p′_i(x) is defined as

  D_α(m : m′) = 1/(α(1 āˆ’ α)) ( 1 āˆ’ ∫_X m(x)^α m′(x)^{1āˆ’Ī±} dx ),   (29)

which clearly satisfies D_α(m : m′) = D_{1āˆ’Ī±}(m′ : m). The α-divergence is a family of information divergences parametrized by α ∈ ā„ \ {0, 1}. Letting α → 1, we get the KL divergence (see [35] for a proof):

  lim_{α→1} D_α(m : m′) = KL(m : m′) = ∫_X m(x) log( m(x)/m′(x) ) dx,   (30)

and α → 0 gives the reverse KL divergence: lim_{α→0} D_α(m : m′) = KL(m′ : m). Other interesting values [33] include α = 1/2 (squared Hellinger distance), α = 2 (Pearson chi-squared distance), α = āˆ’1 (Neyman chi-squared distance), etc. Notably, the Hellinger distance is a valid distance metric which satisfies non-negativity, symmetry, and the triangle inequality. In general, D_α(m : m′) only satisfies non-negativity, so that D_α(m : m′) ≄ 0 for any m(x) and m′(x). It is neither symmetric nor does it admit the triangle inequality. Minimization of α-divergences allows one to choose a trade-off between mode fitting and support fitting of the minimizer [36]. The minimizer of α-divergences, including MLE as a special case, has interesting connections with transcendental number theory [37].

Computing D_α(m : m′) for given m(x) and m′(x) reduces to evaluating the Hellinger integral [38,39]:

  H_α(m : m′) = ∫_X m(x)^α m′(x)^{1āˆ’Ī±} dx,   (31)

which in general does not have a closed form, as it is known that the α-divergence of mixture models is not analytic [6]. Moreover, H_α(m : m′) may diverge, making the α-divergence unbounded. Once H_α(m : m′) can be computed, the RĆ©nyi and Tsallis divergences [35], and in general the Sharma–Mittal divergences [40], can easily be computed. Therefore the results presented here directly extend to those divergence families.

Similar to the case of the KL divergence, the Monte Carlo stochastic estimation of H_α(m : m′) can be computed either as

  Ĥ_α^n(m : m′) = (1/n) āˆ‘_{i=1}^{n} ( m′(x_i)/m(x_i) )^{1āˆ’Ī±}, where x_1, …, x_n ∼ m(x) are i.i.d. samples,

or as

  Ĥ_α^n(m : m′) = (1/n) āˆ‘_{i=1}^{n} ( m(x_i)/m′(x_i) )^α, where x_1, …, x_n ∼ m′(x) are i.i.d.

In either case, it is consistent, so that lim_{nā†’āˆž} Ĥ_α^n(m : m′) = H_α(m : m′). However, MC estimation requires a large sample and does not guarantee deterministic bounds. The techniques described in [41] work in practice for very close distributions, and do not apply between mixture models. We will therefore derive combinatorial bounds for H_α(m : m′). The structure of this section is parallel with Section 2, with necessary reformulations for a clear presentation.
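The first Monte Carlo estimator above, Ĥ_α^n(m : m′) = (1/n) āˆ‘ ( m′(x_i)/m(x_i) )^{1āˆ’Ī±} with x_i ∼ m, can be sketched in a few lines. The following is an illustrative implementation only, not code from the paper; the two univariate Gaussian mixtures, their weights, means, and standard deviations, and the helper names `make_mixture` and `mc_hellinger_integral` are all made-up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mixture(weights, means, sigmas):
    """Build pdf and sampler for a univariate Gaussian mixture (illustrative helper)."""
    weights, means, sigmas = map(np.asarray, (weights, means, sigmas))

    def pdf(x):
        x = np.atleast_1d(x)[:, None]  # shape (n, 1) against (k,) components
        comp = np.exp(-0.5 * ((x - means) / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
        return comp @ weights

    def sample(n):
        idx = rng.choice(len(weights), size=n, p=weights)  # pick a component per draw
        return rng.normal(means[idx], sigmas[idx])

    return pdf, sample

def mc_hellinger_integral(m_pdf, m_sample, mp_pdf, alpha, n=100_000):
    """MC estimate of H_alpha(m : m') = E_{x~m}[ (m'(x)/m(x))^{1-alpha} ]."""
    x = m_sample(n)
    return np.mean((mp_pdf(x) / m_pdf(x)) ** (1.0 - alpha))

def alpha_divergence(h_alpha, alpha):
    """D_alpha from the Hellinger integral, Equation (29)."""
    return (1.0 - h_alpha) / (alpha * (1.0 - alpha))

# Two hypothetical mixtures m(x) and m'(x) with k = k' = 2 components.
m_pdf, m_sample = make_mixture([0.5, 0.5], [-1.0, 1.0], [0.5, 0.5])
mp_pdf, _ = make_mixture([0.3, 0.7], [0.0, 2.0], [1.0, 0.5])

h = mc_hellinger_integral(m_pdf, m_sample, mp_pdf, alpha=0.5)
d = alpha_divergence(h, alpha=0.5)
print(h, d)  # for alpha in (0, 1), H_alpha lies in (0, 1], so D_alpha >= 0
```

Note that for identical mixtures the ratio m′(x)/m(x) is 1 everywhere, so the estimator returns exactly 1 and D_α = 0; as the text warns, a single run gives no deterministic bound, only a consistent estimate as n → āˆž.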
Title: Differential Geometrical Theory of Statistics
Authors: FrƩdƩric Barbaresco, Frank Nielsen
Publisher: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics