Page 308 in Differential Geometrical Theory of Statistics

Entropy 2016, 18, 442

with $A_k=\int_{x_0}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(x-k)^2}{2}\right)\mathrm{d}x$. When $k\to\infty$, we have $A_k\to 1$. Consider $k_0\in\mathbb{N}$ such that $A_{k_0}>0.9$. Then the radius of convergence $r$ is such that:
\[
\frac{1}{r}\geq\lim_{k\to\infty}\left(\frac{1}{k\,k!}\,0.9\exp\left(\frac{k^2}{8}\right)\right)^{\frac{1}{k}}=\infty.
\]
Thus the convergence radius is $r=0$, and therefore the KL divergence is not an analytic function of the parameter $w$. The KL divergence of mixtures is an example of a non-analytic smooth function. (Notice that the absolute value is not analytic at $0$.)

Appendix B. Closed-Form Formula for the Kullback–Leibler Divergence between Scaled and Truncated Exponential Families

When computing approximation bounds for the KL divergence between two mixtures $m(x)$ and $m'(x)$, we end up with the task of computing $\int_D w_a p_a(x)\log\frac{w'_b p'_b(x)}{w'_c p'_c(x)}\,\mathrm{d}x$, where $D\subseteq\mathcal{X}$ is a subset of the full support $\mathcal{X}$. We report a generic formula for computing this quantity when the (scaled and truncated) mixture components belong to the same exponential family [17]. An exponential family has canonical log-density written as $l(x;\theta)=\log p(x;\theta)=\theta^\top t(x)-F(\theta)+k(x)$, where $t(x)$ denotes the sufficient statistics, $F(\theta)$ the log-normalizer (also called cumulant function or partition function), and $k(x)$ an auxiliary carrier term.

Let $\mathrm{KL}(w_1 p_1:w_2 p_2:w_3 p_3)=\int_{\mathcal{X}} w_1 p_1(x)\log\frac{w_2 p_2(x)}{w_3 p_3(x)}\,\mathrm{d}x=H^{\times}(w_1 p_1:w_3 p_3)-H^{\times}(w_1 p_1:w_2 p_2)$. Since it is a difference of two cross-entropies, we get for three distributions belonging to the same exponential family [26] the following formula:
\[
\mathrm{KL}(w_1 p_1:w_2 p_2:w_3 p_3)=w_1\log\frac{w_2}{w_3}+w_1\left(F(\theta_3)-F(\theta_2)-(\theta_3-\theta_2)^\top\nabla F(\theta_1)\right).
\]

Furthermore, when the support is restricted, say to support range $D\subseteq\mathcal{X}$, let $m_D(\theta)=\int_D p(x;\theta)\,\mathrm{d}x$ denote the mass and $\tilde{p}(x;\theta)=\frac{p(x;\theta)}{m_D(\theta)}$ the normalized distribution. Then we have:
\[
\int_D w_1 p_1(x)\log\frac{w_2 p_2(x)}{w_3 p_3(x)}\,\mathrm{d}x=m_D(\theta_1)\,\mathrm{KL}(w_1\tilde{p}_1:w_2\tilde{p}_2:w_3\tilde{p}_3)-\log\frac{w_2\,m_D(\theta_3)}{w_3\,m_D(\theta_2)}.
\]
When $F_D(\theta)=F(\theta)-\log m_D(\theta)$ is strictly convex and differentiable, then $\tilde{p}(x;\theta)$ is an exponential family and the closed-form formula follows straightforwardly. Otherwise, we still get a closed form, but need more derivations.
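As a sanity check, the closed-form expression for $\mathrm{KL}(w_1 p_1:w_2 p_2:w_3 p_3)$ above can be verified numerically. The sketch below is an illustration (not code from the paper): it instantiates the formula for the exponential distribution family, written in canonical form with $\theta=-\lambda<0$, $t(x)=x$ and $F(\theta)=-\log(-\theta)$, with hypothetical weights and parameters, and compares against direct numerical integration.

```python
import math

# Illustrative check of the closed-form three-parameter KL formula
#   KL(w1 p1 : w2 p2 : w3 p3) = w1 log(w2/w3)
#       + w1 * (F(th3) - F(th2) - (th3 - th2) * F'(th1))
# for the exponential distribution as an exponential family:
#   p(x; th) = exp(th*x - F(th)) on x >= 0, th = -lambda < 0.
def F(th):
    return -math.log(-th)      # log-normalizer (cumulant function)

def gradF(th):
    return -1.0 / th           # E[t(x)] = E[x] = 1/lambda

def pdf(x, th):
    return math.exp(th * x - F(th))

def kl3_closed(w1, th1, w2, th2, w3, th3):
    return w1 * math.log(w2 / w3) + w1 * (F(th3) - F(th2) - (th3 - th2) * gradF(th1))

def kl3_numeric(w1, th1, w2, th2, w3, th3, n=100000, hi=60.0):
    # Midpoint-rule approximation of int_0^inf w1 p1 log(w2 p2 / (w3 p3)) dx;
    # the tail beyond x = 60 is negligible for these rates.
    h = hi / n
    return sum(
        w1 * pdf(x, th1) * math.log(w2 * pdf(x, th2) / (w3 * pdf(x, th3)))
        for x in ((i + 0.5) * h for i in range(n))
    ) * h

if __name__ == "__main__":
    args = (0.3, -1.0, 0.5, -2.0, 0.2, -0.5)   # hypothetical w1,th1,w2,th2,w3,th3
    print(kl3_closed(*args))    # closed form
    print(kl3_numeric(*args))   # numerical integration, should agree closely
```

The same pattern works for any exponential family once `F` and `gradF` are supplied; only the closed-form side changes with the family.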
For univariate distributions, we write $D=(a,b)$ and $m_D(\theta)=\int_a^b p(x;\theta)\,\mathrm{d}x=P_\theta(b)-P_\theta(a)$, where $P_\theta(a)=\int^a p(x;\theta)\,\mathrm{d}x$ denotes the cumulative distribution function. The usual formula for the truncated and scaled Kullback–Leibler divergence is:
\[
\mathrm{KL}_D(w\,p(x;\theta):w'\,p(x;\theta'))=w\,m_D(\theta)\left(\log\frac{w}{w'}+B_F(\theta':\theta)\right)+w\,(\theta-\theta')^\top\nabla m_D(\theta), \tag{B1}
\]
where $B_F(\theta':\theta)$ is a Bregman divergence [5]:
\[
B_F(\theta':\theta)=F(\theta')-F(\theta)-(\theta'-\theta)^\top\nabla F(\theta).
\]
This formula extends the classic formula [5] for full regular exponential families (obtained by setting $w=w'=1$ and $m_D(\theta)=1$ with $\nabla m_D(\theta)=0$). Similar formulæ are available for the cross-entropy and entropy of exponential families [26].

Appendix C. On the Approximation of KL between Smooth Mixtures by a Bregman Divergence [5]

Clearly, since Bregman divergences are always finite while KL divergences may diverge, we need extra conditions to assert that the KL divergence between mixtures can be approximated by Bregman divergences.
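Formula (B1) can likewise be checked numerically. The following sketch is again illustrative, not from the paper: it uses the exponential distribution family with a hypothetical truncation interval $D=(a,b)=(0.5,3)$, where both $m_D(\theta)$ and $\nabla m_D(\theta)$ have simple closed forms, and writes the last term as $w(\theta-\theta')\nabla m_D(\theta)$, the sign for which the identity holds when re-derived from the canonical density.

```python
import math

# Illustrative numerical check of formula (B1) for a truncated, scaled
# exponential distribution (hypothetical parameters):
#   p(x; th) = exp(th*x - F(th)), th = -lambda < 0, F(th) = -log(-th),
# truncated to D = (a, b), where mass and derivative are in closed form:
#   m_D(th)  = exp(th*a) - exp(th*b)
#   m_D'(th) = a*exp(th*a) - b*exp(th*b)
def F(th):
    return -math.log(-th)

def gradF(th):
    return -1.0 / th

def pdf(x, th):
    return math.exp(th * x - F(th))

def bregman(tp, t):
    # Bregman divergence B_F(th' : th) = F(th') - F(th) - (th' - th) F'(th)
    return F(tp) - F(t) - (tp - t) * gradF(t)

def m_D(th, a, b):
    return math.exp(th * a) - math.exp(th * b)

def grad_m_D(th, a, b):
    return a * math.exp(th * a) - b * math.exp(th * b)

def kl_D_closed(w, t, wp, tp, a, b):
    # Formula (B1); last term w*(th - th')*m_D'(th) makes the identity hold
    # when expanded from log(w p(x;th) / (w' p(x;th'))).
    return (w * m_D(t, a, b) * (math.log(w / wp) + bregman(tp, t))
            + w * (t - tp) * grad_m_D(t, a, b))

def kl_D_numeric(w, t, wp, tp, a, b, n=100000):
    # Midpoint rule for int_a^b w p(x;th) log(w p(x;th) / (w' p(x;th'))) dx
    h = (b - a) / n
    return sum(
        w * pdf(x, t) * math.log(w * pdf(x, t) / (wp * pdf(x, tp)))
        for x in (a + (i + 0.5) * h for i in range(n))
    ) * h

if __name__ == "__main__":
    args = (0.4, -1.0, 0.7, -2.0, 0.5, 3.0)   # w, th, w', th', a, b (hypothetical)
    print(kl_D_closed(*args))
    print(kl_D_numeric(*args))
```

Setting $a=0$, $b\to\infty$ recovers $m_D=1$, $\nabla m_D=0$, and the classic full-support formula mentioned above.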
Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco, Frank Nielsen
Publisher: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics