Page - 307 - in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 442
(an additively weighted Bregman–Voronoi diagram [49,50] for components belonging to the same
exponential family). However, it becomes more complex to compute, in the elementary Voronoi
cells $V$, the functions $C_{i,j}(V)$ and $M_i(V)$ (in 1D, the Voronoi cells are segments). We may obtain hybrid
algorithms by approximating or estimating these functions. In 2D, it is thus possible to obtain lower
and upper bounds on the Mutual Information [51] (MI) when the joint distribution $m(x,y)$ is a 2D
mixture of Gaussians:
$$I(M;M') = \int m(x,y)\,\log\frac{m(x,y)}{m(x)\,m'(y)}\,\mathrm{d}x\,\mathrm{d}y.$$
Indeed, the marginal distributions $m(x)$ and $m'(y)$ are univariate Gaussian mixtures.
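As an illustration of the marginalization property, the following minimal sketch estimates the MI of a 2D Gaussian mixture by plain Monte Carlo sampling. It is not the bounding method described in this paper, nor the code of [52]; all function names and the parameter layout (weights, means, 2x2 covariances as nested tuples) are assumptions made for this example. The marginals are obtained by keeping the mixture weights and reading the per-component mean and variance along each axis.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bivariate_normal_pdf(x, y, mean, cov):
    """Density of a 2D Gaussian; cov is ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mean[0], y - mean[1]
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def sample_bivariate_normal(rng, mean, cov):
    """Draw one point from a 2D Gaussian using a hand-written Cholesky factor."""
    (sxx, sxy), (_, syy) = cov
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(syy - b * b)
    return mean[0] + a * z1, mean[1] + b * z1 + c * z2

def estimate_mutual_information(weights, means, covs, n_samples=20000, seed=0):
    """Monte Carlo estimate of I(M;M') for the joint mixture m(x, y).

    The marginals m(x) and m'(y) are univariate Gaussian mixtures that share
    the weights of the joint mixture.
    """
    rng = random.Random(seed)
    components = list(zip(weights, means, covs))
    total = 0.0
    for _ in range(n_samples):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x, y = sample_bivariate_normal(rng, means[k], covs[k])
        joint = sum(w * bivariate_normal_pdf(x, y, m, c) for w, m, c in components)
        marg_x = sum(w * normal_pdf(x, m[0], c[0][0]) for w, m, c in components)
        marg_y = sum(w * normal_pdf(y, m[1], c[1][1]) for w, m, c in components)
        total += math.log(joint / (marg_x * marg_y))
    return total / n_samples

if __name__ == "__main__":
    # Single correlated Gaussian: the exact MI is -0.5 * log(1 - rho^2).
    rho = 0.8
    estimate = estimate_mutual_information(
        [1.0], [(0.0, 0.0)], [((1.0, rho), (rho, 1.0))]
    )
    print(estimate, -0.5 * math.log(1.0 - rho * rho))
```

A stochastic estimate of this kind comes with no guarantee, which is precisely what deterministic lower and upper bounds are meant to provide.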
Python code implementing those computational-geometric methods for reproducible research
is available online [52].
Acknowledgments: The authors gratefully thank the referees for their comments. This work was carried out
while Ke Sun was visiting Frank Nielsen at Ecole Polytechnique, Palaiseau, France.
Author Contributions: Frank Nielsen and Ke Sun contributed to the theoretical results as well as to the writing of
the article. Ke Sun implemented the methods and performed the numerical experiments. All authors have read
and approved the final manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A. The Kullback–Leibler Divergence of Mixture Models Is Not Analytic [6]
Ideally, we aim at getting a finite-length closed-form formula to compute the KL divergence of
finite mixture models. However, this is provably mathematically intractable [6] because of the log-sum
term in the integral, as we shall prove below.
Analytic expressions encompass closed-form formulas and may include special functions
(e.g., the Gamma function) but do not allow limits or integrals. An analytic function $f(x)$
is a $C^\infty$ function (infinitely differentiable) such that around any point $x_0$ the $k$-order Taylor
series $T_k(x) = \sum_{i=0}^{k} \frac{f^{(i)}(x_0)}{i!}(x-x_0)^i$ converges to $f(x)$: $\lim_{k\to\infty} T_k(x) = f(x)$ for $x$ belonging to a
neighborhood $N_r(x_0) = \{x : |x-x_0| \le r\}$ of $x_0$, where $r$ is called the radius of convergence.
The analytic property of a function is equivalent to the condition that there exists a
constant $c$ such that, for each $k \in \mathbb{N}$,
$$\left|\frac{\mathrm{d}^k f}{\mathrm{d}x^k}(x)\right| \le c^{k+1}\,k!.$$
To prove that the KL of mixtures is not analytic (hence does not admit a closed-form formula),
we shall adapt the proof reported in [6] (in Japanese; we thank Professor Aoyagi for sending
us his paper [6]). We shall prove that $\mathrm{KL}(p : q)$ is not analytic for $p(x) = G(x;0,1)$ and
$q(x;w) = (1-w)\,G(x;0,1) + w\,G(x;1,1)$, where $w \in (0,1)$, and
$G(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$ is the
density of a univariate Gaussian of mean $\mu$ and standard deviation $\sigma$. Let $D(w) = \mathrm{KL}(p(x) : q(x;w))$
denote the KL divergence between these two mixtures ($p$ has a single component and $q$ has
two components).
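We have
$$\log\frac{p(x)}{q(x;w)} = \log\frac{\exp\left(-\frac{x^2}{2}\right)}{(1-w)\exp\left(-\frac{x^2}{2}\right) + w\exp\left(-\frac{(x-1)^2}{2}\right)} = -\log\left(1 + w\left(e^{x-\frac{1}{2}} - 1\right)\right). \tag{A1}$$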
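Identity (A1) can be checked numerically. The sketch below compares the log-ratio computed directly from the two densities with the closed form on the right-hand side; the function names are chosen for this example only and do not come from the paper or from [52].

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density G(x; mu, sigma) of a univariate Gaussian (sigma = std. deviation)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def log_ratio_direct(x, w):
    """log p(x)/q(x; w) evaluated from the densities themselves."""
    p = gaussian_pdf(x, 0.0, 1.0)
    q = (1.0 - w) * p + w * gaussian_pdf(x, 1.0, 1.0)
    return math.log(p / q)

def log_ratio_closed_form(x, w):
    """Right-hand side of (A1): -log(1 + w (e^{x - 1/2} - 1))."""
    return -math.log1p(w * math.expm1(x - 0.5))

if __name__ == "__main__":
    for w in (0.1, 0.5, 0.9):
        for x in (-3.0, -1.0, 0.0, 0.5, 2.0, 4.0):
            print(w, x, log_ratio_direct(x, w), log_ratio_closed_form(x, w))
```

The closed form uses `math.log1p` and `math.expm1` only to limit rounding error near $x = 1/2$, where the argument of the logarithm is close to 1.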
Therefore
$$\frac{\mathrm{d}^k D}{\mathrm{d}w^k} = \frac{(-1)^k}{k}\int p(x)\left(e^{x-\frac{1}{2}} - 1\right)^k \mathrm{d}x.$$
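The differentiation step can be spelled out as follows. This is only a sketch: it assumes that derivatives are taken under the integral sign and evaluated at $w = 0$ (neither is stated explicitly above), and the shorthand $a(x)$ is introduced here for illustration.

```latex
\begin{align*}
a(x) &:= e^{x-\frac{1}{2}} - 1,
\qquad
D(w) = -\int p(x)\,\log\bigl(1 + w\,a(x)\bigr)\,\mathrm{d}x
\quad\text{by (A1)},\\[4pt]
\frac{\partial^k}{\partial w^k}\Bigl[-\log\bigl(1 + w\,a(x)\bigr)\Bigr]
&= (-1)^k\,(k-1)!\,\frac{a(x)^k}{\bigl(1 + w\,a(x)\bigr)^k},\\[4pt]
\frac{1}{k!}\,\frac{\mathrm{d}^k D}{\mathrm{d}w^k}\bigg|_{w=0}
&= \frac{(-1)^k}{k}\int p(x)\,a(x)^k\,\mathrm{d}x .
\end{align*}
```

Under this reading, the prefactor $(-1)^k/k$ is that of the $k$-th Taylor coefficient of $D$ at $w = 0$. The unnormalized derivative carries an additional factor $k!$, which can only enlarge its absolute value.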
Let $x_0$ be the root of the equation $e^{x-\frac{1}{2}} - 1 = e^{\frac{x}{2}}$, so that for $x \ge x_0$ we have $e^{x-\frac{1}{2}} - 1 \ge e^{\frac{x}{2}}$.
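The root $x_0$ has a closed form: substituting $u = e^{x/2}$ turns the equation into the quadratic $e^{-1/2}u^2 - u - 1 = 0$, whose only positive solution gives $x_0 = 2\log u$. The sketch below computes it and checks the inequality on a grid; the function names are hypothetical and chosen for this example.

```python
import math

def gap(x):
    """Left side minus right side of e^{x - 1/2} - 1 >= e^{x/2}."""
    return math.expm1(x - 0.5) - math.exp(x / 2.0)

def root_x0():
    """Root of e^{x - 1/2} - 1 = e^{x/2}, via the substitution u = e^{x/2}."""
    a = math.exp(-0.5)  # coefficient of u^2 in a*u^2 - u - 1 = 0
    u = (1.0 + math.sqrt(1.0 + 4.0 * a)) / (2.0 * a)  # the only positive root
    return 2.0 * math.log(u)

if __name__ == "__main__":
    x0 = root_x0()
    print(x0)       # close to 1.709
    print(gap(x0))  # zero up to rounding error
    # The inequality holds to the right of x0 and fails just to its left.
    print(all(gap(x0 + 0.1 * i) >= -1e-12 for i in range(100)), gap(x0 - 0.1) < 0.0)
```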
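It follows that:
$$\left|\frac{\mathrm{d}^k D}{\mathrm{d}w^k}\right| \ge \frac{1}{k}\int_{x_0}^{\infty} p(x)\,e^{\frac{kx}{2}}\,\mathrm{d}x = \frac{1}{k}\,e^{\frac{k^2}{8}}\,A_k$$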
Differential Geometrical Theory of Statistics
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Editor
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Size
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics