Page - 291 - in Differential Geometrical Theory of Statistics


Entropy 2016, 18, 442

[Figure 1. Lower envelope of parabolas corresponding to the upper envelope of weighted components of a Gaussian mixture with $k' = 3$ components. Plotted quantities: $-\log(w'_j p'_j(x))$ and $w'_j p'_j(x)$.]

To proceed once the envelopes have been built, we need to calculate two types of definite integrals on those elementary intervals: (i) the probability mass in an interval, $\int_a^b p(x)\,\mathrm{d}x = \Phi(b) - \Phi(a)$, where $\Phi$ denotes the Cumulative Distribution Function (CDF); and (ii) the partial cross-entropy $-\int_a^b p(x) \log p'(x)\,\mathrm{d}x$ [26]. Thus let us define these two quantities:

$$C_{i,j}(a,b) = -\int_a^b w_i p_i(x) \log\bigl(w'_j p'_j(x)\bigr)\,\mathrm{d}x, \tag{9}$$

$$M_i(a,b) = \int_a^b w_i p_i(x)\,\mathrm{d}x. \tag{10}$$

By Equations (7) and (8), we get the bounds of $H_\times(m : m')$ as

$$\begin{aligned}
L_\times(m : m') &= \sum_{r=1}^{\ell} \sum_{s=1}^{k} C_{s,\delta(r)}(a_r, a_{r+1}) - \log k', \\
U_\times(m : m') &= \sum_{r=1}^{\ell} \sum_{s=1}^{k} \min\bigl\{ C_{s,\delta(r)}(a_r, a_{r+1}),\; C_{s,\epsilon(r)}(a_r, a_{r+1}) - M_s(a_r, a_{r+1}) \log k' \bigr\}.
\end{aligned} \tag{11}$$

The size of the lower/upper bound formula depends on the envelope complexity $\ell$, the number $k$ of mixture components, and the closed-form expressions of the integral terms $C_{i,j}(a,b)$ and $M_i(a,b)$. In general, when a pair of weighted component densities intersect in at most $p$ points, the envelope complexity is related to the Davenport–Schinzel sequences [27]. It is quasi-linear for bounded $p = O(1)$; see [27].

Note that in symbolic computing, the Risch semi-algorithm [28] solves the problem of indefinite integration in terms of elementary functions, provided that there exists an oracle (hence the term "semi-algorithm") for checking whether an expression is equivalent to zero or not (however, it is unknown whether there exists an algorithm implementing the oracle or not).

We presented the technique by bounding the cross-entropy (and entropy) to deliver lower/upper bounds on the KL divergence. When only the KL divergence needs to be bounded, we rather consider the ratio term $m(x)/m'(x)$. This requires partitioning the support $\mathcal{X}$ into elementary intervals by overlaying the critical points of both the lower and upper envelopes of $m(x)$ and $m'(x)$, which can be done in linear time.
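As an illustration of Equations (9)-(11) for univariate Gaussian mixtures, the following is a minimal sketch, not the authors' implementation. It takes $M_i(a,b)$ as the nonnegative probability mass $\int_a^b w_i p_i(x)\,\mathrm{d}x$, represents each weighted component as a hypothetical `(w, mu, sigma)` tuple, and, as a simplification, replaces the envelope construction by the overlay of all pairwise intersection points of the weighted components of $m'$ (a finer partition than the envelope, with $O(k'^2)$ intervals). All function names are assumptions made for this example.

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / SQRT2PI

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_phi(z):
    """z * phi(z), using the limit 0 at +/- infinity."""
    return 0.0 if math.isinf(z) else z * phi(z)

def mass(c, a, b):
    """M_i(a, b): mass of the weighted component c = (w, mu, sigma) on [a, b]."""
    w, mu, s = c
    return w * (Phi((b - mu) / s) - Phi((a - mu) / s))

def partial_cross_entropy(c, c2, a, b):
    """C_{i,j}(a, b) = -int_a^b w p(x) log(w' p'(x)) dx for Gaussian p, p'."""
    (w, mu, s), (w2, mu2, s2) = c, c2
    al, be = (a - mu) / s, (b - mu) / s
    p0 = Phi(be) - Phi(al)
    d = mu - mu2
    # int_a^b p(x) (x - mu2)^2 dx, from truncated standard normal moments
    second = (s * s * (p0 + z_phi(al) - z_phi(be))
              + 2.0 * s * d * (phi(al) - phi(be)) + d * d * p0)
    log_const = math.log(w2) - math.log(s2 * SQRT2PI)
    return w * (-log_const * p0 + second / (2.0 * s2 * s2))

def log_weighted(c, x):
    """log(w p(x)) for a weighted Gaussian component c = (w, mu, sigma)."""
    w, mu, s = c
    return math.log(w) - math.log(s * SQRT2PI) - 0.5 * ((x - mu) / s) ** 2

def breakpoints(comps):
    """Sorted pairwise intersection points of the weighted components."""
    roots = set()
    for i in range(len(comps)):
        for j in range(i + 1, len(comps)):
            (w1, m1, s1), (w2, m2, s2) = comps[i], comps[j]
            # log(w1 p1) - log(w2 p2) = qa x^2 + qb x + qc
            qa = 0.5 / s2**2 - 0.5 / s1**2
            qb = m1 / s1**2 - m2 / s2**2
            qc = (0.5 * m2**2 / s2**2 - 0.5 * m1**2 / s1**2
                  + math.log(w1 / s1) - math.log(w2 / s2))
            if abs(qa) > 1e-12:
                disc = qb * qb - 4.0 * qa * qc
                if disc >= 0.0:
                    r = math.sqrt(disc)
                    roots.update(((-qb - r) / (2 * qa), (-qb + r) / (2 * qa)))
            elif abs(qb) > 1e-12:
                roots.add(-qc / qb)
    return sorted(roots)

def cross_entropy_bounds(m, m2):
    """Return (L, U) bounding the cross-entropy of m against m2, as in Eq. (11)."""
    cuts = [-math.inf] + breakpoints(m2) + [math.inf]
    log_k2 = math.log(len(m2))
    lower, upper = -log_k2, 0.0
    for a, b in zip(cuts, cuts[1:]):
        # Interior test point: the ordering of components is constant on (a, b).
        if math.isinf(a) and math.isinf(b):
            x = 0.0
        elif math.isinf(a):
            x = b - 1.0
        elif math.isinf(b):
            x = a + 1.0
        else:
            x = 0.5 * (a + b)
        delta = max(m2, key=lambda c: log_weighted(c, x))  # dominating component
        eps = min(m2, key=lambda c: log_weighted(c, x))    # dominated component
        for c in m:
            c_max = partial_cross_entropy(c, delta, a, b)
            c_min = partial_cross_entropy(c, eps, a, b)
            lower += c_max
            upper += min(c_max, c_min - mass(c, a, b) * log_k2)
    return lower, upper
```

Calling `cross_entropy_bounds(m, m2)` with two lists of `(w, mu, sigma)` tuples returns the pair $(L_\times, U_\times)$; by construction the gap between the two values is at most $\log k'$.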
In a given elementary interval, since $\max\{k \min_i\{w_i p_i(x)\}, \max_i\{w_i p_i(x)\}\} \le m(x) \le k \max_i\{w_i p_i(x)\}$, we then consider the inequalities:

$$\frac{\max\{k \min_i\{w_i p_i(x)\},\, \max_i\{w_i p_i(x)\}\}}{k' \max_j\{w'_j p'_j(x)\}} \le \frac{m(x)}{m'(x)} \le \frac{k \max_i\{w_i p_i(x)\}}{\max\{k' \min_j\{w'_j p'_j(x)\},\, \max_j\{w'_j p'_j(x)\}\}}. \tag{12}$$

We now need to compute definite integrals of the form $\int_a^b w_1 p(x;\theta_1) \log \frac{w_2 p(x;\theta_2)}{w_3 p(x;\theta_3)}\,\mathrm{d}x$ (see Appendix B for explicit formulas when considering scaled and truncated exponential families [17]). (Thus for exponential families, the ratio of densities removes the auxiliary carrier measure term.)
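The cancellation of the carrier term can be made explicit with a short derivation. The notation below is assumed for this sketch and may differ from that of Appendix B: $t(x)$ is the sufficient statistic, $F(\theta)$ the log-normalizer, and $c(x)$ the auxiliary carrier term of the exponential family.

```latex
\begin{aligned}
p(x;\theta) &= \exp\bigl(\langle \theta, t(x)\rangle - F(\theta) + c(x)\bigr), \\
\log\frac{w_2\,p(x;\theta_2)}{w_3\,p(x;\theta_3)}
  &= \log\frac{w_2}{w_3} + \langle \theta_2 - \theta_3,\, t(x)\rangle - F(\theta_2) + F(\theta_3), \\
\int_a^b w_1 p(x;\theta_1)\,\log\frac{w_2\,p(x;\theta_2)}{w_3\,p(x;\theta_3)}\,\mathrm{d}x
  &= \Bigl(\log\frac{w_2}{w_3} - F(\theta_2) + F(\theta_3)\Bigr)\, w_1\!\int_a^b p(x;\theta_1)\,\mathrm{d}x \\
  &\quad + \Bigl\langle \theta_2 - \theta_3,\; w_1\!\int_a^b p(x;\theta_1)\, t(x)\,\mathrm{d}x \Bigr\rangle .
\end{aligned}
```

Since $c(x)$ appears in both the numerator and the denominator, it drops out of the log-ratio, and only the truncated mass and the truncated expectation of $t(x)$ under $p(x;\theta_1)$ remain to be evaluated.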



Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco; Frank Nielsen
Editor: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics