For a more quantitative comparison, Table 1 shows the α-divergence estimated by MC, Basic, Adaptive, and VR. As $D_\alpha$ is defined on $\mathbb{R}\setminus\{0,1\}$, the KL bounds CE(A)LB and CE(A)UB are presented for α = 0 or 1. Overall, we observe the following order of gap size: Basic > Adaptive > VR, and VR is recommended in general for bounding α-divergences. In certain cases, the upper VR bound is looser than Adaptive. In practice, one can intersect these bounds, together with the trivial bound $D_\alpha(m : m') \geq 0$, to get the best estimate.
Table 1. The estimated $D_\alpha$ and its bounds; the 95% confidence interval is shown for MC. For α = 0 and α = 1, the Basic and Adaptive columns hold the KL bounds CELB/CEUB and CEALB/CEAUB, and the VR columns are left empty.

Mixtures      α     MC (10²)     MC (10³)     MC (10⁴)     Basic L   Basic U   Adaptive L  Adaptive U  VR L    VR U
GMM1 & GMM2   0     15.96±3.9    12.30±1.0    13.63±0.3    11.75     15.89     12.96       14.63
              0.01  13.36±2.9    10.63±0.8    11.66±0.3    −700.50   11.73     −77.33      11.73       11.40   12.27
              0.5   3.57±0.3     3.47±0.1     3.47±0.07    −0.60     3.42      3.01        3.42        3.17    3.51
              0.99  40.04±7.7    37.22±2.3    38.58±0.8    −333.90   39.04     5.36        38.98       38.28   38.96
              1     104.01±28    84.96±7.2    92.57±2.5    91.44     95.59     92.76       94.41
GMM3 & GMM4   0     0.71±0.2     0.63±0.07    0.62±0.02    0.00      1.76      0.00        1.16
              0.01  0.71±0.2     0.63±0.07    0.62±0.02    −179.13   7.63      −38.74      4.96        0.29    1.00
              0.5   0.82±0.3     0.57±0.1     0.62±0.04    −5.23     0.93      −0.71       0.85        −0.18   1.19
              0.99  0.79±0.3     0.76±0.1     0.80±0.03    −165.72   12.10     −59.76      9.11        0.37    1.28
              1     0.80±0.3     0.77±0.1     0.81±0.03    0.00      1.82      0.31        1.40
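As a quick illustration of the intersection rule stated above (the snippet and the helper name are ours, not part of the paper's implementation), here is a minimal Python sketch that intersects the three intervals with the trivial bound $D_\alpha \geq 0$, using the α = 0.5 row of GMM3 & GMM4 from Table 1 as input:

```python
# Each method returns a (lower, upper) interval for D_alpha(m : m');
# intersecting the intervals together with the trivial bound D_alpha >= 0
# keeps the tightest guaranteed estimate.

def combine_bounds(intervals):
    lower = max([0.0] + [lo for lo, _ in intervals])
    upper = min(hi for _, hi in intervals)
    return lower, upper

# Basic, Adaptive, and VR intervals for GMM3 & GMM4 at alpha = 0.5:
print(combine_bounds([(-5.23, 0.93), (-0.71, 0.85), (-0.18, 1.19)]))
# -> (0.0, 0.85): here the Adaptive upper bound beats the VR one.
```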
Note the similarity between the KL in Equation (30) and the expression in Equation (47). We state, without a formal analysis, that CEALB (resp. CEAUB) is equivalent to VR in the limit α → 0 or α → 1. Experimentally, as we slowly let α → 1, we can see that VR is consistent with CEALB (CEAUB).
7. Concluding Remarks and Perspectives
We have presented a fast, versatile method to compute bounds on the Kullback–Leibler divergence between mixtures by building algorithmic formulae. We reported on our experiments for various mixture models in the exponential family. For univariate GMMs, we get a guaranteed bound on the KL divergence of two mixtures m and m′ with k and k′ components within an additive approximation factor of $\log k + \log k'$ in $O((k+k')\log(k+k'))$-time. Therefore, the larger the KL divergence, the better the bound when considering a multiplicative $(1+\alpha)$-approximation factor, since $\alpha = \frac{\log k + \log k'}{\mathrm{KL}(m : m')}$.
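To make this scaling concrete, here is a small numerical illustration (the component counts and KL values below are our own examples, not results from the paper):

```python
from math import log

# The additive gap log k + log k' is fixed by the component counts,
# so the induced multiplicative (1 + alpha)-factor improves as
# KL(m : m') grows.
k, k_prime = 10, 10
gap = log(k) + log(k_prime)              # ~4.61 nats
for kl in (5.0, 50.0, 500.0):
    print(f"KL = {kl:6.1f} nats -> (1 + alpha) = {1 + gap / kl:.3f}")
# KL =    5.0 nats -> (1 + alpha) = 1.921
# KL =   50.0 nats -> (1 + alpha) = 1.092
# KL =  500.0 nats -> (1 + alpha) = 1.009
```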
The adaptive bounds are guaranteed to yield better bounds, at the expense of computing potentially $O(k^2 + (k')^2)$ intersection points of pairwise weighted components.
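For univariate Gaussian components, such pairwise intersections can be found in closed form by equating log-densities, which yields a quadratic in x. The following sketch (our helper, not the paper's code) shows this computation:

```python
from math import log, sqrt

# Intersection points of two weighted Gaussian components w*N(mu, s^2),
# obtained by equating their log-densities:
#   log w - log(s*sqrt(2*pi)) - (x - mu)^2 / (2 s^2)

def weighted_gaussian_intersections(w1, mu1, s1, w2, mu2, s2):
    a = 1.0 / (2 * s2 * s2) - 1.0 / (2 * s1 * s1)
    b = mu1 / (s1 * s1) - mu2 / (s2 * s2)
    c = (mu2 * mu2 / (2 * s2 * s2) - mu1 * mu1 / (2 * s1 * s1)
         + log(w1 / w2) + log(s2 / s1))
    if abs(a) < 1e-12:                 # equal variances: at most one crossing
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                      # the weighted densities never cross
    r = sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

print(weighted_gaussian_intersections(0.7, 0.0, 1.0, 0.3, 3.0, 2.0))
```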
Our technique also yields bounds for the Jeffreys divergence (the symmetrized KL divergence: $J(m, m') = \mathrm{KL}(m : m') + \mathrm{KL}(m' : m)$) and the Jensen–Shannon divergence [47] (JS):

$$\mathrm{JS}(m, m') = \frac{1}{2}\left(\mathrm{KL}\left(m : \frac{m+m'}{2}\right) + \mathrm{KL}\left(m' : \frac{m+m'}{2}\right)\right),$$
since $\frac{m+m'}{2}$ is a mixture model with $k + k'$ components. One advantage of this statistical distance is that it is symmetric, always bounded by $\log 2$, and its square root yields a metric distance [48].
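Since each term of JS is itself a KL between mixtures, interval bounds on the two terms combine by simple interval arithmetic. A minimal sketch, where kl1 and kl2 stand for hypothetical (lower, upper) bounds on $\mathrm{KL}(m : \frac{m+m'}{2})$ and $\mathrm{KL}(m' : \frac{m+m'}{2})$ produced by any of the methods above:

```python
from math import log

def js_bounds(kl1, kl2):
    lower = max(0.0, 0.5 * (kl1[0] + kl2[0]))
    upper = min(0.5 * (kl1[1] + kl2[1]), log(2))  # JS is capped by log 2
    return lower, upper

print(js_bounds((0.10, 0.25), (0.08, 0.30)))  # hypothetical inputs -> (0.09, 0.275)
```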
The log-sum-exp inequalities may also be used to compute some Rényi divergences [35]:

$$R_\alpha(m, p) = \frac{1}{\alpha - 1} \log\left(\int m(x)^\alpha p(x)^{1-\alpha}\,\mathrm{d}x\right),$$

when α is an integer, m(x) is a mixture, and p(x) is a single (component) distribution.
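To see why an integer α makes this integral tractable, here is a short sketch (our own expansion, not spelled out on this page, assuming the components and $p$ belong to the same exponential family): writing $m(x) = \sum_{i=1}^{k} w_i p_i(x)$, the multinomial theorem gives

$$\int m(x)^{\alpha}\, p(x)^{1-\alpha}\,\mathrm{d}x = \sum_{\substack{\beta_1+\cdots+\beta_k=\alpha\\ \beta_i \geq 0}} \binom{\alpha}{\beta_1,\ldots,\beta_k}\, \prod_{i=1}^{k} w_i^{\beta_i} \int \prod_{i=1}^{k} p_i(x)^{\beta_i}\, p(x)^{1-\alpha}\,\mathrm{d}x,$$

and each integral of a weighted product of exponential-family densities reduces in closed form to a difference of log-normalizers (provided the combined natural parameter stays in the domain), so the outer logarithm can then be bounded with the log-sum-exp inequalities.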
Getting fast, guaranteed, tight bounds on statistical distances between mixtures opens many avenues. For example, we may consider building hierarchical mixture models by iteratively merging two mixture components, where the pairs of components are chosen so that the KL divergence between the full mixture and the simplified mixture is minimized.
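A greedy version of this idea could look as follows; this is a sketch only, and the two callbacks are hypothetical stand-ins (`merge_pair` returning the mixture with components i and j merged, e.g., by moment matching, and `kl_upper_bound` being any guaranteed bound such as those of this paper):

```python
def simplify_mixture(mixture, target_size, merge_pair, kl_upper_bound):
    current = mixture
    while len(current) > target_size:
        # Try every pair and keep the merge that perturbs the mixture least,
        # as measured by the guaranteed KL upper bound.
        best_score, best_mix = float("inf"), None
        for i in range(len(current)):
            for j in range(i + 1, len(current)):
                candidate = merge_pair(current, i, j)
                score = kl_upper_bound(mixture, candidate)
                if score < best_score:
                    best_score, best_mix = score, candidate
        current = best_mix
    return current
```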
In order to be useful, our technique is unfortunately limited to univariate mixtures: indeed, in higher dimensions, we can still compute the maximization diagram of weighted components