Entropy 2016, 18, 442
Let us partition the support $\mathcal{X} = \biguplus_{r=1}^{\ell} I_r$ arbitrarily into $\ell$ elementary ranges, which do not necessarily correspond to the envelopes. Denote by $M_I$ the probability mass of a mixture $m(x)$ in the range $I$: $M_I = \int_I m(x)\,\mathrm{d}x$. Then

$$D_f(m:m') = \sum_{r=1}^{\ell} M_{I_r} \int_{I_r} \frac{m(x)}{M_{I_r}}\, f\!\left(\frac{m'(x)}{m(x)}\right) \mathrm{d}x.$$
Note that in the range $I_r$, $m(x)/M_{I_r}$ is a unit weight distribution. Thus, by Jensen's inequality $f(E[X]) \le E[f(X)]$ for convex $f$, we get
$$D_f(m:m') \ge \sum_{r=1}^{\ell} M_{I_r}\, f\!\left(\int_{I_r} \frac{m(x)}{M_{I_r}} \frac{m'(x)}{m(x)}\,\mathrm{d}x\right) = \sum_{r=1}^{\ell} M_{I_r}\, f\!\left(\frac{M'_{I_r}}{M_{I_r}}\right). \tag{52}$$
Notice that the RHS of Equation (52) is the $f$-divergence between $(M_{I_1}, \cdots, M_{I_\ell})$ and $(M'_{I_1}, \cdots, M'_{I_\ell})$, denoted by $D^I_f(m:m')$. In the special case that $\ell = 1$ and $I_1 = \mathcal{X}$, Equation (52) turns out to be the usual Gibbs inequality $D_f(m:m') \ge f(1)$, and the Csiszár generator is chosen so that $f(1) = 0$. In conclusion, for a fixed (coarse-grained) countable partition of $\mathcal{X}$, we recover the well-known information monotonicity [46] of the $f$-divergences:

$$D_f(m:m') \ge D^I_f(m:m') \ge 0.$$
In practice, we get closed-form lower bounds when $M_I = \int_a^b m(x)\,\mathrm{d}x = \Phi(b) - \Phi(a)$ is available in closed form, where $\Phi(\cdot)$ denotes the CDF. In particular, if $m(x)$ is a mixture model, then its CDF can be computed by linearly combining the CDFs of its components.
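As a minimal sketch of this recipe, the following computes the coarse-grained lower bound $D^I_f$ for the exponential mixtures EMM1 and EMM2 defined later in Section 6, taking the Csiszár generator of the Kullback-Leibler divergence $f(u) = -\log u$. The partition cut points and all helper names are our own illustrative choices, not from the paper:

```python
import math

def exp_mix_cdf(x, params):
    """CDF of an exponential mixture: a linear combination of component CDFs."""
    return sum(w * (1.0 - math.exp(-lam * x)) for lam, w in params)

def coarse_grained_f_div(params_m, params_mp, cuts, f):
    """D_f^I(m : m') = sum_r M_{I_r} f(M'_{I_r} / M_{I_r}) over the partition
    of the support [0, inf) induced by the finite cut points."""
    edges = [0.0] + sorted(cuts) + [math.inf]
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        M = exp_mix_cdf(b, params_m) - exp_mix_cdf(a, params_m)    # mass of m on I_r
        Mp = exp_mix_cdf(b, params_mp) - exp_mix_cdf(a, params_mp)  # mass of m' on I_r
        if M > 0.0 and Mp > 0.0:  # guard against floating-point underflow in the tails
            total += M * f(Mp / M)
    return total

# Csiszar generator of the Kullback-Leibler divergence: f(u) = -log(u), f(1) = 0.
f_kl = lambda u: -math.log(u)

# EMM1 and EMM2 from Section 6, with components written as (lambda_i, w_i).
emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

lb = coarse_grained_f_div(emm1, emm2, cuts=[0.5, 1.0, 2.0, 5.0], f=f_kl)
print(lb)  # a guaranteed lower bound on KL(EMM1 : EMM2); nonnegative since f(1) = 0
```

The bound equals the discrete KL divergence between the two mass vectors, so it is nonnegative by the Gibbs inequality and can only improve as the partition is refined.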
To wrap up, we have proved that coarse-graining by making a finite partition of the support $\mathcal{X}$ yields a lower bound on the $f$-divergence by virtue of the information monotonicity. Therefore, instead of performing Monte Carlo stochastic integration:

$$\hat{D}^n_f(m:m') = \frac{1}{n} \sum_{i=1}^{n} f\!\left(\frac{m'(x_i)}{m(x_i)}\right),$$
with $x_1, \ldots, x_n \sim_{\mathrm{i.i.d.}} m(x)$, it could be better to sort those $n$ samples and consider the coarse-grained partition:

$$I = (-\infty, x_{(1)}] \cup \left(\biguplus_{i=1}^{n-1} (x_{(i)}, x_{(i+1)}]\right) \cup (x_{(n)}, +\infty)$$

to get a guaranteed lower bound on the $f$-divergence. We will call this bound CGQLB, for Coarse-Graining Quantization Lower Bound.
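A sketch of the CGQLB next to the plain Monte Carlo estimate, again assuming the KL generator $f(u) = -\log u$ and the exponential mixtures EMM1 and EMM2 of Section 6 (sampler, seed, and helper names are our own; the survival function $1 - \Phi(x)$ is used directly to keep the tail masses accurate):

```python
import math
import random

# EMM1 and EMM2 from Section 6, components as (lambda_i, w_i).
emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

def pdf(x, params):
    return sum(w * lam * math.exp(-lam * x) for lam, w in params)

def sf(x, params):
    # Survival function 1 - Phi(x), computed directly to avoid cancellation in the tail.
    return sum(w * math.exp(-lam * x) for lam, w in params)

def sample(params, n, rng):
    xs = []
    for _ in range(n):
        u, acc = rng.random(), 0.0
        for lam, w in params:      # pick a component by its weight...
            acc += w
            if u <= acc:
                break
        xs.append(rng.expovariate(lam))  # ...then draw from that exponential
    return xs

f_kl = lambda u: -math.log(u)  # Csiszar generator of the KL divergence

rng = random.Random(0)
n = 200
xs = sorted(sample(emm1, n, rng))

# Plain Monte Carlo estimate of KL(EMM1 : EMM2): unbiased, but no guarantee.
mc = sum(f_kl(pdf(x, emm2) / pdf(x, emm1)) for x in xs) / n

# CGQLB: masses over the order-statistics partition; support is [0, inf) here.
edges = [0.0] + xs + [math.inf]
cgqlb = 0.0
for a, b in zip(edges[:-1], edges[1:]):
    M = sf(a, emm1) - sf(b, emm1)
    Mp = sf(a, emm2) - sf(b, emm2)
    if M > 0.0 and Mp > 0.0:  # guard against floating-point underflow
        cgqlb += M * f_kl(Mp / M)

print("MC estimate:", mc, "CGQLB:", cgqlb)
```

Unlike the Monte Carlo value, the CGQLB is a deterministic function of the drawn points and is guaranteed to sit below the true $f$-divergence.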
Given a budget of $n$ splitting points on the range $\mathcal{X}$, it would be interesting to find the best $n$ points that maximize the lower bound $D^I_f(m:m')$. This is ongoing research.
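While optimal placement is open, information monotonicity already guarantees that refining a partition can only tighten the bound. A quick numerical check of this, using nested uniform grids and the exponential mixtures EMM1 and EMM2 of Section 6 with the KL generator (grid range and helper names are our own illustrative choices):

```python
import math

emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

def cdf(x, params):
    return sum(w * (1.0 - math.exp(-lam * x)) for lam, w in params)

def bound(cuts):
    """Coarse-grained KL lower bound over the partition of [0, inf) given by cuts."""
    edges = [0.0] + sorted(cuts) + [math.inf]
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        M = cdf(b, emm1) - cdf(a, emm1)
        Mp = cdf(b, emm2) - cdf(a, emm2)
        if M > 0.0 and Mp > 0.0:
            total += M * (-math.log(Mp / M))   # KL generator f(u) = -log(u)
    return total

# Nested uniform grids on (0, 10]: each grid refines the previous one, so the
# lower bounds must be non-decreasing by information monotonicity.
bounds = [bound([i * h for i in range(1, int(10 / h))]) for h in (2.0, 1.0, 0.5, 0.25)]
print(bounds)
```

The printed sequence increases toward the true divergence, illustrating that a larger budget of well-placed points yields a tighter CGQLB.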
6. Experiments
We perform an empirical study to verify our theoretical bounds. We simulate four pairs of mixture models {(EMM1, EMM2), (RMM1, RMM2), (GMM1, GMM2), (GaMM1, GaMM2)} as the test subjects. The component type is implied by the model name, where GaMM stands for Gamma mixtures. The components of each mixture model are given as follows.
1. EMM1's components, in the form $(\lambda_i, w_i)$, are given by $(0.1, 1/3)$, $(0.5, 1/3)$, $(1, 1/3)$; EMM2's components are $(2, 0.2)$, $(10, 0.4)$, $(20, 0.4)$.
2. RMM1's components, in the form $(\sigma_i, w_i)$, are given by $(0.5, 1/3)$, $(2, 1/3)$, $(10, 1/3)$; RMM2 consists of $(5, 0.25)$, $(60, 0.25)$, $(100, 0.5)$.
Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Editor: MDPI
- Location: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Size: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Natural Sciences, Physics