Entropy 2016, 18, 442
Let us partition the support $\mathcal{X} = \biguplus_{r=1}^{\ell} I_r$ arbitrarily into $\ell$ elementary ranges, which do not necessarily correspond to the envelopes. Denote by $M_I$ the probability mass of a mixture $m(x)$ in the range $I$: $M_I = \int_I m(x)\,\mathrm{d}x$. Then

$$D_f(m:m') = \sum_{r=1}^{\ell} M_{I_r} \int_{I_r} \frac{m(x)}{M_{I_r}}\, f\!\left(\frac{m'(x)}{m(x)}\right) \mathrm{d}x.$$
Note that in the range $I_r$, $m(x)/M_{I_r}$ is a unit weight distribution. Thus, by Jensen's inequality $f(E[X]) \le E[f(X)]$ for convex $f$, we get
$$D_f(m:m') \ge \sum_{r=1}^{\ell} M_{I_r}\, f\!\left(\int_{I_r} \frac{m(x)}{M_{I_r}} \frac{m'(x)}{m(x)}\,\mathrm{d}x\right) = \sum_{r=1}^{\ell} M_{I_r}\, f\!\left(\frac{M'_{I_r}}{M_{I_r}}\right). \tag{52}$$
Notice that the RHS of Equation (52) is the $f$-divergence between $(M_{I_1}, \cdots, M_{I_\ell})$ and $(M'_{I_1}, \cdots, M'_{I_\ell})$, denoted by $D^I_f(m:m')$. In the special case that $\ell = 1$ and $I_1 = \mathcal{X}$, Equation (52) turns out to be the usual Gibbs inequality $D_f(m:m') \ge f(1)$, and the Csiszár generator is chosen so that $f(1) = 0$. In conclusion, for a fixed (coarse-grained) countable partition of $\mathcal{X}$, we recover the well-known information monotonicity [46] of the $f$-divergences:

$$D_f(m:m') \ge D^I_f(m:m') \ge 0.$$
In practice, we get closed-form lower bounds when $M_I = \int_a^b m(x)\,\mathrm{d}x = \Phi(b) - \Phi(a)$ is available in closed form, where $\Phi(\cdot)$ denotes the CDF. In particular, if $m(x)$ is a mixture model, then its CDF can be computed by linearly combining the CDFs of its components.
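As a minimal sketch of this recipe, the following computes the coarse-grained lower bound $D^I_f$ for the exponential mixtures EMM1 and EMM2 defined later in Section 6, taking the Csiszár generator of the Kullback-Leibler divergence $f(u) = -\log u$. The partition cut points and all helper names are our own illustrative choices, not from the paper:

```python
import math

def exp_mix_cdf(x, params):
    """CDF of an exponential mixture: a linear combination of component CDFs."""
    return sum(w * (1.0 - math.exp(-lam * x)) for lam, w in params)

def coarse_grained_f_div(params_m, params_mp, cuts, f):
    """D_f^I(m : m') = sum_r M_{I_r} f(M'_{I_r} / M_{I_r}) over the partition
    of the support [0, inf) induced by the finite cut points."""
    edges = [0.0] + sorted(cuts) + [math.inf]
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        M = exp_mix_cdf(b, params_m) - exp_mix_cdf(a, params_m)    # mass of m on I_r
        Mp = exp_mix_cdf(b, params_mp) - exp_mix_cdf(a, params_mp)  # mass of m' on I_r
        if M > 0.0 and Mp > 0.0:  # guard against floating-point underflow in the tails
            total += M * f(Mp / M)
    return total

# Csiszar generator of the Kullback-Leibler divergence: f(u) = -log(u), f(1) = 0.
f_kl = lambda u: -math.log(u)

# EMM1 and EMM2 from Section 6, with components written as (lambda_i, w_i).
emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

lb = coarse_grained_f_div(emm1, emm2, cuts=[0.5, 1.0, 2.0, 5.0], f=f_kl)
print(lb)  # a guaranteed lower bound on KL(EMM1 : EMM2); nonnegative since f(1) = 0
```

The bound equals the discrete KL divergence between the two mass vectors, so it is nonnegative by the Gibbs inequality and can only improve as the partition is refined.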
To wrap up, we have proved that coarse-graining by making a finite partition of the support $\mathcal{X}$ yields a lower bound on the $f$-divergence by virtue of the information monotonicity. Therefore, instead of performing Monte Carlo stochastic integration:

$$\hat{D}^n_f(m:m') = \frac{1}{n} \sum_{i=1}^{n} f\!\left(\frac{m'(x_i)}{m(x_i)}\right),$$
with $x_1, \ldots, x_n \sim_{\mathrm{i.i.d.}} m(x)$, it could be better to sort those $n$ samples and consider the coarse-grained partition:

$$I = (-\infty, x_{(1)}] \cup \left(\biguplus_{i=1}^{n-1} (x_{(i)}, x_{(i+1)}]\right) \cup (x_{(n)}, +\infty)$$

to get a guaranteed lower bound on the $f$-divergence. We will call this bound CGQLB, for Coarse-Graining Quantization Lower Bound.
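A sketch of the CGQLB next to the plain Monte Carlo estimate, again assuming the KL generator $f(u) = -\log u$ and the exponential mixtures EMM1 and EMM2 of Section 6 (sampler, seed, and helper names are our own; the survival function $1 - \Phi(x)$ is used directly to keep the tail masses accurate):

```python
import math
import random

# EMM1 and EMM2 from Section 6, components as (lambda_i, w_i).
emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

def pdf(x, params):
    return sum(w * lam * math.exp(-lam * x) for lam, w in params)

def sf(x, params):
    # Survival function 1 - Phi(x), computed directly to avoid cancellation in the tail.
    return sum(w * math.exp(-lam * x) for lam, w in params)

def sample(params, n, rng):
    xs = []
    for _ in range(n):
        u, acc = rng.random(), 0.0
        for lam, w in params:      # pick a component by its weight...
            acc += w
            if u <= acc:
                break
        xs.append(rng.expovariate(lam))  # ...then draw from that exponential
    return xs

f_kl = lambda u: -math.log(u)  # Csiszar generator of the KL divergence

rng = random.Random(0)
n = 200
xs = sorted(sample(emm1, n, rng))

# Plain Monte Carlo estimate of KL(EMM1 : EMM2): unbiased, but no guarantee.
mc = sum(f_kl(pdf(x, emm2) / pdf(x, emm1)) for x in xs) / n

# CGQLB: masses over the order-statistics partition; support is [0, inf) here.
edges = [0.0] + xs + [math.inf]
cgqlb = 0.0
for a, b in zip(edges[:-1], edges[1:]):
    M = sf(a, emm1) - sf(b, emm1)
    Mp = sf(a, emm2) - sf(b, emm2)
    if M > 0.0 and Mp > 0.0:  # guard against floating-point underflow
        cgqlb += M * f_kl(Mp / M)

print("MC estimate:", mc, "CGQLB:", cgqlb)
```

Unlike the Monte Carlo value, the CGQLB is a deterministic function of the drawn points and is guaranteed to sit below the true $f$-divergence.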
Given a budget of $n$ splitting points on the range $\mathcal{X}$, it would be interesting to find the best $n$ points that maximize the lower bound $D^I_f(m:m')$. This is ongoing research.
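While optimal placement is open, information monotonicity already guarantees that refining a partition can only tighten the bound. A quick numerical check of this, using nested uniform grids and the exponential mixtures EMM1 and EMM2 of Section 6 with the KL generator (grid range and helper names are our own illustrative choices):

```python
import math

emm1 = [(0.1, 1/3), (0.5, 1/3), (1.0, 1/3)]
emm2 = [(2.0, 0.2), (10.0, 0.4), (20.0, 0.4)]

def cdf(x, params):
    return sum(w * (1.0 - math.exp(-lam * x)) for lam, w in params)

def bound(cuts):
    """Coarse-grained KL lower bound over the partition of [0, inf) given by cuts."""
    edges = [0.0] + sorted(cuts) + [math.inf]
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        M = cdf(b, emm1) - cdf(a, emm1)
        Mp = cdf(b, emm2) - cdf(a, emm2)
        if M > 0.0 and Mp > 0.0:
            total += M * (-math.log(Mp / M))   # KL generator f(u) = -log(u)
    return total

# Nested uniform grids on (0, 10]: each grid refines the previous one, so the
# lower bounds must be non-decreasing by information monotonicity.
bounds = [bound([i * h for i in range(1, int(10 / h))]) for h in (2.0, 1.0, 0.5, 0.25)]
print(bounds)
```

The printed sequence increases toward the true divergence, illustrating that a larger budget of well-placed points yields a tighter CGQLB.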
6. Experiments
We perform an empirical study to verify our theoretical bounds. We simulate four pairs of mixture models {(EMM1, EMM2), (RMM1, RMM2), (GMM1, GMM2), (GaMM1, GaMM2)} as the test subjects. The component type is implied by the model name, where GaMM stands for Gamma mixtures. The components of each mixture model are given as follows.
1. EMM1's components, in the form $(\lambda_i, w_i)$, are given by $(0.1, 1/3)$, $(0.5, 1/3)$, $(1, 1/3)$; EMM2's components are $(2, 0.2)$, $(10, 0.4)$, $(20, 0.4)$.
2. RMM1's components, in the form $(\sigma_i, w_i)$, are given by $(0.5, 1/3)$, $(2, 1/3)$, $(10, 1/3)$; RMM2 consists of $(5, 0.25)$, $(60, 0.25)$, $(100, 0.5)$.
Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Editor: MDPI
- Location: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Size: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Natural Sciences, Physics