Page 296 in Differential Geometrical Theory of Statistics

Without loss of generality, consider GMMs in the form $m(x) = \sum_{i=1}^k w_i p(x; \mu_i, \Sigma_i)$ (with $\Sigma_i = \sigma_i^2$ for univariate Gaussians). The mean $\bar{\mu}$ of the mixture is $\bar{\mu} = \sum_{i=1}^k w_i \mu_i$ and the variance is $\bar{\sigma}^2 = E[m^2] - E[m]^2$. Since $E[m^2] = \sum_{i=1}^k w_i \int x^2 p(x; \mu_i, \Sigma_i)\,\mathrm{d}x = \sum_{i=1}^k w_i (\mu_i^2 + \sigma_i^2)$, we deduce that

$$\bar{\sigma}^2 = \sum_{i=1}^k w_i (\mu_i^2 + \sigma_i^2) - \left( \sum_{i=1}^k w_i \mu_i \right)^2 = \sum_{i=1}^k w_i \left[ (\mu_i - \bar{\mu})^2 + \sigma_i^2 \right].$$

The entropy of a random variable with a prescribed variance $\bar{\sigma}^2$ is maximal for the Gaussian distribution with that same variance $\bar{\sigma}^2$, see [4]. Since the differential entropy of a Gaussian is $\log(\bar{\sigma} \sqrt{2\pi e})$, we deduce that the entropy of the GMM is upper bounded by

$$H(m) \leq \frac{1}{2} \log(2\pi e) + \frac{1}{2} \log \sum_{i=1}^k w_i \left[ (\mu_i - \bar{\mu})^2 + \sigma_i^2 \right].$$

This upper bound is easily generalized to arbitrary dimensionality, and we get the following lemma:

Lemma 2. The entropy of a $d$-variate GMM $m(x) = \sum_{i=1}^k w_i p(x; \mu_i, \Sigma_i)$ is upper bounded by $\frac{d}{2} \log(2\pi e) + \frac{1}{2} \log \det \Sigma$, where $\Sigma = \sum_{i=1}^k w_i (\mu_i \mu_i^\top + \Sigma_i) - \left( \sum_{i=1}^k w_i \mu_i \right) \left( \sum_{i=1}^k w_i \mu_i \right)^\top$.

In general, exponential families have finite moments of any order [17]: in particular, we have $E[t(X)] = \nabla F(\theta)$ and $V[t(X)] = \nabla^2 F(\theta)$. For the Gaussian distribution, the sufficient statistic is $t(x) = (x, x^2)$, so that $E[t(X)] = \nabla F(\theta)$ yields the mean and variance from the log-normalizer. It is easy to generalize Lemma 2 to mixtures of exponential family distributions.

Note that this bound (called the Maximum Entropy Upper Bound in [13], MEUB) is tight when the GMM approximates a single Gaussian. It is fast to compute compared to the bound reported in [9], which uses a Taylor expansion of the log-sum of the mixture density.

A similar argument cannot be applied for a lower bound, since a GMM with a given variance may have entropy tending to $-\infty$. For example, assume the two-component mixture's mean is zero and that the variance approximates 1 by taking $m(x) = \frac{1}{2} G(x; -1, \epsilon) + \frac{1}{2} G(x; 1, \epsilon)$, where $G$ denotes the Gaussian density. Letting $\epsilon \to 0$, the entropy tends to $-\infty$.

We remark that our log-sum-exp inequality technique yields a $\log 2$ additive approximation range in the case of a Gaussian mixture with two components. It thus generalizes the bounds reported in [7] to GMMs with arbitrary variances that are not necessarily equal. To see the bound gap, we have

$$-\sum_r \int_{I_r} m(x) \left( \log k + \log \max_i w_i p_i(x) \right) \mathrm{d}x \;\leq\; H(m) \;\leq\; -\sum_r \int_{I_r} m(x) \max\left\{ \log \max_i w_i p_i(x),\; \log k + \log \min_i w_i p_i(x) \right\} \mathrm{d}x. \quad (27)$$

Therefore the gap is at most

$$\Delta = \min\left\{ \sum_r \int_{I_r} m(x) \log \frac{\max_i w_i p_i(x)}{\min_i w_i p_i(x)}\,\mathrm{d}x,\; \log k \right\} = \min\left\{ \sum_s \sum_r \int_{I_r} w_s p_s(x) \log \frac{\max_i w_i p_i(x)}{\min_i w_i p_i(x)}\,\mathrm{d}x,\; \log k \right\}. \quad (28)$$

Thus, to compute the gap error bound of the differential entropy, we need to integrate terms of the form $\int w_a p_a(x) \log \frac{w_b p_b(x)}{w_c p_c(x)}\,\mathrm{d}x$.
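
The MEUB and the divergent lower-bound example above are easy to check numerically. The following is a minimal sketch, not code from the paper: it assumes NumPy, and the helper names (gauss_pdf, gmm_pdf, meub, mc_entropy) are ours. It evaluates the univariate bound of Lemma 2, compares it to a Monte Carlo estimate of H(m), and then shows the entropy diverging to minus infinity as epsilon tends to 0 while the upper bound stays finite.

    import numpy as np

    rng = np.random.default_rng(0)

    def gauss_pdf(x, mu, sigma):
        # Univariate Gaussian density G(x; mu, sigma).
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    def gmm_pdf(x, w, mu, sigma):
        # Mixture density m(x) = sum_i w_i p(x; mu_i, sigma_i^2).
        return sum(wi * gauss_pdf(x, mi, si) for wi, mi, si in zip(w, mu, sigma))

    def meub(w, mu, sigma):
        # Maximum Entropy Upper Bound, univariate case of Lemma 2:
        # H(m) <= (1/2) log(2 pi e) + (1/2) log sum_i w_i [(mu_i - mubar)^2 + sigma_i^2].
        w, mu, sigma = map(np.asarray, (w, mu, sigma))
        mubar = np.dot(w, mu)
        var = np.dot(w, (mu - mubar) ** 2 + sigma ** 2)
        return 0.5 * np.log(2 * np.pi * np.e) + 0.5 * np.log(var)

    def mc_entropy(w, mu, sigma, n=200_000):
        # Monte Carlo estimate of H(m) = -E_m[log m(X)]: sample a component, then a point.
        comp = rng.choice(len(w), size=n, p=w)
        x = rng.normal(np.asarray(mu)[comp], np.asarray(sigma)[comp])
        return -np.mean(np.log(gmm_pdf(x, w, mu, sigma)))

    w, mu, sigma = [0.5, 0.5], [-1.0, 1.0], [0.5, 0.5]
    print("MEUB :", meub(w, mu, sigma))        # upper bound of Lemma 2
    print("H(m) :", mc_entropy(w, mu, sigma))  # Monte Carlo estimate, <= MEUB

    # The lower-bound counterexample: the mixture variance stays close to 1
    # (it equals 1 + eps^2), but H(m) -> -inf as eps -> 0 while the MEUB is finite.
    for eps in (0.5, 0.1, 0.01):
        print(f"eps={eps}: MEUB={meub(w, mu, [eps, eps]):.3f}, "
              f"H(m)~{mc_entropy(w, mu, [eps, eps]):.3f}")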
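
The final integral terms admit closed forms for Gaussian components, since the log-ratio of two Gaussian densities is a quadratic polynomial in x, but a quick numerical check already illustrates them. Continuing the sketch above (gap_term is our name, and SciPy's quad is used here as an assumed stand-in for a closed-form evaluation), this evaluates one term of (28) over a single piece I_r:

    from scipy.integrate import quad

    def gauss_logpdf(x, mu, sigma):
        # log G(x; mu, sigma), kept in log space for numerical stability.
        return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

    def gap_term(a, b, c, w, mu, sigma, lo, hi):
        # Numerically evaluate  int_lo^hi  w_a p_a(x) log( w_b p_b(x) / (w_c p_c(x)) ) dx,
        # the building block of the gap bound (28), by adaptive quadrature.
        def integrand(x):
            log_ratio = (np.log(w[b] / w[c])
                         + gauss_logpdf(x, mu[b], sigma[b])
                         - gauss_logpdf(x, mu[c], sigma[c]))
            return w[a] * np.exp(gauss_logpdf(x, mu[a], sigma[a])) * log_ratio
        val, _ = quad(integrand, lo, hi)
        return val

    # One term of (28) on the piece I_r = (-inf, 0], where component 0 attains
    # the max and component 1 the min for the symmetric mixture defined above.
    print(gap_term(a=0, b=0, c=1, w=w, mu=mu, sigma=sigma, lo=-np.inf, hi=0.0))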

Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco, Frank Nielsen
Publisher: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics