Page - 309 - in Differential Geometrical Theory of Statistics

Entropy 2016, 18, 442

We require that the Jeffreys divergence between mixtures be finite in order to approximate the KL between mixtures by a Bregman divergence. We loosely derive this observation (careful derivations will be reported elsewhere) using two different approaches:

• First, continuous mixture distributions have smooth densities that can be arbitrarily closely approximated using a single distribution (potentially multi-modal) belonging to the Polynomial Exponential Families [53,54] (PEFs). A polynomial exponential family of order $D$ has log-likelihood $l(x;\theta) \propto \sum_{i=1}^{D} \theta_i x^i$. Therefore, a PEF is an exponential family with polynomial sufficient statistics $t(x) = (x, x^2, \dots, x^D)$. However, the log-normalizer $F_D(\theta) = \log \int \exp(\theta^\top t(x))\,\mathrm{d}x$ of a $D$-order PEF is not available in closed form: it is computationally intractable. Nevertheless, the KL between two mixtures $m(x)$ and $m'(x)$ can theoretically be closely approximated by a Bregman divergence between the two corresponding PEFs: $\mathrm{KL}(m(x) : m'(x)) \simeq \mathrm{KL}(p(x;\theta) : p(x;\theta')) = B_{F_D}(\theta' : \theta)$, where $\theta$ and $\theta'$ are the natural parameters of the PEF family $\{p(x;\theta)\}$ approximating $m(x)$ and $m'(x)$, respectively (i.e., $m(x) \simeq p(x;\theta)$ and $m'(x) \simeq p(x;\theta')$). Notice that the Bregman divergence of PEFs necessarily has a finite value, but the KL of two smooth mixtures can potentially diverge (infinite value); hence the condition that the Jeffreys divergence be finite.

• Second, consider two finite mixtures $m(x) = \sum_{i=1}^{k} w_i p_i(x)$ and $m'(x) = \sum_{j=1}^{k'} w'_j p'_j(x)$ of $k$ and $k'$ components (possibly with heterogeneous components $p_i(x)$'s and $p'_j(x)$'s), respectively. In information geometry, a mixture family is the set of convex combinations of fixed component densities. Thus, in statistics a mixture is understood as a convex combination of parametric components, while in information geometry a mixture family is the set of convex combinations of fixed components. Let us consider the mixture family $\{g(x;(w,w'))\}$ generated by the $D = k + k'$ fixed components $p_1(x), \dots, p_k(x), p'_1(x), \dots, p'_{k'}(x)$:

$$\left\{ g(x;(w,w')) = \sum_{i=1}^{k} w_i p_i(x) + \sum_{j=1}^{k'} w'_j p'_j(x) \;:\; \sum_{i=1}^{k} w_i + \sum_{j=1}^{k'} w'_j = 1 \right\}$$

We can approximate mixture $m(x)$ arbitrarily finely (with respect to total variation) for any $\epsilon > 0$ by $g(x;\alpha) = (1-\epsilon)\,m(x) + \epsilon\, m'(x)$ with $\alpha = ((1-\epsilon)w, \epsilon w')$ (so that $\sum_{i=1}^{k+k'} \alpha_i = 1$), and $m'(x) \simeq g(x;\alpha') = \epsilon\, m(x) + (1-\epsilon)\, m'(x)$ with $\alpha' = (\epsilon w, (1-\epsilon)w')$ (and $\sum_{i=1}^{k+k'} \alpha'_i = 1$). Therefore $\mathrm{KL}(m(x) : m'(x)) \simeq \mathrm{KL}(g(x;\alpha) : g(x;\alpha')) = B_{F^*}(\alpha : \alpha')$, where $F^*(\alpha) = \int g(x;\alpha) \log g(x;\alpha)\,\mathrm{d}x$ is the Shannon information (negative Shannon entropy) for the composite mixture family. Again, the Bregman divergence $B_{F^*}(\alpha : \alpha')$ is necessarily finite, but $\mathrm{KL}(m(x) : m'(x))$ between mixtures may potentially be infinite when the KL integral diverges (hence the condition on Jeffreys divergence finiteness). Interestingly, this Shannon information can be arbitrarily closely approximated when considering isotropic Gaussians [13]. Notice that the convex conjugate $F(\theta)$ of the continuous Shannon neg-entropy $F^*(\eta)$ is the log-sum-exp function on the inverse soft map.
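The second construction above can be checked numerically. The sketch below (Python, with hypothetical Gaussian components and weights chosen only for illustration, not taken from the paper) builds the composite mixture family, clamps $m$ and $m'$ into the interior of the weight simplex with $\epsilon = 10^{-3}$, and compares $\mathrm{KL}(m : m')$ against the Bregman divergence $B_{F^*}(\alpha : \alpha') = F^*(\alpha) - F^*(\alpha') - \langle \alpha - \alpha', \nabla F^*(\alpha')\rangle$, with all integrals approximated by crude Riemann sums on a fixed grid; the grid, tolerance, and parameters are assumptions.

```python
import math

def gauss(x, mu, sigma):
    """Univariate normal density N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical fixed components and weights (illustration only).
comps_m  = [(0.0, 1.0), (3.0, 0.5)]    # components p_1, p_2 of m(x)
comps_mp = [(-1.0, 0.8), (2.0, 1.5)]   # components p'_1, p'_2 of m'(x)
w, wp = [0.6, 0.4], [0.3, 0.7]         # mixture weights of m and m'
comps = comps_m + comps_mp             # the D = k + k' fixed components

# Crude Riemann-sum quadrature on a wide grid (an assumption, not the paper's method).
DX = 0.005
GRID = [-15.0 + i * DX for i in range(int(30.0 / DX))]

def integrate(f):
    return sum(f(x) for x in GRID) * DX

def g(x, alpha):
    """Composite mixture-family density g(x; alpha)."""
    return sum(a * gauss(x, mu, s) for a, (mu, s) in zip(alpha, comps))

def F_star(alpha):
    """Shannon information F*(alpha) = \\int g log g dx (negative entropy)."""
    return integrate(lambda x: g(x, alpha) * math.log(g(x, alpha)))

def grad_F_star(alpha):
    """Partial derivatives dF*/dalpha_i = \\int p_i(x) log g(x; alpha) dx + 1."""
    return [integrate(lambda x: gauss(x, mu, s) * math.log(g(x, alpha))) + 1.0
            for (mu, s) in comps]

def bregman(a, b):
    """B_{F*}(a : b) = F*(a) - F*(b) - <a - b, grad F*(b)>."""
    gb = grad_F_star(b)
    return F_star(a) - F_star(b) - sum((ai - bi) * gi for ai, bi, gi in zip(a, b, gb))

# Clamp m and m' into the interior of the weight simplex.
eps = 1e-3
alpha  = [(1 - eps) * wi for wi in w] + [eps * wj for wj in wp]
alphap = [eps * wi for wi in w] + [(1 - eps) * wj for wj in wp]

def m_(x):
    return sum(wi * gauss(x, mu, s) for wi, (mu, s) in zip(w, comps_m))

def mp_(x):
    return sum(wj * gauss(x, mu, s) for wj, (mu, s) in zip(wp, comps_mp))

kl = integrate(lambda x: m_(x) * math.log(m_(x) / mp_(x)))
bd = bregman(alpha, alphap)
print(f"KL(m : m')           = {kl:.4f}")
print(f"B_F*(alpha : alpha') = {bd:.4f}")  # close to the KL for small eps
```

For these bounded-ratio mixtures the two numbers agree to roughly $O(\epsilon)$; the Bregman value stays finite by construction, whereas the KL integral itself could diverge for other component choices, which is exactly the role of the Jeffreys finiteness condition in the text.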
Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco; Frank Nielsen
Editor: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Naturwissenschaften / Physik (Natural Sciences / Physics)