Entropy 2016, 18, 442
3. GMM1's components, in the form (μi, σi, wi), are (−5, 1, 0.05), (−2, 0.5, 0.1), (5, 0.3, 0.2), (10, 0.5, 0.2), (15, 0.4, 0.05), (25, 0.5, 0.3), (30, 2, 0.1); GMM2 consists of (−16, 0.5, 0.1), (−12, 0.2, 0.1), (−8, 0.5, 0.1), (−4, 0.2, 0.1), (0, 0.5, 0.2), (4, 0.2, 0.1), (8, 0.5, 0.1), (12, 0.2, 0.1), (16, 0.5, 0.1).
4. GaMM1's components, in the form (ki, λi, wi), are (2, 0.5, 1/3), (2, 2, 1/3), (2, 4, 1/3); GaMM2 consists of (2, 5, 1/3), (2, 8, 1/3), (2, 10, 1/3).
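The component lists above translate directly into code. The sketch below (pure Python, names are ours) defines the four mixtures and their densities; note that we read λi as a rate parameter of the Gamma distribution, which is our assumption and may differ from the paper's convention:

```python
import math

# Components copied from the text: (mu_i, sigma_i, w_i) for the GMMs,
# (k_i, lambda_i, w_i) for the Gamma mixtures.
GMM1 = [(-5, 1, 0.05), (-2, 0.5, 0.1), (5, 0.3, 0.2), (10, 0.5, 0.2),
        (15, 0.4, 0.05), (25, 0.5, 0.3), (30, 2, 0.1)]
GMM2 = [(-16, 0.5, 0.1), (-12, 0.2, 0.1), (-8, 0.5, 0.1), (-4, 0.2, 0.1),
        (0, 0.5, 0.2), (4, 0.2, 0.1), (8, 0.5, 0.1), (12, 0.2, 0.1),
        (16, 0.5, 0.1)]
GaMM1 = [(2, 0.5, 1/3), (2, 2, 1/3), (2, 4, 1/3)]
GaMM2 = [(2, 5, 1/3), (2, 8, 1/3), (2, 10, 1/3)]

def gmm_pdf(x, components):
    """Density of a Gaussian mixture at a point x."""
    return sum(w / (s * math.sqrt(2 * math.pi))
               * math.exp(-0.5 * ((x - m) / s) ** 2)
               for m, s, w in components)

def gamm_pdf(x, components):
    """Density of a Gamma mixture at x > 0, with lambda taken as a rate
    (assumption): pdf = lam^k x^(k-1) exp(-lam x) / Gamma(k)."""
    return sum(w * lam ** k * x ** (k - 1) * math.exp(-lam * x)
               / math.gamma(k)
               for k, lam, w in components)

# The mixture weights of every model sum to one.
for model in (GMM1, GMM2, GaMM1, GaMM2):
    assert abs(sum(w for *_, w in model) - 1.0) < 1e-12
```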
We compare the proposed bounds with Monte Carlo estimation over different sample sizes in the range {10^2, 10^3, 10^4, 10^5}. For each sample size configuration, we report the 0.95 confidence interval of the Monte Carlo estimate using the corresponding number of samples. Figure 2a–d shows the input signals as well as the estimation results, where the proposed bounds CELB, CEUB, CEALB, CEAUB, CGQLB are presented as horizontal lines, and the Monte Carlo estimations over different sample sizes are presented as error bars. We can loosely consider the average Monte Carlo output with the largest sample size (10^5) as the underlying truth, which is clearly inside our bounds. This serves as an empirical justification of the correctness of the bounds.
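The Monte Carlo procedure described here — sampling from the first mixture, averaging log density ratios, and reporting a normal-approximation 0.95 confidence interval — can be sketched as follows. The two-component mixtures P and Q are hypothetical stand-ins, not the GMMs of the experiment:

```python
import math
import random

# Hypothetical (mu, sigma, w) mixtures standing in for GMM1/GMM2.
P = [(0.0, 1.0, 0.5), (4.0, 1.0, 0.5)]
Q = [(1.0, 1.5, 0.6), (5.0, 1.0, 0.4)]

def pdf(x, comps):
    """Gaussian-mixture density at x."""
    return sum(w / (s * math.sqrt(2 * math.pi))
               * math.exp(-0.5 * ((x - m) / s) ** 2)
               for m, s, w in comps)

def sample(comps, rng):
    """Draw one sample: pick a component by weight, then sample it."""
    u, acc = rng.random(), 0.0
    for m, s, w in comps:
        acc += w
        if u <= acc:
            break
    return rng.gauss(m, s)

def mc_kl(p, q, n, seed=0):
    """Monte Carlo estimate of KL(p:q) with a 0.95 confidence interval:
    KL(p:q) ~ (1/n) sum_i log(p(x_i)/q(x_i)), x_i ~ p."""
    rng = random.Random(seed)
    vals = [math.log(pdf(x, p) / pdf(x, q))
            for x in (sample(p, rng) for _ in range(n))]
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / (n - 1)
    half = 1.96 * math.sqrt(var / n)          # normal approximation
    return mean, (mean - half, mean + half)

est, (lo, hi) = mc_kl(P, Q, 10_000)
```

As the text notes, the interval only quantifies sampling noise; unlike the deterministic bounds, it carries no guarantee of containing the true KL.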
A key observation is that the bounds can be very tight, especially when the underlying KL divergence has a large magnitude, e.g., KL(RMM2 : RMM1). This is because the gap between the lower and upper bounds is always guaranteed to be within log k + log k′. Because KL is unbounded [4], in the general case two mixture models may have a large KL; our approximation gap is then relatively very small. On the other hand, we also observed that the bounds in certain cases, e.g., KL(EMM2 : EMM1), are not as tight as in the other cases. When the underlying KL is small, the bounds are not as informative as in the general case.
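To make the gap guarantee concrete: for the GMM1/GMM2 pair above, with k = 7 and k′ = 9 components, the worst-case gap between the combinatorial lower and upper bounds evaluates to about 4.14 nats, which is small relative to a large KL value:

```python
import math

# Worst-case gap between the combinatorial lower and upper bounds is
# log k + log k', where k and k' are the mixtures' component counts.
k, k_prime = 7, 9                      # GMM1 has 7 components, GMM2 has 9
gap = math.log(k) + math.log(k_prime)
print(f"worst-case gap: {gap:.3f} nats")   # about 4.143
```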
Comparatively, there is a significant improvement of the shape-dependent bounds (CEALB and CEAUB) over the combinatorial bounds (CELB and CEUB). In all investigated cases, the adaptive bounds roughly shrink the gap to half of its original size, at the cost of additional computation.
Note that the bounds are exact and must contain the true value, whereas Monte Carlo estimation gives no guarantee on where the true value is. For example, in estimating KL(GMM1 : GMM2), Monte Carlo estimation based on 10^4 samples can go beyond our bounds! It therefore suffers from a larger estimation error.
CGQLB, as a simple-to-implement technique, shows surprisingly good performance in several cases, e.g., KL(RMM1 : RMM2). Although it requires a large number of samples, we observe that increasing the sample size has a limited effect on improving this bound. Therefore, in practice, one may intersect the range defined by CEALB and CEAUB with the range defined by CGQLB with a small sample size (e.g., 100) to get better bounds.
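The intersection step is elementary: since both ranges are guaranteed to contain the true KL, so is their intersection. A minimal sketch, with hypothetical bound values (CGQLB is a lower bound only, so its range is one-sided):

```python
def intersect_bounds(lo1, hi1, lo2, hi2):
    """Intersect two intervals that both provably contain the true value."""
    lo, hi = max(lo1, lo2), min(hi1, hi2)
    if lo > hi:
        raise ValueError("intervals do not overlap; a bound must be invalid")
    return lo, hi

# Hypothetical numbers: [CEALB, CEAUB] intersected with [CGQLB, +inf).
cealb, ceaub = 1.8, 4.0
cgqlb = 2.1
lo, hi = intersect_bounds(cealb, ceaub, cgqlb, float("inf"))
# Here the CGQLB lower bound tightens the adaptive range to [2.1, 4.0].
```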
We simulate a set of Gaussian mixture models besides the above GMM1 and GMM2. Figure 3 shows the GMM densities as well as their differential entropy. A detailed explanation of the components of each GMM model is omitted for brevity.
The key observation is that CEUB (CEAUB) is very tight in most of the investigated cases. This is because the upper envelope used to compute CEUB (CEAUB) gives a very good estimation of the input signal.
Notice that MEUB only gives an upper bound of the differential entropy, as discussed in Section 3. In general the proposed bounds are tighter than MEUB. However, this is not the case when the mixture components are merged together and approximate one single Gaussian (and therefore its entropy can be well approximated by the Gaussian entropy), as shown in the last line of Figure 3.
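Our reading of MEUB is the classical maximum-entropy argument: among densities with fixed variance, the Gaussian maximizes differential entropy, so the entropy of a moment-matched Gaussian upper-bounds the mixture entropy. A sketch under that assumption, which also shows why the bound loosens for well-separated components:

```python
import math

def moment_matched_gaussian_entropy(components):
    """Entropy (in nats) of the Gaussian whose mean and variance match
    the mixture's: 0.5 * log(2*pi*e*var).  components: (mu, sigma, w)."""
    mean = sum(w * m for m, s, w in components)
    # Law of total variance: E[component var] + var of component means.
    var = sum(w * (s ** 2 + (m - mean) ** 2) for m, s, w in components)
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Well-separated components inflate the matched variance, loosening the
# bound; merged components make the mixture nearly Gaussian, so it is tight.
separated = [(0.0, 1.0, 0.5), (10.0, 1.0, 0.5)]   # hypothetical example
meub = moment_matched_gaussian_entropy(separated)
```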
For α-divergence, the bounds introduced in Sections 4.1–4.3 are denoted as "Basic", "Adaptive" and "VR", respectively. Figure 4 visualizes these GMMs and plots the estimations of their α-divergences against α. The red lines denote the upper envelope; the dashed vertical lines denote the elementary intervals. The components of GMM1 and GMM2 are more separated than those of GMM3 and GMM4, so the two pairs present different cases. For a clear presentation, only VR (which is expected to be better than Basic and Adaptive) is shown. We can see that, visually at a large scale, VR tightly surrounds the true value.
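As a point of comparison for the VR bound, the α-divergence D_α(p:q) = (1 − ∫ p^α q^{1−α} dx) / (α(1 − α)) can also be estimated by plain Monte Carlo, writing the integral as E_p[(q(x)/p(x))^{1−α}]. A sketch with hypothetical two-component mixtures (all names are ours):

```python
import math
import random

def pdf(x, comps):
    """Gaussian-mixture density at x; comps is a list of (mu, sigma, w)."""
    return sum(w / (s * math.sqrt(2 * math.pi))
               * math.exp(-0.5 * ((x - m) / s) ** 2)
               for m, s, w in comps)

def sample(comps, rng):
    """Pick a component by weight, then sample from it."""
    u, acc = rng.random(), 0.0
    for m, s, w in comps:
        acc += w
        if u <= acc:
            break
    return rng.gauss(m, s)

def mc_alpha_divergence(p, q, alpha, n=20_000, seed=0):
    """Monte Carlo estimate of
    D_alpha(p:q) = (1 - E_p[(q(x)/p(x))^(1-alpha)]) / (alpha*(1-alpha))."""
    rng = random.Random(seed)
    acc = sum((pdf(x, q) / pdf(x, p)) ** (1.0 - alpha)
              for x in (sample(p, rng) for _ in range(n)))
    return (1.0 - acc / n) / (alpha * (1.0 - alpha))

P = [(0.0, 1.0, 0.6), (3.0, 0.5, 0.4)]   # hypothetical mixtures
Q = [(0.5, 1.0, 0.5), (3.5, 0.5, 0.5)]
d = mc_alpha_divergence(P, Q, alpha=0.5)
```

Like the KL case, this estimator is noisy and carries no containment guarantee, which is what the deterministic bounds provide.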
Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Editor: MDPI
- Location: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Size: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Naturwissenschaften Physik