Page 264 of Differential Geometrical Theory of Statistics

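Entropy 2016, 18, 277

Conclusion 1. Using Propositions 4 and 1, if $\Phi = [\eta, 1-\eta] \times [\mu_{\min}, \mu_{\max}]^2$, the sequence $(\hat{D}_{\phi}(p_{\varphi^k}, p_{\varphi_T}))_k$ defined through Formula (2) converges, and there exists a subsequence $(\varphi^{N(k)})$ which converges to a stationary point of the estimated divergence. Moreover, every limit point of the sequence $(\varphi^k)_k$ is a stationary point of the estimated divergence.

If we use the kernel-based dual estimator given by (3) with a Gaussian kernel density estimator, then the function $\varphi \mapsto \hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T})$ is continuously differentiable over $\Phi$ even if the means $\mu_1$ and $\mu_2$ are not bounded. For example, take $\phi = \phi_\gamma$ defined by (1). There is one condition which relates the window of the kernel, say $w$, to the value of $\gamma$. Indeed, using Formula (3), we can write

$$\hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T}) = \frac{1}{\gamma-1} \int \frac{p_{\varphi}^{\gamma}}{K_{n,w}^{\gamma-1}}(y)\,dy - \frac{1}{\gamma n} \sum_{i=1}^{n} \frac{p_{\varphi}^{\gamma}}{K_{n,w}^{\gamma}}(y_i) - \frac{1}{\gamma(\gamma-1)}.$$

In order to study the continuity and the differentiability of the estimated divergence with respect to $\varphi$, it suffices to study the integral term. We have

$$\frac{p_{\varphi}^{\gamma}}{K_{n,w}^{\gamma-1}}(y) = \frac{\left( \frac{\lambda}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}(y-\mu_1)^2\right] + \frac{1-\lambda}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}(y-\mu_2)^2\right] \right)^{\gamma}}{\left( \frac{1}{nw} \sum_{i=1}^{n} \exp\left[-\frac{(y-y_i)^2}{2w^2}\right] \right)^{\gamma-1}}.$$

The dominating term at infinity in the numerator is $\exp(-\gamma y^2/2)$, whereas it is $\exp(-(\gamma-1)y^2/(2w^2))$ in the denominator. For the integrand to be bounded by an integrable function independently of $\varphi = (\lambda, \mu)$, it suffices that $-\gamma + (\gamma-1)/w^2 < 0$, that is, $-\gamma w^2 + \gamma - 1 < 0$, which is equivalent to $\gamma(w^2-1) > -1$. This argument also holds if we differentiate the integrand with respect to $\lambda$ or either of the means $\mu_1$ or $\mu_2$. For $\gamma = 2$ (Pearson's $\chi^2$), we need $w^2 > 1/2$. For $\gamma = 1/2$ (the Hellinger), there is no condition on $w$.

Closedness of $\Phi^0$ is proved similarly to the previous case. Boundedness, however, must be treated differently since $\Phi$ is not necessarily compact and is supposed to be $\Phi = [\eta, 1-\eta] \times \mathbb{R}^2$. For simplicity, take $\phi = \phi_\gamma$. The idea is to choose $\varphi^0$, an initialization for the proximal algorithm, in such a way that $\Phi^0$ does not include unbounded values of the means. Continuity of $\varphi \mapsto \hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T})$ permits calculation of the limits when either (or both) of the means tends to infinity. If both means go to infinity, then $p_{\varphi}(x) \to 0$ for all $x$.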
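Two of the facts used so far lend themselves to a quick numerical check. The following Python sketch is only an illustration and not part of the original derivation: it tests the window condition $-\gamma w^2 + \gamma - 1 < 0$ for given $\gamma$ and $w$, and it evaluates the two-component Gaussian mixture at a few fixed points while both means move away. The function names, the proportion 0.35, the window values and the evaluation points are all hypothetical choices made for this example.

```python
import math


def window_condition_holds(gamma: float, w: float) -> bool:
    """Return True when -gamma*w^2 + gamma - 1 < 0 (tail condition from the text)."""
    return -gamma * w ** 2 + gamma - 1.0 < 0.0


def mixture_pdf(y: float, lam: float, mu1: float, mu2: float) -> float:
    """Two-component Gaussian mixture with unit variances, as in the text."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return (lam * c * math.exp(-0.5 * (y - mu1) ** 2)
            + (1.0 - lam) * c * math.exp(-0.5 * (y - mu2) ** 2))


# Pearson chi-square (gamma = 2): the condition reduces to w^2 > 1/2.
chi2_narrow = window_condition_holds(2.0, 0.5)  # w^2 = 0.25, assumed window
chi2_wide = window_condition_holds(2.0, 0.8)    # w^2 = 0.64, assumed window

# Hellinger (gamma = 1/2): the condition holds for every tested window.
hellinger_all = all(window_condition_holds(0.5, w) for w in (0.01, 0.5, 1.0, 10.0))

# Pointwise decay: fixed evaluation points (assumed), both means pushed outwards.
points = (-2.0, 0.0, 3.0)
decay = [
    max(mixture_pdf(y, 0.35, m, m + 1.0) for y in points)
    for m in (5.0, 10.0, 20.0)
]

print(chi2_narrow, chi2_wide, hellinger_all, decay)
```

Under these assumed values, the condition fails for $\gamma = 2$ with $w = 0.5$ and holds with $w = 0.8$, in line with the requirement $w^2 > 1/2$, and the largest density value over the fixed points shrinks as the means grow.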
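Thus, for $\gamma \in (0,\infty)\setminus\{1\}$, we have $\hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T}) \to \frac{1}{\gamma(\gamma-1)}$. For $\gamma < 0$, the limit is infinity. If only one of the means tends to $\infty$, then the corresponding component vanishes from the mixture. Thus, if we choose $\varphi^0$ such that

$$\hat{D}_{\phi}(p_{\varphi^0}, p_{\varphi_T}) < \min\left( \frac{1}{\gamma(\gamma-1)},\ \inf_{\lambda,\mu} \hat{D}_{\phi}(p_{(\lambda,\infty,\mu)}, p_{\varphi_T}) \right) \quad \text{if } \gamma \in (0,\infty)\setminus\{1\}, \tag{18}$$

$$\hat{D}_{\phi}(p_{\varphi^0}, p_{\varphi_T}) < \inf_{\lambda,\mu} \hat{D}_{\phi}(p_{(\lambda,\infty,\mu)}, p_{\varphi_T}) \quad \text{if } \gamma < 0, \tag{19}$$

then the algorithm starts at a point of $\Phi$ whose function value is lower than the limits of $\hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T})$ at infinity. By Proposition 1, the algorithm continues to decrease the value of $\hat{D}_{\phi}(p_{\varphi}, p_{\varphi_T})$ and never goes back to the limits at infinity. In addition, the definition of $\Phi^0$ lets us conclude that if $\varphi^0$ is chosen according to conditions (18) and (19), then $\Phi^0$ is bounded. Thus, $\Phi^0$ becomes compact. Unfortunately, the value of $\inf_{\lambda,\mu} \hat{D}_{\phi}(p_{(\lambda,\infty,\mu)}, p_{\varphi_T})$ can only be calculated numerically. We will see next that in the case of the likelihood function, a similar condition will be imposed for the compactness of $\Phi^0$, and there will be no need for any numerical calculation.

Conclusion 2. Using Propositions 4 and 1, under conditions (18) and (19), the sequence $(\hat{D}_{\phi}(p_{\varphi^k}, p_{\varphi_T}))_k$ defined through Formula (3) converges and there exists a subsequence $(\varphi^{N(k)})$ that converges to a stationary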

Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco; Frank Nielsen
Publisher: MDPI
Place: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Dimensions: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics