Differential Geometrical Theory of Statistics
Page - 257 -

Entropy 2016, 18, 277

Using the fact that the first term in $\hat{D}_\varphi(p_\phi, p_{\phi_T})$ does not depend on $\phi$, so that it does not count in the arg inf defining $\phi^{k+1}$, we easily get (7). The same applies for the case of (3). For notational simplicity, from now on, we redefine $D_\psi$ with a normalization by $n$, i.e.,

$$D_\psi(\phi, \phi^k) = \frac{1}{n} \sum_{i=1}^{n} \int_{\mathcal{X}} \psi\left(\frac{h_i(x|\phi)}{h_i(x|\phi^k)}\right) h_i(x|\phi^k)\, dx. \tag{10}$$

Hence, our set of algorithms is redefined by:

$$\phi^{k+1} = \arg\inf_{\phi}\; \hat{D}_\varphi(p_\phi, p_{\phi_T}) + D_\psi(\phi, \phi^k). \tag{11}$$

We will see later that this iteration forces the divergence to decrease and that, under suitable conditions, it converges to a (local) minimum of $\hat{D}_\varphi(p_\phi, p_{\phi_T})$. As a result, algorithm (11) is a way to calculate both the MD$\varphi$DE (4) and the kernel-based MD$\varphi$DE (5).

3. Some Convergence Properties of $\phi^k$

We show here how, according to some possible situations, one may prove convergence of the algorithm defined by (11). Let $\phi^0$ be a given initialization, and define

$$\Phi^0 := \{\phi \in \Phi : \hat{D}_\varphi(p_\phi, p_{\phi_T}) \le \hat{D}_\varphi(p_{\phi^0}, p_{\phi_T})\},$$

which we suppose to be a subset of $\mathrm{int}(\Phi)$. The idea of defining this set in this context is inherited from the paper of Wu [16], which provided the first correct proof of convergence for the EM algorithm. Before going any further, we recall the following definition of a (generalized) stationary point.

Definition 1. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a real-valued function. If $f$ is differentiable at a point $\phi^*$ such that $\nabla f(\phi^*) = 0$, we then say that $\phi^*$ is a stationary point of $f$. If $f$ is not differentiable at $\phi^*$ but the subgradient of $f$ at $\phi^*$, say $\partial f(\phi^*)$, exists such that $0 \in \partial f(\phi^*)$, then $\phi^*$ is called a generalized stationary point of $f$.

Remark 1. In the whole paper, the subgradient is defined for any function, not necessarily convex (see Definition 8.3 in [13] for more details).

We will be using the following assumptions:

A0. Functions $\phi \mapsto \hat{D}_\varphi(p_\phi | p_{\phi_T})$, $D_\psi$ are lower semicontinuous;
A1. Functions $\phi \mapsto \hat{D}_\varphi(p_\phi | p_{\phi_T})$, $D_\psi$ and $\nabla_1 D_\psi$ are defined and continuous on, respectively, $\Phi$, $\Phi \times \Phi$ and $\Phi \times \Phi$;
AC. Function $\phi \mapsto \nabla \hat{D}_\varphi(p_\phi | p_{\phi_T})$ is defined and continuous on $\Phi$;
A2. $\Phi^0$ is a compact subset of $\mathrm{int}(\Phi)$;
A3. $D_\psi(\phi, \bar{\phi}) > 0$ for all $\bar{\phi} \neq \phi \in \Phi$.
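Iteration (11) can be illustrated on a toy problem. The following is only a minimal sketch under several assumptions that are not taken from the text: the model is a two-component Gaussian mixture with hypothetical known means and a single unknown mixing proportion `lam`; $h_i(\cdot|\phi)$ is assumed to be the posterior distribution of the component label given the $i$-th observation, so the integral in (10) becomes a sum over two labels; the negative mean log-likelihood stands in for $\hat{D}_\varphi(p_\phi, p_{\phi_T})$; $\psi(t) = (t-1)^2/2$; and the arg inf is approximated by a grid search. All function names are hypothetical.

```python
import math

MEANS = (0.0, 3.0)  # hypothetical known component means, unit variances


def normal_pdf(y, mean):
    """Density of N(mean, 1) at y."""
    return math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)


def weighted_parts(y, lam):
    """Weighted component densities lam*f_0(y) and (1-lam)*f_1(y)."""
    return lam * normal_pdf(y, MEANS[0]), (1.0 - lam) * normal_pdf(y, MEANS[1])


def posterior(y, lam):
    """Assumed h_i(.|lam): posterior probabilities of the two labels given y."""
    a, b = weighted_parts(y, lam)
    return a / (a + b), b / (a + b)


def psi(t):
    """Quadratic choice: nonnegative, zero iff t == 1, derivative zero at 1."""
    return 0.5 * (t - 1.0) ** 2


def d_psi(lam, lam_k, data):
    """Equation (10), with the integral over x replaced by a sum over labels."""
    total = 0.0
    for y in data:
        new, old = posterior(y, lam), posterior(y, lam_k)
        total += sum(psi(hn / ho) * ho for hn, ho in zip(new, old))
    return total / len(data)


def objective(lam, data):
    """Stand-in for the estimated divergence: negative mean log-likelihood."""
    return -sum(math.log(sum(weighted_parts(y, lam))) for y in data) / len(data)


def proximal_step(lam_k, data, grid):
    """One step of (11) by grid search; lam_k itself is kept as a candidate,
    so the objective value can never increase from one iterate to the next."""
    candidates = list(grid) + [lam_k]
    return min(candidates,
               key=lambda lam: objective(lam, data) + d_psi(lam, lam_k, data))


def run_proximal(data, lam0, n_iter=15):
    """Return the list of iterates lam^0, ..., lam^n_iter."""
    grid = [i / 100.0 for i in range(1, 100)]
    path = [lam0]
    for _ in range(n_iter):
        path.append(proximal_step(path[-1], data, grid))
    return path


if __name__ == "__main__":
    toy_data = [-1.0, -0.5, 0.1, 0.4, 2.6, 2.9, 3.2, 3.5]  # made-up sample
    for k, lam in enumerate(run_proximal(toy_data, 0.9)):
        print(k, lam, objective(lam, toy_data))
```

Because $D_\psi(\phi^k, \phi^k) = 0$ and $D_\psi \ge 0$, keeping the current iterate among the candidates guarantees that the stand-in objective is non-increasing along the path, which mirrors the decrease property announced after (11). A grid search is used only to keep the sketch dependency-free; any one-dimensional minimizer could replace it.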
Recall also that we suppose that $h_i(x|\phi) > 0$, $dx$-a.e. We relax the convexity assumption of function $\psi$. We only suppose that $\psi$ is nonnegative and $\psi(t) = 0$ iff $t = 1$. In addition, $\psi'(t) = 0$ if $t = 1$.

Continuity and differentiability assumptions of function $\phi \mapsto \hat{D}_\varphi(p_\phi | p_{\phi_T})$ for the case of (3) can be easily checked using Lebesgue theorems. The continuity assumption for the case of (2) can be checked using Theorem 1.17 or Corollary 10.14 in [13]. Differentiability can also be checked using Corollary 10.14 or Theorem 10.31 in the same book. As for $D_\psi$, continuity and differentiability can be obtained merely by fulfilling the conditions of Lebesgue theorems. When working with mixture models, we only need the continuity and differentiability of $\psi$ and functions $h_i$. The latter is easily deduced from regularity assumptions on the model. For assumption A2, there is no universal method; see Section 4.2 for an example. Assumption A3 can be checked using Lemma 2 in [2].
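The three requirements on $\psi$ stated above (nonnegativity, $\psi(t) = 0$ iff $t = 1$, and $\psi'(1) = 0$) can be screened numerically for a candidate function. The sketch below is illustrative only: the helper names and the candidate functions are hypothetical choices, not taken from the text, and a check on a finite grid is a necessary test rather than a proof. The bounded candidate $(t-1)^2/(1+t^2)$ is included as an example of a non-convex function that still passes, in line with the relaxed convexity assumption.

```python
import math


def satisfies_psi_conditions(psi, step=0.05, n_points=200, tol=1e-6):
    """Grid check of: psi(1) = 0, psi(t) > 0 for t != 1, and psi'(1) = 0."""
    if abs(psi(1.0)) > 1e-12:
        return False
    for i in range(1, n_points + 1):
        t = step * i
        if abs(t - 1.0) < 1e-12:
            continue  # t = 1 is handled separately above
        if psi(t) <= 0.0:
            return False
    h = 1e-5
    # Central finite difference as an approximation of psi'(1).
    slope_at_one = (psi(1.0 + h) - psi(1.0 - h)) / (2.0 * h)
    return abs(slope_at_one) < tol


def violates_midpoint_convexity(psi, a, b):
    """True if psi at the midpoint exceeds the average of the endpoint values."""
    return psi(0.5 * (a + b)) > 0.5 * (psi(a) + psi(b))


def kl_type(t):
    return t * math.log(t) - t + 1.0


def quadratic(t):
    return 0.5 * (t - 1.0) ** 2


def bounded(t):
    return (t - 1.0) ** 2 / (1.0 + t * t)


def linear(t):
    return t - 1.0  # negative for t < 1, so it must be rejected


if __name__ == "__main__":
    for name, f in [("kl_type", kl_type), ("quadratic", quadratic),
                    ("bounded", bounded), ("linear", linear)]:
        print(name, satisfies_psi_conditions(f))
    print("bounded is non-convex:", violates_midpoint_convexity(bounded, 2.0, 10.0))
```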
Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco, Frank Nielsen
Editor: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics