point of the estimated divergence. Moreover, every limit point of the sequence $(\varphi^k)_k$ is a stationary point of the estimated divergence.
In the case of the likelihood, i.e., $\phi(t) = -\log(t) + t - 1$, the set $\Phi^0$ can be written as:
$$\Phi^0 = \left\{ \varphi \in \Phi :\ J_N(\varphi) \geq J_N(\varphi^0) \right\} = J_N^{-1}\left( \left[ J_N(\varphi^0), +\infty \right) \right),$$
where $J_N$ is the log-likelihood function of the Gaussian mixture model. The log-likelihood function $J_N$ is clearly of class $C^1(\mathrm{int}(\Phi))$. We prove that $\Phi^0$ is closed and bounded, which is sufficient to conclude its compactness, since the space $[\eta, 1-\eta] \times \mathbb{R}^2$ equipped with the Euclidean distance is complete.
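For reference, writing out $J_N$ for the two-component, unit-variance Gaussian mixture used in this example (a reconstruction from the component densities appearing below, with mixing weight $\lambda \in [\eta, 1-\eta]$ and means $\mu_1, \mu_2$; not quoted from the text):
$$J_N(\varphi) = J_N(\lambda, \mu_1, \mu_2) = \sum_{i=1}^{n} \log\left( \frac{\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_1)^2} + \frac{1-\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2} \right).$$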
Closedness. The set $\Phi^0$ is the inverse image of a closed set under a continuous function (the log-likelihood). Therefore, it is closed in $[\eta, 1-\eta] \times \mathbb{R}^2$.
Boundedness. By contradiction, suppose that $\Phi^0$ is unbounded; then there exists a sequence $(\varphi^l)_l$ which tends to infinity. Since $\lambda^l \in [\eta, 1-\eta]$, either $\mu_1^l$ or $\mu_2^l$ tends to infinity. Suppose that both $\mu_1^l$ and $\mu_2^l$ tend to infinity; then $J_N(\varphi^l) \to -\infty$, because the mixture density tends to $0$ at every observation $y_i$, so that each log term diverges to $-\infty$. Any finite initialization $\varphi^0$ implies that $J_N(\varphi^0) > -\infty$, so that $\forall \varphi \in \Phi^0,\ J_N(\varphi) \geq J_N(\varphi^0) > -\infty$. Thus, it is impossible for both $\mu_1^l$ and $\mu_2^l$ to go to infinity.
Suppose that $\mu_1^l \to \infty$ and that $\mu_2^l$ converges (or that $\mu_2^l$ is bounded, in which case we extract a convergent subsequence) to $\mu_2$. The limit of the likelihood has the form:
$$L(\lambda, \infty, \mu_2) = \prod_{i=1}^{n} \frac{1-\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2},$$
which is bounded by its value for $\lambda = 0$ and $\mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i$. Indeed, since $1 - \lambda \leq 1$, we have:
$$L(\lambda, \infty, \mu_2) \leq \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2}.$$
The right-hand side of this inequality is the likelihood of a Gaussian model $\mathcal{N}(\mu_2, 1)$, so it is maximized when $\mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i$. Thus, if $\varphi^0$ is chosen in such a way that $J_N(\varphi^0) > J_N\left(0, \infty, \frac{1}{n}\sum_{i=1}^{n} y_i\right)$, the case where $\mu_1$ tends to infinity and $\mu_2$ is bounded can never occur. For the other case, where $\mu_2 \to \infty$ and $\mu_1$ is bounded, we choose $\varphi^0$ in such a way that $J_N(\varphi^0) > J_N\left(1, \frac{1}{n}\sum_{i=1}^{n} y_i, \infty\right)$. In conclusion, with a choice of $\varphi^0$ such that:
$$J_N(\varphi^0) > \max\left[ J_N\!\left(0, \infty, \frac{1}{n}\sum_{i=1}^{n} y_i\right),\ J_N\!\left(1, \frac{1}{n}\sum_{i=1}^{n} y_i, \infty\right) \right], \qquad (20)$$
the set $\Phi^0$ is bounded.
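The maximization step invoked above is the standard one for a Gaussian location model; as a one-line check (standard calculus, not spelled out in the text), setting the $\mu_2$-derivative of the single-Gaussian log-likelihood to zero gives the sample mean:
$$\frac{\partial}{\partial \mu_2} \sum_{i=1}^{n} \left( -\frac{1}{2}(y_i - \mu_2)^2 \right) = \sum_{i=1}^{n} (y_i - \mu_2) = 0 \quad \Longrightarrow \quad \mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i.$$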
This condition on $\varphi^0$ is very natural: it means that we need to start from a point that is at least better than the extreme cases in which the mixture has only one component. It can easily be enforced by choosing a random vector $\varphi^0$ and calculating the corresponding log-likelihood value. If $J_N(\varphi^0)$ does not satisfy the previous condition, we draw another random vector, and repeat until the condition holds.
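As a minimal sketch of this rejection step (in Python; the function names log_lik and draw_valid_phi0, the uniform/normal proposal distributions, and the default value of $\eta$ are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def log_lik(lam, mu1, mu2, y):
    """Log-likelihood J_N of the two-component, unit-variance Gaussian
    mixture with weights (lam, 1 - lam) and means (mu1, mu2)."""
    c1 = lam * np.exp(-0.5 * (y - mu1) ** 2)
    c2 = (1 - lam) * np.exp(-0.5 * (y - mu2) ** 2)
    return np.sum(np.log((c1 + c2) / np.sqrt(2.0 * np.pi)))

def draw_valid_phi0(y, eta=0.05, rng=None):
    """Redraw random initializations phi^0 = (lam, mu1, mu2) until
    condition (20) holds, as described in the text."""
    if rng is None:
        rng = np.random.default_rng()
    y_bar = y.mean()
    # With lam = 0 (resp. lam = 1) the first (resp. second) component
    # vanishes, so both terms of the max in condition (20) reduce to the
    # same single-Gaussian log-likelihood evaluated at the sample mean.
    bound = np.sum(np.log(np.exp(-0.5 * (y - y_bar) ** 2) / np.sqrt(2.0 * np.pi)))
    while True:  # rejection loop
        lam = rng.uniform(eta, 1.0 - eta)
        mu1, mu2 = rng.normal(y_bar, y.std(), size=2)
        if log_lik(lam, mu1, mu2, y) > bound:
            return lam, mu1, mu2
```

On data with little evidence for two components, the rejection loop may discard many draws, since a random two-component fit rarely beats the single Gaussian there; a practical implementation would cap the number of attempts. The loop is kept minimal to mirror the procedure described in the text.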
Conclusion 3. Using Propositions 4 and 1, under condition (20) the sequence $(J_N(\varphi^k))_k$ converges, and there exists a subsequence $(\varphi^{N(k)})$ which converges to a stationary point of the likelihood function. Moreover, every limit point of the sequence $(\varphi^k)_k$ is a stationary point of the likelihood.
Assumption A3 is not fulfilled (this part applies to all aforementioned situations). As mentioned in the paper of Tseng [2], for the two-Gaussian mixture example, by changing $\mu_1$ and $\mu_2$ by the same