point of the estimated divergence. Moreover, every limit point of the sequence $(\varphi^k)_k$ is a stationary point of the estimated divergence.
In the case of the likelihood, i.e., $\phi(t) = -\log(t) + t - 1$, the set $\Phi^0$ can be written as:
$$\Phi^0 = \left\{ \varphi \in \Phi :\ J_N(\varphi) \geq J_N(\varphi^0) \right\} = J_N^{-1}\left( \left[ J_N(\varphi^0), +\infty \right) \right),$$
where $J_N$ is the log-likelihood function of the Gaussian mixture model. The log-likelihood function $J_N$ is clearly of class $C^1(\mathrm{int}(\Phi))$. We prove that $\Phi^0$ is closed and bounded, which is sufficient to conclude its compactness, since the space $[\eta, 1-\eta] \times \mathbb{R}^2$ equipped with the Euclidean distance is complete.
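For reference, writing out $J_N$ for the two-component, unit-variance Gaussian mixture used in this example (a reconstruction from the component densities appearing below, with mixing weight $\lambda \in [\eta, 1-\eta]$ and means $\mu_1, \mu_2$; not quoted from the text):
$$J_N(\varphi) = J_N(\lambda, \mu_1, \mu_2) = \sum_{i=1}^{n} \log\left( \frac{\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_1)^2} + \frac{1-\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2} \right).$$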
Closedness. The set $\Phi^0$ is the inverse image of a closed set under a continuous function (the log-likelihood). Therefore, it is closed in $[\eta, 1-\eta] \times \mathbb{R}^2$.
Boundedness. By contradiction, suppose that $\Phi^0$ is unbounded; then there exists a sequence $(\varphi^l)_l$ which tends to infinity. Since $\lambda^l \in [\eta, 1-\eta]$, either $\mu_1^l$ or $\mu_2^l$ tends to infinity. Suppose that both $\mu_1^l$ and $\mu_2^l$ tend to infinity; then $J_N(\varphi^l) \to -\infty$, because the mixture density tends to $0$ at every observation $y_i$, so that each log term diverges to $-\infty$. Any finite initialization $\varphi^0$ implies that $J_N(\varphi^0) > -\infty$, so that $\forall \varphi \in \Phi^0,\ J_N(\varphi) \geq J_N(\varphi^0) > -\infty$. Thus, it is impossible for both $\mu_1^l$ and $\mu_2^l$ to go to infinity.
Suppose that $\mu_1^l \to \infty$ and that $\mu_2^l$ converges (or that $\mu_2^l$ is bounded, in which case we extract a convergent subsequence) to $\mu_2$. The limit of the likelihood has the form:
$$L(\lambda, \infty, \mu_2) = \prod_{i=1}^{n} \frac{1-\lambda}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2},$$
which is bounded by its value for $\lambda = 0$ and $\mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i$. Indeed, since $1 - \lambda \leq 1$, we have:
$$L(\lambda, \infty, \mu_2) \leq \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_i - \mu_2)^2}.$$
The right-hand side of this inequality is the likelihood of a Gaussian model $\mathcal{N}(\mu_2, 1)$, so it is maximized when $\mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i$. Thus, if $\varphi^0$ is chosen in such a way that $J_N(\varphi^0) > J_N\left(0, \infty, \frac{1}{n}\sum_{i=1}^{n} y_i\right)$, the case where $\mu_1$ tends to infinity and $\mu_2$ is bounded can never occur. For the other case, where $\mu_2 \to \infty$ and $\mu_1$ is bounded, we choose $\varphi^0$ in such a way that $J_N(\varphi^0) > J_N\left(1, \frac{1}{n}\sum_{i=1}^{n} y_i, \infty\right)$. In conclusion, with a choice of $\varphi^0$ such that:
$$J_N(\varphi^0) > \max\left[ J_N\!\left(0, \infty, \frac{1}{n}\sum_{i=1}^{n} y_i\right),\ J_N\!\left(1, \frac{1}{n}\sum_{i=1}^{n} y_i, \infty\right) \right], \qquad (20)$$
the set $\Phi^0$ is bounded.
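The maximization step invoked above is the standard one for a Gaussian location model; as a one-line check (standard calculus, not spelled out in the text), setting the $\mu_2$-derivative of the single-Gaussian log-likelihood to zero gives the sample mean:
$$\frac{\partial}{\partial \mu_2} \sum_{i=1}^{n} \left( -\frac{1}{2}(y_i - \mu_2)^2 \right) = \sum_{i=1}^{n} (y_i - \mu_2) = 0 \quad \Longrightarrow \quad \mu_2 = \frac{1}{n}\sum_{i=1}^{n} y_i.$$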
This condition on $\varphi^0$ is very natural: it means that we need to start from a point that is at least better than the extreme cases in which the mixture has only one component. It can easily be enforced by choosing a random vector $\varphi^0$ and calculating the corresponding log-likelihood value. If $J_N(\varphi^0)$ does not satisfy the previous condition, we draw another random vector, and repeat until the condition holds.
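As a minimal sketch of this rejection step (in Python; the function names log_lik and draw_valid_phi0, the uniform/normal proposal distributions, and the default value of $\eta$ are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def log_lik(lam, mu1, mu2, y):
    """Log-likelihood J_N of the two-component, unit-variance Gaussian
    mixture with weights (lam, 1 - lam) and means (mu1, mu2)."""
    c1 = lam * np.exp(-0.5 * (y - mu1) ** 2)
    c2 = (1 - lam) * np.exp(-0.5 * (y - mu2) ** 2)
    return np.sum(np.log((c1 + c2) / np.sqrt(2.0 * np.pi)))

def draw_valid_phi0(y, eta=0.05, rng=None):
    """Redraw random initializations phi^0 = (lam, mu1, mu2) until
    condition (20) holds, as described in the text."""
    if rng is None:
        rng = np.random.default_rng()
    y_bar = y.mean()
    # With lam = 0 (resp. lam = 1) the first (resp. second) component
    # vanishes, so both terms of the max in condition (20) reduce to the
    # same single-Gaussian log-likelihood evaluated at the sample mean.
    bound = np.sum(np.log(np.exp(-0.5 * (y - y_bar) ** 2) / np.sqrt(2.0 * np.pi)))
    while True:  # rejection loop
        lam = rng.uniform(eta, 1.0 - eta)
        mu1, mu2 = rng.normal(y_bar, y.std(), size=2)
        if log_lik(lam, mu1, mu2, y) > bound:
            return lam, mu1, mu2
```

On data with little evidence for two components, the rejection loop may discard many draws, since a random two-component fit rarely beats the single Gaussian there; a practical implementation would cap the number of attempts. The loop is kept minimal to mirror the procedure described in the text.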
Conclusion 3. Using Propositions 4 and 1, under condition (20) the sequence $(J_N(\varphi^k))_k$ converges, and there exists a subsequence $(\varphi^{N(k)})$ which converges to a stationary point of the likelihood function. Moreover, every limit point of the sequence $(\varphi^k)_k$ is a stationary point of the likelihood.
Assumption A3 is not fulfilled (this part applies to all aforementioned situations). As mentioned in the paper of Tseng [2], for the two-Gaussian mixture example, by changing $\mu_1$ and $\mu_2$ by the same