Page 261 of Differential Geometrical Theory of Statistics
Entropy 2016, 18, 277
Suppose now that $(\phi^k)_k$ does not converge. Since $\Phi^0$ is compact and $\phi^k \in \Phi^0$ for all $k$ (proved in Proposition 1), there exists a subsequence $(\phi^{N_0(k)})_k$ such that $\phi^{N_0(k)} \to \tilde{\phi}$. Let us take the subsequence $(\phi^{N_0(k)-1})_k$. This subsequence does not necessarily converge, but it is still contained in the compact set $\Phi^0$, so we can extract a further subsequence $(\phi^{N_1 \circ N_0(k)-1})_k$ which converges to, say, $\bar{\phi}$. Now, the subsequence $(\phi^{N_1 \circ N_0(k)})_k$ converges to $\tilde{\phi}$, because it is a subsequence of $(\phi^{N_0(k)})_k$. So far, we have proved the existence of two convergent subsequences $\phi^{N(k)-1}$ and $\phi^{N(k)}$ with a priori different limits. For simplicity and without any loss of generality, we will consider these subsequences to be $\phi^k$ and $\phi^{k+1}$, respectively.
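Conserving previous notations, suppose that $\phi^{k+1} \to \tilde{\phi}$ and $\phi^k \to \bar{\phi}$. We use again inequality (13):
$$\hat{D}(p_{\phi^{k+1}}, p_{\phi_T}) + D_\psi(\phi^{k+1}, \phi^k) \le \hat{D}(p_{\phi^k}, p_{\phi_T}).$$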
Taking the limit of both sides of the inequality as $k$ tends to infinity, and using the continuity of the two functions, we have
$$\hat{D}(p_{\tilde{\phi}}, p_{\phi_T}) + D_\psi(\tilde{\phi}, \bar{\phi}) \le \hat{D}(p_{\bar{\phi}}, p_{\phi_T}).$$
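To see how a descent inequality of the form (13) behaves along the iterations, here is a minimal numerical sketch of a proximal-point iteration on a toy problem. Everything in it is an assumption made for illustration: `objective` is a hypothetical stand-in for $\hat{D}(p_\phi, p_{\phi_T})$, `proximal_term` is a hypothetical stand-in for $D_\psi$, and neither is the divergence considered in the text.

```python
def objective(phi):
    # Hypothetical stand-in for D-hat(p_phi, p_phi_T); minimized at phi = 2.
    return (phi - 2.0) ** 2


def proximal_term(phi, phi_prev):
    # Hypothetical stand-in for D_psi(phi, phi_prev):
    # nonnegative, and zero if and only if phi == phi_prev.
    return (phi - phi_prev) ** 2


def proximal_step(phi_prev):
    # Minimizer of objective(phi) + proximal_term(phi, phi_prev).
    # Setting the derivative 2*(phi - 2) + 2*(phi - phi_prev) to zero
    # gives phi = (2 + phi_prev) / 2.
    return (2.0 + phi_prev) / 2.0


def run(phi0, n_iter=30):
    # Build the sequence phi^0, phi^1, ..., phi^{n_iter}.
    iterates = [phi0]
    for _ in range(n_iter):
        iterates.append(proximal_step(iterates[-1]))
    return iterates


def check_inequality_13(iterates, tol=1e-12):
    # Analogue of inequality (13) for every pair of consecutive iterates:
    # objective(phi^{k+1}) + proximal_term(phi^{k+1}, phi^k) <= objective(phi^k).
    return all(
        objective(nxt) + proximal_term(nxt, prev) <= objective(prev) + tol
        for prev, nxt in zip(iterates, iterates[1:])
    )
```

Calling `run(10.0)` and passing the result to `check_inequality_13` checks the descent property at every step. In this toy case the whole sequence converges, so the sketch illustrates only the inequality and the vanishing of the proximal term between consecutive iterates, not the subsequence argument of the proof.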
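Recall that under A1-2, the sequence $\left(\hat{D}_\varphi(p_{\phi^k}, p_{\phi_T})\right)_k$ converges, so that it has the same limit for any subsequence, i.e., $\hat{D}(p_{\tilde{\phi}}, p_{\phi_T}) = \hat{D}(p_{\bar{\phi}}, p_{\phi_T})$. We also use the fact that the distance-like function $D_\psi$ is nonnegative to deduce that $D_\psi(\tilde{\phi}, \bar{\phi}) = 0$. Looking closely at the definition of this divergence (10), we get that if the sum is zero, then each term is also zero since all terms are nonnegative. This means that:
$$\forall i \in \{1, \dots, n\}, \quad \int_X \psi\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right) h_i(x|\bar{\phi})\, dx = 0.$$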
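The integrands are nonnegative functions, so they vanish almost everywhere with respect to the measure $dx$ defined on the space of labels:
$$\forall i \in \{1, \dots, n\}, \quad \psi\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right) h_i(x|\bar{\phi}) = 0 \quad dx\text{-a.e.}$$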
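The conditional densities $h_i$ are supposed to be positive (which can be ensured by a suitable choice of the initial point $\phi^0$), i.e., $h_i(x|\bar{\phi}) > 0$, $dx$-a.e. Hence, $\psi\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right) = 0$, $dx$-a.e. On the other hand, $\psi$ is chosen in a way that $\psi(z) = 0$ iff $z = 1$. Therefore:
$$\forall i \in \{1, \dots, n\}, \quad h_i(x|\tilde{\phi}) = h_i(x|\bar{\phi}) \quad dx\text{-a.e.} \tag{14}$$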
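For concreteness, here is one function that satisfies the two properties of $\psi$ used in this proof, namely $\psi(z) = 0$ iff $z = 1$, and $\psi'(1) = 0$. It is an illustrative choice and an assumption here; this page does not state which $\psi$ is actually used.

```latex
% Illustrative choice only (assumption): the Kullback-Leibler-type generator.
\[
\psi(z) = z \log z - z + 1, \qquad z > 0 .
\]
% Derivatives:
\[
\psi'(z) = \log z, \qquad \psi''(z) = \frac{1}{z} > 0 .
\]
% Values at z = 1:
\[
\psi(1) = 0, \qquad \psi'(1) = 0 .
\]
% Strict convexity makes z = 1 the unique minimizer, hence
\[
\psi(z) > 0 \quad \text{for all } z > 0,\ z \neq 1 .
\]
```

With such a $\psi$, the identity $\psi(z) = 0$ forces $z = 1$, which is the step used to pass from the vanishing integrands to (14).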
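Since $\phi^{k+1}$ is, by definition, a point at which the infimum of $\phi \mapsto \hat{D}(p_\phi, p_{\phi_T}) + D_\psi(\phi, \phi^k)$ is attained, the gradient of this function is zero at $\phi^{k+1}$. It follows that:
$$\nabla \hat{D}(p_{\phi^{k+1}}, p_{\phi_T}) + \nabla D_\psi(\phi^{k+1}, \phi^k) = 0, \quad \forall k.$$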
Taking the limit on $k$, and using the continuity of the derivatives, we get that:
$$\nabla \hat{D}(p_{\tilde{\phi}}, p_{\phi_T}) + \nabla D_\psi(\tilde{\phi}, \bar{\phi}) = 0. \tag{15}$$
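Let us write explicitly the gradient of the second divergence:
$$\nabla D_\psi(\tilde{\phi}, \bar{\phi}) = \sum_{i=1}^{n} \int_X \frac{\nabla h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\, \psi'\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right) h_i(x|\bar{\phi})\, dx.$$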
We use now the identities (14), and the fact that $\psi'(1) = 0$, to deduce that:
$$\nabla D_\psi(\tilde{\phi}, \bar{\phi}) = 0.$$
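The substitution behind this last step can be spelled out as follows. This is only a sketch that restates the argument above, using the explicit gradient of $D_\psi$, the identities (14), and $\psi'(1) = 0$; no additional assumption is introduced.

```latex
% The factor h_i(x|\bar\phi) cancels inside the integrand:
\[
\nabla D_\psi(\tilde{\phi}, \bar{\phi})
  = \sum_{i=1}^{n} \int_X \nabla h_i(x|\tilde{\phi})\,
    \psi'\!\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right) dx .
\]
% By (14), the ratio equals 1 for dx-almost every x, so
\[
\psi'\!\left(\frac{h_i(x|\tilde{\phi})}{h_i(x|\bar{\phi})}\right)
  = \psi'(1) = 0 \quad dx\text{-a.e.}
\]
% Each integrand vanishes almost everywhere, hence
\[
\nabla D_\psi(\tilde{\phi}, \bar{\phi})
  = \sum_{i=1}^{n} \int_X \nabla h_i(x|\tilde{\phi}) \cdot 0 \, dx = 0 .
\]
```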
Differential Geometrical Theory of Statistics
- Title: Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Publisher: MDPI
- Place: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Dimensions: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Natural Sciences, Physics