Page - 366 - in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 98
parameters Ȳ ∈ P_m and σ > 0, and its density with respect to the Riemannian volume form dv(Y) of
P_m (see Formula (13) in Section 2) is:
\[
\frac{1}{Z_m(\sigma)} \exp\left[ -\frac{d^2(Y, \bar{Y})}{2\sigma^2} \right] \tag{2}
\]
where Z_m(σ) is a normalizing factor depending only on σ (and not on Ȳ).
For the Gaussian distribution of Equation (2), the maximum likelihood estimate (MLE) for the
parameter Ȳ based on observations Y_1, …, Y_n corresponds to the mean of Equation (1). In [15], a detailed
study of statistical inference for this distribution was given and then applied to the classification of
data in P_m, showing that it yields better performance in comparison to recent approaches [2].
When a dataset contains extreme values (or outliers), because of the impact of these values on d²,
the mean becomes less useful. It is usually replaced with the Riemannian median:
\[
\mathrm{Median}(Y_1, \dots, Y_n) = \operatorname{argmin}_{Y \in \mathcal{P}_m} \sum_{i=1}^{n} d(Y, Y_i) \tag{3}
\]
The definition in Equation (3) corresponds to that of the median in statistics based on ordering of the
values of a sequence. However, this interpretation does not continue to hold on P_m. In fact, the
Riemannian distance on P_m is not associated with any norm, and it is therefore only possible to
compare distances of a set of matrices to a reference matrix.
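The effect of an outlier on the mean and on the median can be illustrated with a minimal sketch. It assumes the scalar case m = 1, where P_1 is the set of positive reals and Rao's distance reduces to |log x − log y|. It also assumes that the mean of Equation (1), which is not reproduced on this page, is the minimizer of the sum of squared distances. All function names are hypothetical and chosen for illustration only.

```python
import math
import statistics


def rao_distance_1d(x, y):
    """Affine-invariant distance between positive scalars (assumed case m = 1)."""
    return abs(math.log(x) - math.log(y))


def sum_of_distances(candidate, samples):
    """Cost minimized by the median in Equation (3)."""
    return sum(rao_distance_1d(candidate, y) for y in samples)


def riemannian_mean_1d(samples):
    """Minimizer of the sum of squared distances: the geometric mean."""
    logs = [math.log(y) for y in samples]
    return math.exp(statistics.fmean(logs))


def riemannian_median_1d(samples):
    """Minimizer of the sum of distances: exp of the median of the logs."""
    logs = [math.log(y) for y in samples]
    return math.exp(statistics.median(logs))


# Four moderate values and one extreme value (the outlier).
samples = [1.0, 2.0, 4.0, 8.0, 1.0e6]

mean_value = riemannian_mean_1d(samples)      # pulled towards the outlier (roughly 36)
median_value = riemannian_median_1d(samples)  # stays near the bulk (close to 4)
print(mean_value, median_value)
```

In this scalar setting the median stays at the central sample while the mean is dragged upwards by the single extreme value. For m > 1 no such closed form is available in general and the minimization in Equation (3) has to be carried out numerically.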
In the presence of outliers, the Gaussian distribution on P_m also loses its robustness properties.
The main contribution of the present paper is to remedy this problem by introducing the Riemannian
Laplace distribution while maintaining the same one-to-one relation between MLE and the Riemannian
median. This will be shown to offer considerable improvement in dealing with outliers.
This paper is organized as follows.
Section 2 reviews the Riemannian geometry of P_m, when this manifold is equipped with the
Riemannian metric known as the Rao–Fisher or affine invariant metric [10,11]. In particular, it gives
analytic expressions for geodesic curves and Riemannian distance, and recalls the invariance of Rao's
distance under affine transformations.
Section 3 introduces the Laplace distribution L(Ȳ, σ) through its probability density function with
respect to the volume form dv(Y):
\[
p(Y \mid \bar{Y}, \sigma) = \frac{1}{\zeta_m(\sigma)} \exp\left[ -\frac{d(Y, \bar{Y})}{\sigma} \right]
\]
Here, σ lies in an interval ]0, σ_max[ with σ_max < ∞. This is because the normalizing constant ζ_m(σ)
becomes infinite for σ ≥ σ_max. It will be shown that ζ_m(σ) depends only on σ (and not on Ȳ) for all
σ < σ_max. This important fact leads to simple expressions of the MLEs of Ȳ and σ. In particular, the MLE
of Ȳ based on a family of observations Y_1, …, Y_N sampled from L(Ȳ, σ) is given by the median of
Y_1, …, Y_N defined by Equation (3), where d is Rao's distance.
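The step from the density to the median can be written out in a few lines. This is a sketch that assumes the observations are independent. The symbol \ell for the log-likelihood is introduced here for illustration and does not appear in the text above.

```latex
\[
\ell(\bar{Y}, \sigma)
  = \sum_{i=1}^{N} \log p(Y_i \mid \bar{Y}, \sigma)
  = -N \log \zeta_m(\sigma) - \frac{1}{\sigma} \sum_{i=1}^{N} d(Y_i, \bar{Y})
\]
% Since \zeta_m(\sigma) does not depend on \bar{Y}, for any fixed \sigma:
\[
\operatorname{argmax}_{\bar{Y} \in \mathcal{P}_m} \ell(\bar{Y}, \sigma)
  = \operatorname{argmin}_{\bar{Y} \in \mathcal{P}_m} \sum_{i=1}^{N} d(\bar{Y}, Y_i)
\]
```

The right-hand side is the minimization that defines the median in Equation (3), and it does not involve σ.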
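Section 4 focuses on mixtures of Riemannian Laplace distributions on P_m. A distribution of this
kind has a density:
\[
p\bigl(Y \mid (\omega_\mu, \bar{Y}_\mu, \sigma_\mu)_{1 \le \mu \le M}\bigr) = \sum_{\mu=1}^{M} \omega_\mu \, p(Y \mid \bar{Y}_\mu, \sigma_\mu) \tag{4}
\]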
with respect to the volume form dv(Y). Here, M is the number of mixture components, ω_μ > 0, Ȳ_μ ∈
P_m, σ_μ > 0 for all 1 ≤ μ ≤ M and ∑_{μ=1}^{M} ω_μ = 1. A new EM (expectation-maximization) algorithm that
computes maximum likelihood estimates of the mixture parameters (ω_μ, Ȳ_μ, σ_μ)_{1≤μ≤M} is provided.
The problem of the order selection of the number M in Equation (4) is also discussed and performed
using the Bayesian information criterion (BIC) [16].
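How a mixture density of the form of Equation (4) is evaluated and scored for order selection can be sketched in the scalar case. The sketch assumes m = 1, where the volume form is dy / y, Rao's distance is |log y − log c|, and direct integration gives the normalizing constant 2σ. It also assumes the common convention BIC = k ln N − 2 ln L (lower is better), which may differ from the convention used in [16]. The function names are hypothetical, and the EM algorithm itself is not reproduced here.

```python
import math


def laplace_density_1d(y, center, sigma):
    """Riemannian Laplace density on positive scalars w.r.t. dv(y) = dy / y.

    Assumption: m = 1, so the normalizing constant equals 2 * sigma.
    """
    distance = abs(math.log(y) - math.log(center))
    return math.exp(-distance / sigma) / (2.0 * sigma)


def mixture_density_1d(y, weights, centers, sigmas):
    """Weighted sum of Laplace components, following Equation (4)."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixture weights must sum to 1")
    return sum(
        w * laplace_density_1d(y, c, s)
        for w, c, s in zip(weights, centers, sigmas)
    )


def bic_1d(samples, weights, centers, sigmas):
    """BIC under the assumed convention k * ln(N) - 2 * ln(L)."""
    log_likelihood = sum(
        math.log(mixture_density_1d(y, weights, centers, sigmas))
        for y in samples
    )
    num_components = len(weights)
    # Free parameters when m = 1: (M - 1) weights, M centers, M scales.
    num_params = 3 * num_components - 1
    return num_params * math.log(len(samples)) - 2.0 * log_likelihood
```

To select M with this sketch, one would fit the mixture for several candidate values of M, compute `bic_1d` for each fit, and keep the value of M with the lowest score.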
Differential Geometrical Theory of Statistics
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Editor
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Size
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics