Page - 366 - in Differential Geometrical Theory of Statistics

Entropy 2016, 18, 98

parameters $\bar{Y} \in \mathcal{P}_m$ and $\sigma > 0$, and its density with respect to the Riemannian volume form $dv(Y)$ of $\mathcal{P}_m$ (see Formula (13) in Section 2) is:

$$\frac{1}{Z_m(\sigma)} \exp\left[-\frac{d^2(Y,\bar{Y})}{2\sigma^2}\right] \qquad (2)$$

where $Z_m(\sigma)$ is a normalizing factor depending only on $\sigma$ (and not on $\bar{Y}$). For the Gaussian distribution Equation (2), the maximum likelihood estimate (MLE) for the parameter $\bar{Y}$ based on observations $Y_1, \cdots, Y_n$ corresponds to the mean Equation (1). In [15], a detailed study of statistical inference for this distribution was given and then applied to the classification of data in $\mathcal{P}_m$, showing that it yields better performance, in comparison to recent approaches [2].

When a dataset contains extreme values (or outliers), because of the impact of these values on $d^2$, the mean becomes less useful. It is usually replaced with the Riemannian median:

$$\mathrm{Median}(Y_1, \cdots, Y_n) = \operatorname{argmin}_{Y \in \mathcal{P}_m} \sum_{i=1}^{n} d(Y, Y_i) \qquad (3)$$

Definition Equation (3) corresponds to that of the median in statistics based on ordering of the values of a sequence. However, this interpretation does not continue to hold on $\mathcal{P}_m$. In fact, the Riemannian distance on $\mathcal{P}_m$ is not associated with any norm, and it is therefore only possible to compare distances of a set of matrices to a reference matrix.

In the presence of outliers, the Gaussian distribution on $\mathcal{P}_m$ also loses its robustness properties. The main contribution of the present paper is to remedy this problem by introducing the Riemannian Laplace distribution while maintaining the same one-to-one relation between MLE and the Riemannian median. This will be shown to offer considerable improvement in dealing with outliers.

This paper is organized as follows. Section 2 reviews the Riemannian geometry of $\mathcal{P}_m$, when this manifold is equipped with the Riemannian metric known as the Rao–Fisher or affine invariant metric [10,11]. In particular, it gives analytic expressions for geodesic curves, Riemannian distance and recalls the invariance of Rao's distance under affine transformations.
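The distance and the median of Equation (3) can be made concrete with a small numerical sketch. It assumes the standard closed form of the affine-invariant (Rao) distance, $d(Y,Z) = \lVert \log(Y^{-1/2} Z Y^{-1/2}) \rVert_F$, which this passage refers to but does not write out. The median is approximated with a Weiszfeld-type fixed-point iteration, chosen here for illustration and not necessarily the procedure used in the paper. The function names (`rao_distance`, `riemannian_median`) and the tolerances are hypothetical.

```python
import numpy as np


def _sym(A):
    """Symmetrize to remove round-off asymmetry."""
    return 0.5 * (A + A.T)


def _spd_fun(Y, f):
    """Apply a scalar function f to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(Y)
    return (V * f(w)) @ V.T


def rao_distance(Y, Z):
    """Assumed closed form: Frobenius norm of log(Y^{-1/2} Z Y^{-1/2})."""
    Yih = _spd_fun(Y, lambda w: 1.0 / np.sqrt(w))
    lam = np.linalg.eigvalsh(_sym(Yih @ Z @ Yih))
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


def riemannian_median(Ys, iters=200, tol=1e-10, eps=1e-12):
    """Weiszfeld-type iteration for argmin_Y sum_i d(Y, Y_i) (illustrative)."""
    # Start from the log-Euclidean mean (an arbitrary but convenient choice).
    X = _spd_fun(sum(_spd_fun(Y, np.log) for Y in Ys) / len(Ys), np.exp)
    for _ in range(iters):
        Xh = _spd_fun(X, np.sqrt)
        Xih = _spd_fun(X, lambda w: 1.0 / np.sqrt(w))
        num = np.zeros_like(X)
        den = 0.0
        for Y in Ys:
            # Log map at X in whitened coordinates; its norm equals d(X, Y).
            L = _spd_fun(_sym(Xih @ Y @ Xih), np.log)
            w = 1.0 / max(np.linalg.norm(L), eps)
            num += w * L
            den += w
        step = num / den
        # Move along the geodesic from X in the direction of the weighted mean.
        X = _sym(Xh @ _spd_fun(step, np.exp) @ Xh)
        if np.linalg.norm(step) < tol:
            break
    return X
```

For $1 \times 1$ matrices the distance reduces to $|\log y - \log z|$, so the minimizer of Equation (3) for the samples 1, 2 and 100 is exactly 2, unaffected by the outlier 100. This makes a convenient sanity check for `riemannian_median`.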
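Section 3 introduces the Laplace distribution $L(\bar{Y},\sigma)$ through its probability density function with respect to the volume form $dv(Y)$:

$$p(Y \mid \bar{Y},\sigma) = \frac{1}{\zeta_m(\sigma)} \exp\left[-\frac{d(Y,\bar{Y})}{\sigma}\right]$$

Here, $\sigma$ lies in an interval $]0,\sigma_{\max}[$ with $\sigma_{\max} < \infty$. This is because the normalizing constant $\zeta_m(\sigma)$ becomes infinite for $\sigma \geq \sigma_{\max}$. It will be shown that $\zeta_m(\sigma)$ depends only on $\sigma$ (and not on $\bar{Y}$) for all $\sigma < \sigma_{\max}$. This important fact leads to simple expressions of MLEs of $\bar{Y}$ and $\sigma$. In particular, the MLE of $\bar{Y}$ based on a family of observations $Y_1, \cdots, Y_N$ sampled from $L(\bar{Y},\sigma)$ is given by the median of $Y_1, \cdots, Y_N$ defined by Equation (3), where $d$ is Rao's distance.

Section 4 focuses on mixtures of Riemannian Laplace distributions on $\mathcal{P}_m$. A distribution of this kind has a density:

$$p\big(Y \mid (\omega_\mu,\bar{Y}_\mu,\sigma_\mu)_{1\leq\mu\leq M}\big) = \sum_{\mu=1}^{M} \omega_\mu \, p(Y \mid \bar{Y}_\mu,\sigma_\mu) \qquad (4)$$

with respect to the volume form $dv(Y)$. Here, $M$ is the number of mixture components, $\omega_\mu > 0$, $\bar{Y}_\mu \in \mathcal{P}_m$, $\sigma_\mu > 0$ for all $1 \leq \mu \leq M$ and $\sum_{\mu=1}^{M} \omega_\mu = 1$. A new EM (expectation-maximization) algorithm that computes maximum likelihood estimates of the mixture parameters $(\omega_\mu,\bar{Y}_\mu,\sigma_\mu)_{1\leq\mu\leq M}$ is provided. The problem of the order selection of the number $M$ in Equation (4) is also discussed and performed using the Bayesian information criterion (BIC) [16].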
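The link between the Laplace density of Section 3 and the median can be seen directly from the log-likelihood. The following is a short sketch that uses only the density stated above. The symbol $\ell$ for the log-likelihood and the hats on the estimates are notation introduced here, and the stationarity condition for $\sigma$ assumes that $\zeta_m$ is differentiable on $]0,\sigma_{\max}[$.

```latex
\begin{align*}
\ell(\bar{Y},\sigma)
  &= \sum_{i=1}^{N} \log p(Y_i \mid \bar{Y},\sigma)
   = -N \log \zeta_m(\sigma) - \frac{1}{\sigma} \sum_{i=1}^{N} d(Y_i,\bar{Y}), \\
\hat{Y}
  &= \operatorname{argmax}_{\bar{Y} \in \mathcal{P}_m} \ell(\bar{Y},\sigma)
   = \operatorname{argmin}_{\bar{Y} \in \mathcal{P}_m} \sum_{i=1}^{N} d(\bar{Y},Y_i), \\
0 &= \frac{\partial \ell}{\partial \sigma}(\hat{Y},\hat{\sigma})
  \;\Longleftrightarrow\;
  \hat{\sigma}^{2} \, \frac{d}{d\sigma} \log \zeta_m(\sigma) \Big|_{\sigma=\hat{\sigma}}
   = \frac{1}{N} \sum_{i=1}^{N} d(\hat{Y},Y_i).
\end{align*}
```

Because $\zeta_m(\sigma)$ does not depend on $\bar{Y}$, the first term drops out of the maximization over $\bar{Y}$ for any fixed $\sigma$, and what remains is exactly the minimization in Equation (3). Solving the last condition for $\hat{\sigma}$ requires the explicit form of $\zeta_m$, which is not given in this passage.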

Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco; Frank Nielsen
Editor: MDPI
Location: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Size: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics