Entropy 2016, 18, 277
ψ(t) = (1/2)(√t − 1)². The kernel-based MDϕDE is calculated using the Gaussian kernel, and the window is calculated using Silverman's rule. We included in the comparison the minimum density power divergence (MDPD) of [14]. The estimator is defined by:
\[
\hat{\phi}_n = \arg\inf_{\phi\in\Phi} \int p_{\phi}^{1+a}(z)\,dz - \frac{a+1}{a}\,\frac{1}{n}\sum_{i=1}^{n} p_{\phi}^{a}(y_i)
             = \arg\inf_{\phi\in\Phi} \mathbb{E}_{P_{\phi}}\!\left[p_{\phi}^{a}\right] - \frac{a+1}{a}\,\mathbb{E}_{P_n}\!\left[p_{\phi}^{a}\right], \qquad (22)
\]
where a ∈ (0, 1]. This is a Bregman divergence and is known to have good efficiency and robustness for a good choice of the tradeoff parameter. According to the simulation results in [11], the value a = 0.5 seems to give a good tradeoff between robustness against outliers and good performance under the model. Notice that the MDPD coincides with the MLE when a tends to zero. Thus, the methodology presented in this article is applicable to this estimator, and the proximal point algorithm can be used to calculate the MDPD. The proximal term is kept the same, i.e., ψ(t) = (1/2)(√t − 1)².
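To make the objective in (22) concrete, the following sketch evaluates and minimizes the empirical MDPD criterion for a one-dimensional Gaussian location model with unit variance. This is our own Python illustration, not the paper's R code; the function names and the use of Nelder–Mead via scipy are our choices (the paper also optimizes with Nelder–Mead).

```python
# Hypothetical sketch of the MDPD objective (22) for a Gaussian location
# model N(mu, 1); names and structure are ours, not the paper's.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize
from scipy.stats import norm

def mdpd_objective(mu, sample, a=0.5):
    # First term of (22): integral of p_phi^{1+a} over the real line.
    integral, _ = quad(lambda z: norm.pdf(z, loc=mu) ** (1 + a),
                      -np.inf, np.inf)
    # Second term: (a+1)/a times the empirical mean of p_phi^a.
    empirical = np.mean(norm.pdf(sample, loc=mu) ** a)
    return integral - (a + 1) / a * empirical

rng = np.random.default_rng(0)
sample = rng.normal(loc=1.5, size=200)
res = minimize(lambda m: mdpd_objective(m[0], sample), x0=[0.0],
               method="Nelder-Mead")
print(res.x)  # estimate of mu, close to the true value 1.5
```

For the Gaussian model the first term actually has a closed form independent of μ, so the criterion reduces to maximizing the empirical mean of p^a; the sketch keeps the generic integral to match the form of (22).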
Remark 3 (Note on the robustness of the used estimators). In Section 3, we have proved under mild conditions that the proximal point algorithm (11) ensures the decrease of the estimated divergence. This means that when we use the dual Formulas (2) and (3), the proximal point algorithm (11) returns at convergence the estimators defined by (4) and (5), respectively. Similarly, if we use the density power divergence of Basu et al. [14], then the proximal point algorithm returns at convergence the MDPD defined by (22). The robustness properties of the dual estimators (4) and (5) are studied in [12] and [11], respectively, using the influence function (IF) approach. On the other hand, the robustness properties of the MDPD are studied using the IF approach in [14]. The MDϕDE (4) generally has an unbounded IF (see [12], Section 3.1), whereas the kernel-based MDϕDE's IF may be bounded, for example in a Gaussian model and for any ϕ-divergence with ϕ = ϕγ, γ ∈ (0, 1); see [11], Example 2. On the other hand, the MDPD generally has a bounded IF if the tradeoff parameter a is positive, in particular in the Gaussian model. The MDPD becomes more robust as the tradeoff parameter a increases (see Section 3.3 in [14]). Therefore, we should expect the proximal point algorithm to produce robust estimators in the case of the kernel-based MDϕDE and the MDPD, and thus obtain better results than the MLE calculated using the EM algorithm.
Simulations from two mixture models are given below: a Gaussian mixture and a Weibull mixture. The MLE for both mixtures was calculated using the EM algorithm.
Optimizations were carried out using the Nelder–Mead algorithm [22] under the statistical tool R [23]. Numerical integrations in the Gaussian mixture were calculated using the distrExIntegrate function of package distrEx. It is a slight modification of the standard function integrate: it performs a Gauss–Legendre quadrature when function integrate returns an error. In the Weibull mixture, we used the integral function from package pracma. Function integral includes a variety of adaptive numerical integration methods such as Kronrod–Gauss quadrature, Romberg's method, Gauss–Richardson quadrature, Clenshaw–Curtis (not adaptive) and (adaptive) Simpson's method. Although function integral is slow, it performs better than other functions even if the integrand has a relatively bad behavior.
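The fallback logic of distrExIntegrate (adaptive quadrature first, fixed-order Gauss–Legendre on failure) can be mimicked in a few lines. The following is our Python analogue of that idea, not the R implementation itself; the function name and the quadrature order are ours.

```python
# Sketch of the fallback idea behind distrExIntegrate: try adaptive
# quadrature first, fall back to fixed-order Gauss-Legendre on failure.
import numpy as np
from scipy.integrate import quad

def integrate_with_fallback(f, a, b, order=100):
    try:
        value, _ = quad(f, a, b)
        return value
    except Exception:
        # Gauss-Legendre nodes/weights on [-1, 1], rescaled to [a, b].
        nodes, weights = np.polynomial.legendre.leggauss(order)
        x = 0.5 * (b - a) * nodes + 0.5 * (b + a)
        return 0.5 * (b - a) * np.sum(weights * f(x))

value = integrate_with_fallback(np.sin, 0.0, np.pi)
print(value)  # ~2.0
```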
5.1. The Two-Component Gaussian Mixture Revisited
We consider the Gaussian mixture (17) presented earlier with true parameters λ = 0.35, μ1 = −2, μ2 = 1.5 and known variances equal to 1. Contamination was done by adding, in the original sample, to the five lowest values random observations from the uniform distribution U[−5, −2]. We also added to the five largest values random observations from the uniform distribution U[2, 5].
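One possible reading of this contamination scheme is sketched below: uniform outlier noise is added to the five lowest and five highest observations. This is our interpretation written in Python; the helper name and the exact mechanics (adding to the sorted extremes) are assumptions, not the paper's code.

```python
# Sketch of the contamination scheme described above (our reading):
# perturb the five lowest and five largest observations with uniform noise.
import numpy as np

def contaminate(sample, rng):
    y = np.sort(np.asarray(sample).copy())
    # Push the five lowest values further left with U[-5, -2] noise...
    y[:5] += rng.uniform(-5.0, -2.0, size=5)
    # ...and the five largest values further right with U[2, 5] noise.
    y[-5:] += rng.uniform(2.0, 5.0, size=5)
    return y

rng = np.random.default_rng(0)
sample = rng.normal(size=100)
contaminated = contaminate(sample, rng)
```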
Results are summarized in Table 1. The EM algorithm was initialized according to condition (20). This condition gave good results when we are under the model, whereas it did not always result in
Differential Geometrical Theory of Statistics
- Title: Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Publisher: MDPI
- Place: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Dimensions: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Natural Sciences, Physics