Differential Geometrical Theory of Statistics
Page - 254 -
Text of the Page - 254 -

Entropy 2016, 18, 277

…log-likelihood function by an estimator of a ϕ-divergence between the true distribution of the data and the model. A ϕ-divergence in the sense of Csiszár [6] is defined in the same way as [7] by:

$$D_\varphi(Q, P) = \int \varphi\!\left(\frac{dQ}{dP}(y)\right) dP(y),$$

where ϕ is a nonnegative strictly convex function. Examples of such divergences are the Kullback–Leibler (KL) divergence, the modified KL divergence and the Hellinger distance, among others. All these well-known divergences belong to the class of Cressie–Read functions [8], defined by

$$\varphi_\gamma(x) = \frac{x^\gamma - \gamma x + \gamma - 1}{\gamma(\gamma - 1)} \quad \text{for } \gamma \in \mathbb{R} \setminus \{0, 1\}, \tag{1}$$

for γ = 1, 0, 1/2, respectively. For γ ∈ {0, 1}, the limit is calculated, and we denote ϕ₀(x) = −log x + x − 1 for the case of the modified KL and ϕ₁(x) = x log x − x + 1 for the KL.

Since the ϕ-divergence calculus uses the unknown true distribution, we need to estimate it. We consider the dual estimator of the divergence introduced independently by [9,10]. The use of this estimator is motivated by several reasons: its minimum coincides with the MLE for ϕ(t) = −log t + t − 1; in addition, it has the same form for discrete and continuous models, and requires neither partitioning nor smoothing.

Let $(P_\phi)_{\phi \in \Phi}$ be a parametric model with $\Phi \subset \mathbb{R}^d$, and denote by $\phi_T$ the true set of parameters. Let $dy$ be the Lebesgue measure on $\mathbb{R}$. Suppose that, for all $\phi \in \Phi$, the probability measure $P_\phi$ is absolutely continuous with respect to $dy$, and denote by $p_\phi$ the corresponding probability density. The dual estimator of the ϕ-divergence, given an n-sample $y_1, \ldots, y_n$, is given by:

$$\hat{D}_\varphi(p_\phi, p_{\phi_T}) = \sup_{\alpha \in \Phi} \int \varphi'\!\left(\frac{p_\phi}{p_\alpha}\right)\!(x)\, p_\phi(x)\, dx - \frac{1}{n} \sum_{i=1}^{n} \varphi^{\#}\!\left(\frac{p_\phi}{p_\alpha}\right)\!(y_i), \tag{2}$$

with $\varphi^{\#}(t) = t\varphi'(t) - \varphi(t)$. Al Mohamad [11] argues that this formula works well under the model; however, when we are not under the model, this quantity largely underestimates the divergence between the true distribution and the model, and proposes the following modification:

$$\tilde{D}_\varphi(p_\phi, p_{\phi_T}) = \int \varphi'\!\left(\frac{p_\phi}{K_{n,w}}\right)\!(x)\, p_\phi(x)\, dx - \frac{1}{n} \sum_{i=1}^{n} \varphi^{\#}\!\left(\frac{p_\phi}{K_{n,w}}\right)\!(y_i), \tag{3}$$

where $K_{n,w}$ is the Rosenblatt–Parzen kernel estimate with window parameter $w$.
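As a concrete illustration (not part of the original paper), the Cressie–Read generators of Equation (1), together with the functions $\varphi'$ and $\varphi^{\#}$ that enter the dual representation (2), can be sketched in Python. The names `phi`, `phi_prime` and `phi_sharp` are my own:

```python
import math

def phi(gamma, x):
    """Cressie-Read generator phi_gamma of Equation (1).

    For gamma in {0, 1} the limit is used: phi_0 is the modified-KL
    generator and phi_1 the KL generator, as stated in the text.
    """
    if gamma == 0:
        return -math.log(x) + x - 1.0      # modified KL
    if gamma == 1:
        return x * math.log(x) - x + 1.0   # KL
    return (x**gamma - gamma * x + gamma - 1.0) / (gamma * (gamma - 1.0))

def phi_prime(gamma, x):
    """Derivative phi'_gamma, the first ingredient of the dual form (2)."""
    if gamma == 0:
        return 1.0 - 1.0 / x
    if gamma == 1:
        return math.log(x)
    return (x**(gamma - 1.0) - 1.0) / (gamma - 1.0)

def phi_sharp(gamma, x):
    """phi#(t) = t * phi'(t) - phi(t), the second ingredient of (2)."""
    return x * phi_prime(gamma, x) - phi(gamma, x)
```

A quick sanity check of the limits: `phi(1e-6, 2.0)` is within about $10^{-6}$ of `phi(0, 2.0)`, consistent with taking the limit $\gamma \to 0$ in Equation (1); for the KL generator, `phi_sharp(1, t)` reduces to $t - 1$.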
Whether it is $\hat{D}_\varphi$ or $\tilde{D}_\varphi$, the minimum dual ϕ-divergence estimator (MDϕDE) is defined as the argument of the infimum of the dual approximation:

$$\hat{\phi}_n = \arg\inf_{\phi \in \Phi} \hat{D}_\varphi(p_\phi, p_{\phi_T}), \tag{4}$$

$$\tilde{\phi}_n = \arg\inf_{\phi \in \Phi} \tilde{D}_\varphi(p_\phi, p_{\phi_T}). \tag{5}$$

Asymptotic properties and consistency of these two estimators can be found in [7,11]. Robustness properties were also studied, using the influence function approach, in [11,12]. The kernel-based MDϕDE (5) seems to be a better estimator than the classical MDϕDE (4), in the sense that the former is robust whereas the latter generally is not. Under the model, however, the estimator given by (4) is more efficient, especially when the true density of the data is unbounded. More investigation is needed in the context of unbounded densities, since we may use asymmetric kernels in order to improve the efficiency of the kernel-based MDϕDE; see [11] for more details.

In this paper, we propose to calculate the MDϕDE using an iterative procedure based on the work of Tseng [2] on the log-likelihood function. This procedure has the form of a proximal point
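As a numerical sketch (my own toy example, not one from the paper), the classical MDϕDE (4) can be approximated by grid search for a Gaussian location model $p_\phi = \mathcal{N}(\phi, 1)$ with the modified-KL generator $\varphi(t) = -\log t + t - 1$, for which the text notes that the minimum coincides with the MLE. The data, grids and helper names below are illustrative assumptions, and both the supremum in (2) and the infimum in (4) are approximated crudely over a grid standing in for $\Phi$:

```python
import numpy as np

# Toy model: p_phi = N(phi, 1). Generator: modified KL, phi(t) = -log t + t - 1,
# so phi'(t) = 1 - 1/t and phi#(t) = t*phi'(t) - phi(t) = log t.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.5, scale=1.0, size=500)   # n-sample from an assumed "true" N(1.5, 1)

def pdf(x, mu):
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

xs = np.linspace(-6.0, 9.0, 3001)              # quadrature grid for the integral term
dx = xs[1] - xs[0]

def dual_estimate(mu, alphas):
    """Equation (2): sup over alpha of the integral term minus the empirical term."""
    best = -np.inf
    for a in alphas:
        ratio = pdf(xs, mu) / pdf(xs, a)
        # integral of phi'(p_mu/p_a) * p_mu, via a simple Riemann sum
        integral = np.sum((1.0 - 1.0 / ratio) * pdf(xs, mu)) * dx
        # empirical mean of phi#(p_mu/p_a)(y_i) = log(p_mu(y_i)/p_a(y_i))
        empirical = np.mean(np.log(pdf(y, mu) / pdf(y, a)))
        best = max(best, integral - empirical)
    return best

alphas = np.linspace(0.0, 3.0, 61)             # crude grid standing in for Phi
d_hat = [dual_estimate(mu, alphas) for mu in alphas]
mdphide = alphas[int(np.argmin(d_hat))]        # Equation (4), by grid search
```

For this generator the integral term vanishes (it integrates $p_\phi - p_\alpha$), so the minimiser lands at the grid point nearest the sample mean, i.e. the MLE, matching the claim in the text; swapping in another Cressie–Read generator changes both terms and the robustness behaviour discussed above.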
Differential Geometrical Theory of Statistics
Title
Differential Geometrical Theory of Statistics
Editors
Frédéric Barbaresco
Frank Nielsen
Publisher
MDPI
Location
Basel
Date
2017
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03842-425-3
Size
17.0 x 24.4 cm
Pages
476
Keywords
Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories
Naturwissenschaften Physik