Page - 331 - in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 421
and discrete graphical models. Testing is often used to check the consistency of a parametric model
with given data, and to check dependency assumptions such as independence between categorical
variables. However, we note an important caveat: as pointed out by [14,15], the fact that a parametric
model "passes" a goodness-of-fit test only weakly constrains the resulting inference. The essential
point here is that goodness-of-fit is a necessary, but not sufficient, condition for model choice,
since, in general, many models will be empirically supported. This issue has recently been explored
geometrically in [16] using CIG.
There have been many possible test statistics proposed for goodness-of-fit testing, and one of
the attractions of the Power-Divergence family, defined in (11), is that the most important ones are
included in the family and indexed by a single scalar λ. Of course, when there is a choice of test
statistic, different inferences can result from different choices. One of the main themes of [5] is to
give the analyst insight about selecting a particular λ. Key considerations for making the selection
of λ include the tractability of the sampling distribution, its power against important alternatives,
and interpretation when hypotheses are rejected.
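To make the role of λ concrete, here is a minimal Python sketch of the family. Equation (11) is not reproduced on this page, so the code assumes the standard Cressie-Read form, 2/(λ(λ+1)) Σ O_i[(O_i/E_i)^λ − 1], with the limiting cases λ → 0 and λ → −1 handled separately. The function name `power_divergence_stat` and the example counts are illustrative assumptions, not taken from the source.

```python
import math

def power_divergence_stat(observed, expected, lam):
    """Power-divergence statistic in the (assumed) standard Cressie-Read form.

    Assumes sum(observed) == sum(expected) and expected counts > 0.
    For lam < 0 (including lam = -1) all observed counts must be positive.
    """
    pairs = list(zip(observed, expected))
    if abs(lam) < 1e-12:
        # Limit lam -> 0: likelihood-ratio statistic G^2 (empty cells contribute 0).
        return 2.0 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    if abs(lam + 1.0) < 1e-12:
        # Limit lam -> -1: the divergence with the roles of O and E exchanged.
        return 2.0 * sum(e * math.log(e / o) for o, e in pairs)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in pairs)
    return 2.0 * total / (lam * (lam + 1.0))

if __name__ == "__main__":
    observed = [18, 22, 30, 30]          # hypothetical counts
    expected = [25.0, 25.0, 25.0, 25.0]  # fitted counts under a uniform null
    for lam in (-1.0, -0.5, 0.0, 2.0 / 3.0, 1.0):
        print(f"lambda = {lam:+.3f}: {power_divergence_stat(observed, expected, lam):.4f}")
    # lam = 1 is Pearson's X^2, which is 108/25 = 4.32 for these counts.
```

With these counts the members of the family give similar but not identical values, which is the sense in which different choices of λ can lead to different inferences.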
The first-order, asymptotic in N, χ² sampling distribution for all members of the Power-Divergence
family, which is appropriate when all observed counts are "large enough", is the most commonly
used tool, and a very attractive feature of the family. However, this can fail badly in the "sparse" case
and when the model is close to the boundary. Elementary, moment-based corrections, to improve
small-sample performance, are discussed in [5] (Chapter 5). More formal asymptotic approaches to
these issues include the doubly asymptotic, in N and k, approach of [17], discussed in Section 2, and
similar normal approximation ideas in [18]. See also [19]. Extensive simulation experiments have been
undertaken to learn in practice what "large enough" means; see [5,20,21].
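A small simulation of the kind referred to above can be sketched as follows. It is only an illustration under stated assumptions: a fully specified multinomial null with k = 10 cells, Pearson's statistic (λ = 1), and the tabulated upper 5% point of χ² with 9 degrees of freedom (16.919). The function names, cell probabilities and sample sizes are hypothetical choices, not taken from the cited studies.

```python
import random

# Upper 5% point of the chi-squared distribution with 9 degrees of freedom
# (standard table value); it matches k - 1 = 9 for the k = 10 cells used below.
CHI2_CRIT_95_DF9 = 16.919

def pearson_statistic(counts, probs, n):
    """Pearson's X^2 for observed counts against a fully specified null."""
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))

def rejection_rate(probs, n, reps=2000, seed=1):
    """Fraction of samples drawn under the null that exceed the chi-squared cutoff.

    If the first-order approximation were exact, this would be close to 0.05.
    """
    rng = random.Random(seed)
    k = len(probs)
    rejections = 0
    for _ in range(reps):
        counts = [0] * k
        for cell in rng.choices(range(k), weights=probs, k=n):
            counts[cell] += 1
        if pearson_statistic(counts, probs, n) > CHI2_CRIT_95_DF9:
            rejections += 1
    return rejections / reps

if __name__ == "__main__":
    uniform = [0.1] * 10             # every expected count is n / 10
    skewed = [0.91] + [0.01] * 9     # nine cells close to the boundary
    print("large counts :", rejection_rate(uniform, n=500))
    print("sparse counts:", rejection_rate(skewed, n=10))
```

Comparing the two printed rates with the nominal 0.05 gives a rough, setting-specific indication of whether the counts are "large enough"; it does not replace the corrections and asymptotic analyses cited in the text.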
When there are nuisance parameters to be estimated (as is common), [22] points out that it is the
sampling distribution conditional upon these estimates which needs to be approximated, and proposes
higher-order methods based on the Edgeworth expansion. Simulation approaches are often used
in the conditional context due to the common intractability of the conditional distribution [23,24],
and importance sampling methods play an important role; see [25–27]. Other approaches used to
investigate the sampling distribution include jackknifing [28], the Chen–Stein method [29], and detailed
asymptotic analysis in [30–32].
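As one simple instance of simulating a conditional sampling distribution, consider testing independence in a two-way table, where the row and column margins play the role of the estimated nuisance parameters. Under independence, the distribution of the table given both margins can be sampled by randomly permuting the column labels of the individual observations. The sketch below uses this plain Monte Carlo scheme with Pearson's statistic; it is not the importance sampling or Edgeworth-based methods of the cited references, and the function names and example tables are hypothetical.

```python
import random

def pearson_independence(table):
    """Pearson's X^2 for independence; assumes every margin is positive."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            e = r * c / n  # fitted count under independence
            stat += (table[i][j] - e) ** 2 / e
    return stat

def conditional_mc_pvalue(table, reps=2000, seed=0):
    """Monte Carlo p-value conditional on the observed row and column margins."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per observation.
    row_labels, col_labels = [], []
    for i in range(n_rows):
        for j in range(n_cols):
            row_labels += [i] * table[i][j]
            col_labels += [j] * table[i][j]
    observed = pearson_independence(table)
    hits = 0
    for _ in range(reps):
        rng.shuffle(col_labels)  # keeps both margins fixed
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            sim[i][j] += 1
        if pearson_independence(sim) >= observed - 1e-12:
            hits += 1
    # Counting the observed table among the draws keeps the estimate above zero.
    return (hits + 1) / (reps + 1)

if __name__ == "__main__":
    print(conditional_mc_pvalue([[8, 2], [1, 9]]))  # strongly dependent toy table
    print(conditional_mc_pvalue([[5, 5], [5, 5]]))  # table matching independence exactly
```

Any other member of the Power-Divergence family could be substituted for Pearson's statistic here, since the permutation step does not depend on the statistic being used.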
In very high dimensional model spaces, considerations of the power of tests rarely generate
uniformly best procedures but, we feel, geometry can be an important tool in understanding the
choices that need to be made. Further, [5] states that the situation is "complicated", showing this through
simulation experiments. One of the reasons for Read and Cressie's preferred choice of λ = 2/3 is its
good power against some important types of alternative (the so-called bump or dip cases), as well as
the relative tractability of its sampling distribution under the null. Other considerations about power
can be found in [33], which looks specifically at mixture-model-based alternatives.
3.3. Links with Information Geometry
At the time that the Power-Divergence family was being examined, there was a parallel
development in Information Geometry; oddly, however, it seems to have taken some time before
the links between the two areas were fully recognised. A good treatment of these links can be
found in [6] (Chapter 9). Since it is important to understand the extreme values of divergence
functions, considerations of convexity can clearly play an important role. The general class of Bregman
divergences, [6,34] (page 240), and [35] (page 13), is very useful here. For each Bregman divergence,
there will exist affine parameters of the exponential family in which the divergence function is convex.
In the class of product Poisson models, which are the key building blocks of log-linear models, all
members of the Power-Divergence family have the Bregman property. These are then α-divergences,
capable of generating the complete Information Geometry of the model [35], with the link between α
and λ given in Table 1. The α-representation highlights the duality properties, which are a cornerstone
of Information Geometry, but which are rather hidden in the λ representation. The Bregman divergence
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Editor
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Size
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics