Page - 326 - in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 421
Section 2 gives formal proofs of two results, Theorems 1 and 2, which were announced in [2]. These
results explore the sampling performance of standard goodness-of-fit statistics (Wald, Pearson's $\chi^2$,
score and deviance) in the sparse setting. In particular, they look at the case where the data generation
process is "close to the boundary" of the parameter space where one or more cell probabilities vanish.
This complements results in much of the literature, where the centre of the parameter space, i.e.,
the uniform distribution, is often the focus of attention.
Section 3 starts with a review of the links between Information Geometry (IG) [3] and
goodness-of-fit testing. In particular, it looks at the power family of Cressie and Read [4,5] in terms of
the geometric theory of divergences. In the case of regular exponential families, these links have been
well-explored in the literature [6], as has the corresponding sampling behaviour [7]. What is novel here
is the exploration of the geometry with respect to the closure of the exponential family; i.e., the extended
multinomial model, a key tool in CIG. We illustrate how the boundary can dominate the statistical
properties in ways that are surprising compared to standard (and even high-order) analyses, which
are asymptotic in sample size.
Through simulation experiments, Section 4 explores the consequences of working in the
sparse multinomial setting, with the design of the numerical experiments being inspired by the
information geometry.
2. Sampling Distributions in the Sparse Case
One of the first major impacts that information geometry had on statistical practice was through
the geometric analysis of higher order asymptotic theory (e.g., [8,9]). Geometric interpretations
and invariant expressions of terms in the higher order corrections to approximations of sampling
distributions are a good example, [8] (Chapter 4). Geometric terms are used to correct for skewness and
other higher order moment (cumulant) issues in the sampling distributions. However, these correction
terms grow very large near the boundary [1,10]. Since this region plays a key role in modelling in the
sparse setting, with the maximum likelihood estimator (MLE) often being on the boundary, extensions to
the classical theory are needed. This paper, together with [2], starts such a development. This work
is related to similar ideas in categorical, (hierarchical) log-linear, and graphical models [1,11–13].
As stated in [13], "their statistical properties under sparse settings are still very poorly understood.
As a result, analysis of such data remains exceptionally difficult".
In this section we show why the Wald (equivalently, the Pearson $\chi^2$ and score) statistics are
unworkable when near the boundary of the extended multinomial model, but that the deviance has a
simple, accurate, and tractable sampling distribution, even for moderate sample sizes. We also show
how the higher moments of the deviance are easily computable, in principle allowing for higher order
adjustments. However, we also make some observations about the appropriateness of these classical
adjustments in Section 4.
First, we define some notation, consistent with that of [2]. With $i$ ranging over $\{0,1,\ldots,k\}$, let
$n = (n_i) \sim \mathrm{Multinomial}(N, (\pi_i))$, where here each $\pi_i > 0$. In this context, the Wald, Pearson's $\chi^2$,
and score statistics all coincide, their common value, $W$, being
$$W := \sum_{i=0}^{k} \frac{(\pi_i - n_i/N)^2}{\pi_i} \equiv \frac{1}{N^2} \sum_{i=0}^{k} \frac{n_i^2}{\pi_i} - 1.$$
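The equivalence of the two forms of $W$ follows from expanding the square and using $\sum_i \pi_i = 1$ and $\sum_i n_i = N$. As a minimal sketch, it can be checked in exact rational arithmetic; the function names, cell probabilities, and counts below are illustrative assumptions, not values from the paper.

```python
from fractions import Fraction


def wald_statistic(counts, probs):
    """W = sum_i (pi_i - n_i/N)^2 / pi_i, for strictly positive pi_i."""
    total = sum(counts)
    return sum((p - Fraction(n, total)) ** 2 / p for n, p in zip(counts, probs))


def wald_statistic_short(counts, probs):
    """Equivalent form: (1/N^2) * sum_i n_i^2 / pi_i - 1."""
    total = sum(counts)
    return sum(Fraction(n * n) / p for n, p in zip(counts, probs)) / total**2 - 1


# Hypothetical example: k + 1 = 4 cells, N = 10 observations, one empty cell.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 5), Fraction(1, 20)]
counts = [6, 3, 1, 0]

w_long = wald_statistic(counts, probs)
w_short = wald_statistic_short(counts, probs)
print(w_long, w_short)  # both forms agree exactly
```

Exact fractions are used here only so that the two forms can be compared with `==`; in floating point they would agree up to rounding.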
Defining $\pi^{(\alpha)} := \sum_i \pi_i^{\alpha}$, we note the inequality, for each $m \ge 1$,
$$\pi^{(-m)} - (k+1)^{m+1} \ge 0,$$
in which equality holds if and only if $\pi_i \equiv 1/(k+1)$, i.e., iff $(\pi_i)$ is uniform. We then have the following
theorem, which establishes that the statistic $W$ is unworkable as $\pi_{\min} := \min(\pi_i) \to 0$ for fixed $k$ and $N$.
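The inequality above can be illustrated numerically. The following is a small sketch under assumed, illustrative probability vectors (the helper names `power_sum` and `gap` are hypothetical): the gap is zero for the uniform distribution and strictly positive otherwise, and it is dominated by the smallest cell probability raised to the power $-m$.

```python
from fractions import Fraction


def power_sum(probs, alpha):
    """pi^(alpha) := sum_i pi_i^alpha, for an integer exponent alpha."""
    return sum(p ** alpha for p in probs)


def gap(probs, m):
    """pi^(-m) - (k+1)^(m+1), where len(probs) = k + 1."""
    return power_sum(probs, -m) - len(probs) ** (m + 1)


# Illustrative vectors with k + 1 = 4 cells (assumed values, not from the paper).
uniform = [Fraction(1, 4)] * 4
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 5), Fraction(1, 20)]

gaps_uniform = [gap(uniform, m) for m in (1, 2, 3)]
gaps_skewed = [gap(skewed, m) for m in (1, 2, 3)]
print(gaps_uniform)  # equality case: every gap is zero
print(gaps_skewed)   # strictly positive, growing with m
```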
Differential Geometrical Theory of Statistics
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Editor
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Size
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics