Entropy 2016, 18, 421
Theorem 1. For $k > 1$ and $N \geq 6$, the first three moments of $W$ are:
$$
E(W) = \frac{k}{N}, \qquad \mathrm{Var}(W) = \frac{\left\{\pi^{(-1)} - (k+1)^2\right\} + 2k(N-1)}{N^3},
$$
and $E[\{W - E(W)\}^3]$ given by
$$
\frac{\left\{\pi^{(-2)} - (k+1)^3\right\} - (3k + 25 - 22N)\left\{\pi^{(-1)} - (k+1)^2\right\} + g(k,N)}{N^5},
$$
where $g(k,N) = 4(N-1)k(k+2N-5) > 0$.

In particular, for fixed $k$ and $N$, as $\pi_{\min} \to 0$,
$$
\mathrm{Var}(W) \to \infty \quad \text{and} \quad \gamma(W) \to +\infty,
$$
where $\gamma(W) := E[\{W - E(W)\}^3]/\{\mathrm{Var}(W)\}^{3/2}$.
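To make the formulae concrete, here is a minimal numerical sketch (not from the paper): `wald_moments` is a hypothetical helper, the notation $\pi^{(-a)} := \sum_{i=0}^{k} \pi_i^{-a}$ is assumed, and the coefficients are transcribed directly from the theorem as printed. It illustrates the stated blow-up of $\mathrm{Var}(W)$ and $\gamma(W)$ as $\pi_{\min} \to 0$ for fixed $k$ and $N$.

```python
import numpy as np

def wald_moments(pi, N):
    """Evaluate the Theorem 1 formulae for E(W), Var(W) and the skewness gamma(W).

    pi : cell probabilities (pi_0, ..., pi_k), assumed positive and summing to 1
    N  : sample size (the theorem assumes k > 1 and N >= 6)
    """
    pi = np.asarray(pi, dtype=float)
    k = len(pi) - 1
    s1 = np.sum(pi ** -1.0)   # pi^(-1), assumed to mean sum_i pi_i^{-1}
    s2 = np.sum(pi ** -2.0)   # pi^(-2), assumed to mean sum_i pi_i^{-2}
    mean = k / N
    var = ((s1 - (k + 1) ** 2) + 2 * k * (N - 1)) / N ** 3
    g = 4 * (N - 1) * k * (k + 2 * N - 5)
    third = ((s2 - (k + 1) ** 3)
             - (3 * k + 25 - 22 * N) * (s1 - (k + 1) ** 2)
             + g) / N ** 5
    return mean, var, third / var ** 1.5   # (E(W), Var(W), gamma(W))

# As pi_min shrinks (fixed k = 3, N = 50), Var(W) and gamma(W) blow up.
for eps in (0.1, 0.01, 0.001):
    pi = [eps] + [(1 - eps) / 3] * 3
    print(eps, wald_moments(pi, N=50))
```

Note that $E(W) = k/N$ is unaffected by $\pi$; only the variance and skewness are boundary-sensitive.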
A detailed proof is found in Appendix A, and we give here an outline of its important features. The machinery developed is capable of delivering much more than a proof of Theorem 1. As indicated there, it provides a generic way to explicitly compute arbitrary moments or mixed moments of multinomial counts, and could in principle be implemented by computer algebra. Overall, there are four stages. First, a key recurrence relation is established; secondly, it is exploited to deliver moments of a single cell count. Third, mixed moments of any order are derived from those of lower order, exploiting a certain functional dependence. Finally, results are combined to find the first three moments of $W$, higher moments being similarly obtainable.
The practical implication of Theorem 1 is that standard first (and higher-order) asymptotic approximations to the sampling distribution of the Wald, $\chi^2$, and score statistics break down when the data generation process is "close to" the boundary, where at least one cell probability is zero. This result is qualitatively similar to results in [10], which shows how asymptotic approximations to the distribution of the maximum likelihood estimate fail; for example, in the case of logistic regression, when the boundary is close in terms of distances as defined by the Fisher information.
Unlike the statistics considered in Theorem 1, the deviance has a workable distribution in the same limit: that is, for fixed $N$ and $k$, as we approach the boundary of the probability simplex. In sharp contrast to that theorem, we see the very stable and workable behaviour of the $k$-asymptotic approximation to the distribution of the deviance, in which the number of cells increases without limit.
Define the deviance $D$ via
$$
D/2 \;=\; \sum_{\{0 \le i \le k \,:\, n_i > 0\}} n_i \log(n_i/N) \;-\; \sum_{i=0}^{k} n_i \log(\pi_i)
\;=\; \sum_{\{0 \le i \le k \,:\, n_i > 0\}} n_i \log(n_i/\mu_i),
$$
where $\mu_i := E(n_i) = N\pi_i$. We will exploit the characterisation that the multinomial random vector
$(n_i)$ has the same distribution as a vector of independent Poisson random variables conditioned on their sum. Specifically, let the elements of $(n^*_i)$ be independently distributed as Poisson $\mathrm{Po}(\mu_i)$. Then, $N^* := \sum_{i=0}^{k} n^*_i \sim \mathrm{Po}(N)$, while $(n_i) := (n^*_i \mid N^* = N) \sim \mathrm{Multinomial}(N, (\pi_i))$. Define the vector
$$
S^* := \begin{pmatrix} N^* \\ D^*/2 \end{pmatrix} \;=\; \sum_{i=0}^{k} \begin{pmatrix} n^*_i \\ n^*_i \log(n^*_i/\mu_i) \end{pmatrix},
$$
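A minimal numerical sketch of the two constructions above (illustrative only; `deviance` is a hypothetical helper, not from the paper): the deviance is computed over non-empty cells, and a Monte-Carlo check confirms that independent $\mathrm{Po}(\mu_i)$ draws, conditioned on their sum equalling $N$, behave like $\mathrm{Multinomial}(N, (\pi_i))$ draws.

```python
import numpy as np

def deviance(n, pi):
    """D = 2 * sum over cells with n_i > 0 of n_i log(n_i / mu_i), mu_i = N pi_i."""
    n = np.asarray(n, dtype=float)
    mu = n.sum() * np.asarray(pi, dtype=float)
    nz = n > 0                           # empty cells contribute nothing
    return 2.0 * np.sum(n[nz] * np.log(n[nz] / mu[nz]))

rng = np.random.default_rng(0)
N, pi = 5, np.array([0.2, 0.3, 0.5])
mu = N * pi                              # Poisson means mu_i = N pi_i, summing to N

# Independent n*_i ~ Po(mu_i); keep draws with N* = sum_i n*_i equal to N.
draws = rng.poisson(mu, size=(200_000, len(pi)))
cond = draws[draws.sum(axis=1) == N]     # these rows are Multinomial(N, pi) draws

# Conditional cell means should match the multinomial means N * pi_i.
print(cond.mean(axis=0))                 # approximately [1.0, 1.5, 2.5]

# Deviance of a table that exactly matches its expected counts is zero.
print(deviance([1, 1], [0.5, 0.5]))
```

The conditioning step is exactly the characterisation in the text: rejection sampling on the event $\{N^* = N\}$ turns the independent Poisson vector into a multinomial one.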