Entropy 2016, 9, 337
as the cost of moving the smoothed density around γ1 to the uniform distribution on the curve, then moving γ1 to γ2, keeping points with equal scaled arclength in correspondence, and finally, moving the uniform distribution on γ2 to the smoothed density.
Having the density at hand, the entropy of the system of curves γ1, . . . , γN is defined the usual way as:

E(γ1, . . . , γN) = − ∫_Ω d̃(x) log(d̃(x)) dx.
The entropy is dependent on the particular choice of the kernel K. As mentioned before, it is a common practice in the field of non-parametric statistics to introduce a tuning parameter ν > 0 in the kernel, called bandwidth, so that it is expressed as a scaled version K = f_ν of a given function f : R+ → R+. The value of ν is the most influential parameter in the estimation of the density and must be selected carefully. For curve clustering applications, it is defined by the desired interaction length: if ν tends to zero, the curves will behave as independent objects, while on the other end of the scale, a very high bandwidth will tend to remove the influence of the curves themselves. For the moment, no automated means of finding an optimal ν has been used, although it will be part of future work.
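To make the definitions above concrete, the following is a minimal numerical sketch, not the paper's implementation: it assumes Ω = [0,1]², a Gaussian kernel (the paper only requires K = f_ν for some f : R+ → R+), approximates the line integral by summing the kernel at polyline segment midpoints weighted by segment lengths, and approximates the domain integral by a uniform grid.

```python
import numpy as np

def smoothed_density(points, grid, nu):
    """Kernel-smoothed density of a polyline, normalized over the grid.

    Assumed Gaussian kernel K(r) = exp(-r^2 / (2 nu^2)); the line integral
    of K(|x - gamma(t)|) |gamma'(t)| dt is approximated by the kernel at
    segment midpoints weighted by segment lengths.
    """
    mids = 0.5 * (points[:-1] + points[1:])
    lens = np.linalg.norm(np.diff(points, axis=0), axis=1)
    dists = np.linalg.norm(grid[:, None, :] - mids[None, :, :], axis=2)
    d = (np.exp(-0.5 * (dists / nu) ** 2) * lens).sum(axis=1)
    return d / d.sum()

def entropy(points, grid, nu):
    """Discrete surrogate of E = -integral of d~ log d~ over the grid."""
    d = smoothed_density(points, grid, nu)
    d = d[d > 0]
    return -(d * np.log(d)).sum()

# uniform grid on Omega = [0,1]^2 and a sample curve
g = np.linspace(0.0, 1.0, 40)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
t = np.linspace(0.0, 1.0, 50)
curve = np.stack([t, 0.5 + 0.2 * np.sin(np.pi * t)], axis=1)

e_small = entropy(curve, grid, nu=0.05)
e_large = entropy(curve, grid, nu=0.3)
# a larger bandwidth flattens the density toward uniform, raising the entropy
print(e_small < e_large)
```

This also illustrates the bandwidth trade-off discussed above: as ν grows, the density approaches the uniform distribution on the grid and the entropy increases accordingly.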
2.4. Minimizing the Entropy
In order to fulfill the initial requirement of finding bundles of curve segments as straight as possible, one seeks after the system of curves minimizing the entropy E(γ1, . . . , γN), or equivalently maximizing:

∫_Ω d̃(x) log(d̃(x)) dx.
The reason why this criterion gives the expected behavior will become more apparent after derivation of its gradient at the end of this part. Nevertheless, when considering a single trajectory, it is intuitive that the most concentrated density distribution is obtained with a straight segment connecting the endpoints: this point will be made rigorous later.
Letting ε be a perturbation of the curve γj, such that ε(0) = ε(1) = 0, the first order expansion of −E(γ1, . . . , γN) will be computed in order to get a maximizing displacement field, analogous to a gradient ascent in the finite dimensional setting (the choice has been made to maximize the opposite of the entropy, so that the algorithm will be a gradient ascent one). The notation:

∂F/∂γj

will be used in the sequel to denote the derivative of a function F of the curve γj, in the sense that for a perturbation ε:

F(γj + ε) = F(γj) + (∂F/∂γj)(ε) + o(‖ε‖₂).
First of all, please note that since d̃ has integral one over the domain Ω:

∫_Ω (∂d̃(x)/∂γj)(ε) dx = 0

so that:

−(∂/∂γj) E(γ1, . . . , γN)(ε) = ∫_Ω (∂d̃(x)/∂γj)(ε) log(d̃(x)) dx. (14)
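Both identities above can be verified by finite differences in a discretized setting. The sketch below is a hedged numerical check, not the paper's method: it assumes a Gaussian kernel on Ω = [0,1]², a grid quadrature, and a central-difference approximation of the directional derivative along a perturbation ε that vanishes at the endpoints.

```python
import numpy as np

def density(points, grid, nu=0.05):
    """Kernel-smoothed density of a polyline, normalized over the grid
    (Gaussian kernel is an assumed stand-in for the generic K)."""
    mids = 0.5 * (points[:-1] + points[1:])
    lens = np.linalg.norm(np.diff(points, axis=0), axis=1)
    dists = np.linalg.norm(grid[:, None, :] - mids[None, :, :], axis=2)
    d = (np.exp(-0.5 * (dists / nu) ** 2) * lens).sum(axis=1)
    return d / d.sum()

g = np.linspace(0.0, 1.0, 30)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
t = np.linspace(0.0, 1.0, 40)
curve = np.stack([t, 0.5 + 0.2 * np.sin(np.pi * t)], axis=1)

# perturbation vanishing at both endpoints, as the derivation requires
eps = np.stack([np.zeros_like(t), np.sin(np.pi * t)], axis=1)

h = 1e-5
d0 = density(curve, grid)
d_plus = density(curve + h * eps, grid)
d_minus = density(curve - h * eps, grid)
dd = (d_plus - d_minus) / (2 * h)   # directional derivative of d~ along eps

# d~ integrates to one for every curve, so its derivative sums to ~0
mass_drift = abs(dd.sum())

# Equation (14): the derivative of -E in the direction eps equals
# the integral of (derivative of d~) times log d~
lhs = ((d_plus * np.log(d_plus)).sum()
       - (d_minus * np.log(d_minus)).sum()) / (2 * h)
rhs = (dd * np.log(d0)).sum()
print(mass_drift, np.isclose(lhs, rhs, rtol=1e-3))
```

The mass-conservation term is what kills the "+1" that would otherwise appear when differentiating d̃ log d̃, which is exactly why Equation (14) contains only the log d̃ factor.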
Starting from the expression of d̃ given in Equation (7), the first order expansion of d̃ with respect to the perturbation ε of γj is obtained as a sum of a term coming from the numerator:

∫_0^1 K(‖x − γj(t)‖) ‖γ′j(t)‖ dt. (15)
- Title: Differential Geometrical Theory of Statistics
- Authors: Frédéric Barbaresco, Frank Nielsen
- Publisher: MDPI
- Place: Basel
- Date: 2017
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03842-425-3
- Dimensions: 17.0 x 24.4 cm
- Pages: 476
- Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories: Natural Sciences, Physics