Page - 421 - in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 425
7.2. Priors and Low-Rank Estimation
The low-rank cometric formulation pursued in Section 5 gives a natural restriction of (21) to
u ∈ F^k M, 1 ≤ k ≤ d. As for Euclidean PCA, most variance is often captured in the span of the first
k eigenvectors with k ≪ d. Estimates of the remaining eigenvectors are generally ignored, as the
variance of the eigenvector estimates increases as the noise captured in the span of the last eigenvectors
becomes increasingly uniform. The low-rank cometric restricts the estimation to only the first k
eigenvectors, and thus builds the construction directly into the model. In addition, it makes numerical
implementation feasible, because a numerical representation need only store and evolve d × k matrices.
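The Euclidean PCA analogy above can be illustrated with a minimal NumPy sketch. It is not the frame bundle computation of the paper: the synthetic data, the helper name `top_k_frame`, and the choices d = 50, k = 3 are assumptions made purely for illustration. It shows that when the variance is concentrated in a few directions, a d × k matrix of scaled leading eigenvectors retains most of it.

```python
import numpy as np

def top_k_frame(samples, k):
    """Hypothetical helper: return a d x k matrix whose columns are the k
    leading covariance eigenvectors scaled by the square roots of their
    eigenvalues, plus the fraction of total variance they capture."""
    centered = samples - samples.mean(axis=0)
    cov = centered.T @ centered / (len(samples) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    lead = np.clip(eigvals[:k], 0.0, None)        # guard against tiny negatives
    frame = eigvecs[:, :k] * np.sqrt(lead)        # d x k low-rank representation
    explained = eigvals[:k].sum() / eigvals.sum()
    return frame, explained

# Synthetic data (assumption): 3 dominant directions in 50 dimensions
# plus weak isotropic noise.
rng = np.random.default_rng(0)
d, k, n = 50, 3, 2000
basis, _ = np.linalg.qr(rng.standard_normal((d, k)))   # d x k orthonormal columns
scales = np.array([5.0, 3.0, 2.0])
signal = (rng.standard_normal((n, k)) * scales) @ basis.T
noise = 0.1 * rng.standard_normal((n, d))
samples = signal + noise

frame, explained = top_k_frame(samples, k)
print("stored matrix shape:", frame.shape)
print("fraction of variance captured:", explained)
```

Only the d × k array `frame` needs to be stored, compared with d × d entries for a full-rank representation.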
As a different approach for regularizing the estimator (21), the normalizing term −N log(det_{g_R} u) can
be extended with other priors (e.g., an L1-type penalizing term). Such priors can potentially partly
remove existence and uniqueness issues, and result in additional sparsity properties that can benefit
numerical implementations. The effects of such priors have yet to be investigated.
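One hypothetical way to write such an extension is sketched below. It assumes that (21) is posed as a minimization and writes J(u) for its full objective (data term plus normalizing term), which is not reproduced here. The weight λ and the entrywise norm ‖u‖₁ of a coordinate representation of the frame are illustrative assumptions, not taken from the paper.

```latex
\hat{u}_{\lambda} \in \operatorname*{arg\,min}_{u} \; \Bigl( J(u) + \lambda \, \|u\|_{1} \Bigr),
\qquad \lambda \ge 0 ,
```

where the minimization runs over the same domain as in (21). Under these assumptions, λ = 0 recovers the unpenalized estimator, and larger λ favors frames with more vanishing coordinate entries.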
In the k = d case, the number of degrees of freedom for the MPPs grows quadratically in the
dimension d. This naturally increases the variance of any MPP estimate given only one sample from its
trajectory. The low-rank cometric formulation reduces the growth to linear in d. The number of degrees
of freedom is however still k times larger than for Riemannian geodesics. With longitudinal data, more
samples per trajectory can be obtained, reducing the variance and allowing a better estimate of the
MPP. However, for the estimators (20) and (21) above, estimates of the actual optimal MPPs are not
needed, only their squared length. It can be hypothesized that the variance of the length estimates is
lower than the variance of the estimates of the corresponding MPPs. Further investigation regarding
this will be the subject of future work.
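The scaling described above can be made concrete with a small leading-order count. The function names and the exact counts (d parameters for a geodesic, d·k for an MPP with a rank-k frame) are illustrative assumptions chosen to match the stated growth rates, not formulas taken from the paper.

```python
def geodesic_dof(d):
    """Leading-order count for a Riemannian geodesic (assumption: d parameters)."""
    return d

def mpp_dof(d, k):
    """Leading-order count for an MPP with a rank-k frame (assumption: d * k)."""
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= d")
    return d * k

# Compare geodesics, a low-rank MPP (k = 2), and the full-rank case (k = d).
for d in (3, 10, 100):
    print(d, geodesic_dof(d), mpp_dof(d, 2), mpp_dof(d, d))
```

With k fixed, the count grows linearly in d and stays k times the geodesic count, while k = d gives quadratic growth.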
7.3. Conclusions
The underlying model of anisotropy used in this paper originates from the anisotropic normal
distributions formulated in [2] and the diffusion PCA framework [1]. Because many statistical models
are defined using normal distributions, this approach to incorporating anisotropy extends to models
such as linear regression. We expect that finding most probable paths in other statistical models such as
regression models can be carried out with a program similar to the program presented in this paper.
The difference between MPPs and geodesics shows that the geometric and metric properties of
geodesics, namely zero acceleration and local distance minimization, are not directly related to statistical
properties such as maximizing path probability. Whereas the concrete application and model
determine whether metric or statistical properties are fundamental, most statistical models are formulated
without referring to metric properties of the underlying space. It can therefore be argued that the direct
incorporation of anisotropy and the resulting MPPs are natural in the context of many models of data
variation in non-linear spaces.
Acknowledgments: The author wishes to thank Peter W. Michor and Sarang Joshi for suggestions for the
geometric interpretation of the sub-Riemannian metric on FM and discussions on diffusion processes on
manifolds. The work was supported by the Danish Council for Independent Research, the CSGB Centre
for Stochastic Geometry and Advanced Bioimaging funded by a grant from the Villum Foundation, and the
Erwin Schrödinger Institute in Vienna.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Sommer, S. Diffusion Processes and PCA on Manifolds. Available online:
https://www.mfo.de/document/1440a/OWR_2014_44.pdf (accessed on 24 November 2016).
2. Sommer, S. Anisotropic distributions on manifolds: Template estimation and most probable paths.
In Information Processing in Medical Imaging; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg,
Germany, 2015; Volume 9123, pp. 193–204.
3. Sommer, S.; Svane, A.M. Modelling anisotropic covariance using stochastic development and sub-Riemannian
frame bundle geometry. J. Geom. Mech. 2016, in press.
Differential Geometrical Theory of Statistics
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Editor
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Size
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics