Page 446 in Differential Geometrical Theory of Statistics
Entropy 2016, 18, 110
Ongoing work of the author is considering a systematic analysis of language families, based on
the SSWL database of syntactic parameters, using this coding theory technique. This will include an
analysis of how much the conclusions about the spreading of syntactic parameters across language
families obtained with this technique depend on data pre-processing, like the removal of spoiling
features, and what can be retained as an objective property of a set of languages. Moreover, a further
purpose of this ongoing study is to combine the coding theory approach and the measures of complexity
for groups of languages described in the present paper with the spin glass dynamical models of
language change considered in [8], which was aimed at studying dynamically the spreading of
syntactic parameters across groups of languages. The aim is to introduce complexity measures based
on coding theory as part of the energy landscape of the spin glass model, following the suggestion
of [28], on analogies between the roles of complexity in the theory of computation and energy in
physical theories. These results, along with a more detailed analysis of the codes and code parameters
of various language families, will appear in forthcoming work.
2.6. Comparison with Other Bounds
Another possible question one can consider in this setting is how the codes obtained from syntactic
parameters of a given set of natural languages compare with other known families of error correcting
codes and with other bounds in the space of code parameters.
For instance, it is known that an important improvement over the behavior of typical random
codes can be obtained by considering codes determined by algebro-geometric curves defined over
a finite field $\mathbb{F}_q$. Let $N_q(X) = \# X(\mathbb{F}_q)$ be the number of points over $\mathbb{F}_q$ of the curve $X$, and let
$N_q(g) = \max N_q(X)$, with the maximum taken over all genus $g$ curves $X$ over $\mathbb{F}_q$. As shown in
Theorem 2.3.22 of [12], asymptotically the $N_q(g)$ satisfy the Drinfeld–Vladut bound
$$ A(q) := \limsup_{g \to \infty} \frac{N_q(g)}{g} \leq \sqrt{q} - 1, $$
and as shown in Section 3.4.1 of [12], this determines an algebro-geometric bound
$$ \alpha_q(\delta) \geq R_{AG}(\delta) = 1 - \frac{1}{A(q)} - \delta $$
and the asymptotic Tsfasman–Vladut–Zink bound
$$ \alpha_q(\delta) \geq R_{TVZ}(\delta) = 1 - (\sqrt{q} - 1)^{-1} - \delta. $$
The Tsfasman–Vladut–Zink line $R_{TVZ}(\delta) = 1 - (\sqrt{q} - 1)^{-1} - \delta$ lies entirely below the GV line for
$q < 49$ (Theorem 3.4.4 of [12]).
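The comparison between the TVZ line and the GV line can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard form of the GV line, $R_{GV}(\delta) = 1 - H_q(\delta)$ with $H_q$ the $q$-ary entropy, and it evaluates the TVZ formula on a grid of $\delta$ in $(0, (q-1)/q)$. The function names and the grid size are illustrative choices. The values of $q$ used are $2$ and $3$ (the cases relevant for syntactic parameters) and a few squares of primes, since the value $A(q) = \sqrt{q} - 1$ is the one attained when $q$ is a square.

```python
import math


def q_ary_entropy(delta: float, q: int) -> float:
    """q-ary entropy H_q(delta), for 0 < delta < 1."""
    return (
        delta * math.log(q - 1, q)
        - delta * math.log(delta, q)
        - (1 - delta) * math.log(1 - delta, q)
    )


def gv_rate(delta: float, q: int) -> float:
    """GV line R_GV(delta) = 1 - H_q(delta) (standard form, assumed here)."""
    return 1.0 - q_ary_entropy(delta, q)


def tvz_rate(delta: float, q: int) -> float:
    """TVZ line R_TVZ(delta) = 1 - (sqrt(q) - 1)^(-1) - delta."""
    return 1.0 - 1.0 / (math.sqrt(q) - 1.0) - delta


def max_tvz_minus_gv(q: int, steps: int = 2000) -> float:
    """Largest value of R_TVZ - R_GV on a grid of delta in (0, (q-1)/q)."""
    upper = (q - 1) / q
    grid = (upper * k / steps for k in range(1, steps))
    return max(tvz_rate(d, q) - gv_rate(d, q) for d in grid)


if __name__ == "__main__":
    for q in (2, 3, 4, 9, 25, 49):
        gap = max_tvz_minus_gv(q)
        position = "rises above" if gap > 0 else "stays below"
        print(f"q = {q:2d}: TVZ line {position} the GV line on the grid")
```

A negative maximum means the TVZ line stays below the GV line at every grid point; a grid check of this kind illustrates the statement but is not a proof of it.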
A probabilistic argument given in Section 3.4.2 of [12] shows that highly non-random codes
coming from algebraic curves can be asymptotically better than random codes (for sufficiently large q),
as they cluster around the TVZ line. However, for q = 2 or q = 3, as in the case of codes from syntactic
parameters of groups of languages that we consider here, the TVZ line lies below the GV line, hence
any example that lies above the GV bound also behaves better than the algebro-geometric bound.
Such examples, like the one given above for the three languages Arabic, Wolof, Basque, are very rare
among codes obtained from syntactic parameters of languages, as they require the choice of a group
of languages that are all very far from each other syntactically, with very large relative Hamming
distances between syntactic parameters.
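To make the role of the relative Hamming distance concrete, here is a small sketch under stated assumptions. It uses the usual definitions of code parameters for a binary code $C$ of length $n$, namely $R = \log_2(\# C)/n$ and $\delta = d/n$ with $d$ the minimum Hamming distance, and the binary GV line $1 - H_2(\delta)$ for $\delta < 1/2$ (taken to be $0$ for $\delta \geq 1/2$). The two sets of binary strings are invented toy data, not SSWL parameters of any actual languages, and the function names are hypothetical.

```python
import math
from itertools import combinations


def hamming_distance(u: str, v: str) -> int:
    """Number of positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(u, v))


def code_parameters(codewords: list[str]) -> tuple[float, float]:
    """Return (R, delta) for a binary code given as equal-length 0/1 strings."""
    n = len(codewords[0])
    d = min(hamming_distance(u, v) for u, v in combinations(codewords, 2))
    rate = math.log2(len(codewords)) / n
    return rate, d / n


def gv_rate_binary(delta: float) -> float:
    """Binary GV line 1 - H_2(delta); taken as 0 for delta >= 1/2."""
    if delta <= 0.0:
        return 1.0
    if delta >= 0.5:
        return 0.0
    h2 = -delta * math.log2(delta) - (1 - delta) * math.log2(1 - delta)
    return 1.0 - h2


def is_above_gv(codewords: list[str]) -> bool:
    """True when the code point (delta, R) lies strictly above the GV line."""
    rate, delta = code_parameters(codewords)
    return rate > gv_rate_binary(delta)


# Toy data only: three mutually distant vectors and three close ones.
far_apart = ["000000", "111100", "001111"]
close_together = ["000000", "000001", "000011"]

if __name__ == "__main__":
    for name, code in (("far apart", far_apart), ("close together", close_together)):
        rate, delta = code_parameters(code)
        print(f"{name}: R = {rate:.3f}, delta = {delta:.3f}, above GV: {is_above_gv(code)}")
```

Both toy codes have the same rate, so only the minimum distance separates them: the mutually distant vectors give a point above the GV line, while the nearly identical ones fall below it.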
On the other hand, even for cases of groups of languages for which the resulting code parameters
are below the GV line, it is still possible to get some additional information by comparing the position
of the code parameters to other curves obtained from other bounds, such as the Blokh–Zyablov
- Title
- Differential Geometrical Theory of Statistics
- Authors
- Frédéric Barbaresco
- Frank Nielsen
- Publisher
- MDPI
- Location
- Basel
- Date
- 2017
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03842-425-3
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 476
- Keywords
- Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
- Categories
- Natural Sciences, Physics