Differential Geometrical Theory of Statistics
Entropy 2016, 18, 110

The Gilbert–Varshamov curve can be characterized in terms of the behavior of sufficiently random codes, in the sense of the Shannon Random Code Ensemble, see [26,27], while the asymptotic bound can be characterized in terms of Kolmogorov complexity, see [16].

2.5. Code Parameters of Language Families

From the coding theory viewpoint, it is natural to ask whether there are codes C, formed out of a choice of a collection of natural languages and their syntactic parameters, whose code parameters lie above the asymptotic bound curve R = α_2(δ).

For instance, a code C whose code parameters violate the Plotkin bound (5) must be an isolated code above the asymptotic bound. This means constructing a code C with δ ≥ 1/2, that is, such that any pair of codewords w ≠ w′ ∈ C differ by at least half of the parameters. A direct examination of the list of parameters in Table A of [3] and Figure 7 of [4] shows that it is very difficult to find, within the same historical linguistic family (e.g., the Indo-European family), pairs of languages L1, L2 with δ_H(L1, L2) ≥ 1/2. For example, among the syntactic relative distances listed in Figure 7 of [4] one finds only the pair (Farsi, Romanian) with a relative distance of 0.5. Other pairs come close to this value, for example Farsi and French have a relative distance of 0.483, but French and Romanian only differ by 0.162.

One has better chances of obtaining codes above the asymptotic bound if one compares languages that are not so closely related at the historical level.

Example 2. Consider the set C = {L1, L2, L3} with languages L1 = Arabic, L2 = Wolof, and L3 = Basque. We exclude from the list of Table A of [3] all those parameters that are entailed and made irrelevant by some other parameter in at least one of these three chosen languages. This gives us a list of 25 remaining parameters, which are those numbered as 1–5, 7, 10, 20–21, 25, 27–29, 31–32, 34, 37, 42, 50–53, 55–57 in [3], and the following three codewords:

L1   1 1 1 1 1 1 0 1 0 1 0 1 0 1 1 1 1 1 1 0 1 0 0 0 0
L2   1 1 1 0 0 1 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 1 1 1 1
L3   1 1 0 1 0 0 1 0 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 0 0

This example, although very simple and quite artificial in the choice of languages, already suffices to produce a code C that lies above the asymptotic bound. In fact, we have d_H(L1, L2) = 16, d_H(L2, L3) = 13 and d_H(L1, L3) = 13, so that δ = 0.52. Since R > 0, the code point (δ, R) violates the Plotkin bound, hence it lies above the asymptotic bound.

It would be more interesting to find a code C consisting of languages belonging to the same historical-linguistic family (outside of the Indo-European group), that lies above the asymptotic bound. Such examples would correspond to linguistic families that exhibit a very strong variability of the syntactic parameters, in a way that is quantifiable through the properties of C as a code.

If one considers the 22 Indo-European languages in [3] with their parameters, one obtains a code C that is below the Gilbert–Varshamov line, hence below the asymptotic bound by Equation (8). A few other examples, taken from other non Indo-European historical-linguistic families, computed using those parameters reported in the SSWL database (for example the set of Malayo–Polynesian languages currently recorded in SSWL) also give codes whose code parameters lie below the Gilbert–Varshamov curve.

One can conjecture that any code C constructed out of natural languages belonging to the same historical-linguistic family will be below the asymptotic bound (or perhaps below the GV bound), which would provide a quantitative bound on the possible spread of syntactic parameters within a historical family, given the size of the family. Examples like the simple one constructed above, using languages not belonging to the same historical family show that, to the contrary, across different historical families one encounters a greater variability of syntactic parameters. To our knowledge, no systematic study of parameter variability from this coding theory perspective has been implemented so far.
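The numbers quoted in Example 2 can be rechecked with a short script. The following is only a minimal sketch: the codewords are copied from the example, while the function names are hypothetical and the definitions δ = d/n and R = log_q(#C)/n, as well as the binary forms of the Plotkin condition (δ ≥ 1/2 with R > 0) and of the Gilbert–Varshamov curve R = 1 − H_2(δ), are the standard ones and are assumed here rather than taken from Equations (5) and (8), which are not reproduced on this page.

```python
from itertools import combinations
from math import log, log2

# Codewords of Example 2, copied from the text (25 binary parameters each).
CODEWORDS = {
    "Arabic": "1 1 1 1 1 1 0 1 0 1 0 1 0 1 1 1 1 1 1 0 1 0 0 0 0".split(),
    "Wolof":  "1 1 1 0 0 1 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 1 1 1 1".split(),
    "Basque": "1 1 0 1 0 0 1 0 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 0 0".split(),
}


def hamming_distance(u, v):
    """Number of positions in which two equal-length codewords differ."""
    if len(u) != len(v):
        raise ValueError("codewords must have the same length")
    return sum(a != b for a, b in zip(u, v))


def pairwise_distances(code):
    """Hamming distance for every unordered pair of languages in the code."""
    return {
        (a, b): hamming_distance(code[a], code[b])
        for a, b in combinations(code, 2)
    }


def code_parameters(code, q=2):
    """Return (delta, R), assuming delta = d/n and R = log_q(#C)/n."""
    n = len(next(iter(code.values())))
    d = min(pairwise_distances(code).values())
    return d / n, log(len(code), q) / n


def binary_entropy(x):
    """Binary entropy H_2(x) in bits, with H_2(0) = H_2(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def gv_rate(delta):
    """Assumed binary Gilbert-Varshamov curve: 1 - H_2(delta) below 1/2, else 0."""
    return 1.0 - binary_entropy(delta) if delta < 0.5 else 0.0


def violates_plotkin(delta, rate):
    """Assumed binary Plotkin condition: no positive rate once delta >= 1/2."""
    return delta >= 0.5 and rate > 0


distances = pairwise_distances(CODEWORDS)
delta, rate = code_parameters(CODEWORDS)

for pair, dist in distances.items():
    print(pair, dist)
print(f"delta = {delta:.2f}, R = {rate:.4f}")
print("violates Plotkin bound:", violates_plotkin(delta, rate))
```

Under these assumptions the script reproduces the pairwise distances 16, 13 and 13, giving δ = 13/25 = 0.52 and R = log_2(3)/25 ≈ 0.0634. The same functions could be applied to any other set of languages once their parameter vectors are restricted to a common list of non-entailed parameters.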
Title: Differential Geometrical Theory of Statistics
Authors: Frédéric Barbaresco, Frank Nielsen
Publisher: MDPI
Place: Basel
Date: 2017
Language: English
License: CC BY-NC-ND 4.0
ISBN: 978-3-03842-425-3
Dimensions: 17.0 x 24.4 cm
Pages: 476
Keywords: Entropy, Coding Theory, Maximum entropy, Information geometry, Computational Information Geometry, Hessian Geometry, Divergence Geometry, Information topology, Cohomology, Shape Space, Statistical physics, Thermodynamics
Categories: Natural Sciences, Physics