Document Image Processing
Page - 132 -

J. Imaging 2018, 4, 15

Table 1. Description of the partitions of the Rodrigo corpus used in this work.

Partition   Lines  Words                   Sub-Words               Characters
                   Total/Diff./OOV(over.)  Total/Diff./OOV(over.)  Total/Diff./OOV(over.)
Training    9000   98,232/12,650/-         148,070/3045/-          493,126/105/-
Validation  1000   10,899/3016/850         14,907/1074/7           54,936/82/1
Test        5010   55,195/7453/4918(203)   73,660/1418/55(11)      272,132/91/14(1)

3. Handwritten Text Recognition Systems

This section presents our proposal, the feature extraction, the models used by the implemented HTR systems and the evaluation metrics used in the experimentation.

3.1. Proposal

The HTR problem can be formulated as finding the most likely word sequence $\hat{w}$ given a feature vector sequence $x = (x_1, x_2, \ldots, x_{|x|})$ that represents a handwritten text line image [21], that is:

$$\hat{w} = \operatorname*{arg\,max}_{w \in W} \Pr(w \mid x) = \operatorname*{arg\,max}_{w \in W} \frac{\Pr(x \mid w)\,\Pr(w)}{\Pr(x)} = \operatorname*{arg\,max}_{w \in W} \Pr(x \mid w)\,\Pr(w) \tag{1}$$

where $W$ represents the set of all permissible word sequences, $\Pr(x)$ is the probability of observing $x$, $\Pr(w)$ is the probability of the word sequence $w = (w_1, w_2, \ldots, w_{|w|})$ and $\Pr(x \mid w)$ is the probability of observing $x$ by assuming that $w$ is the underlying word sequence for $x$. $\Pr(w)$ is approximated by the Language Model (LM), whereas $\Pr(x \mid w)$ is modeled by the optical model, which trains character models and concatenates them to build optical word or sub-word models.

Written words can be decomposed into small sub-word units such as characters, but they can also be decomposed into larger sub-word units such as graphemic syllables, hyphens or multigrams [15]. We choose here to compare character and hyphen word decompositions. In both cases, words are represented as a sequence of sub-word units $s = (s_1, s_2, \ldots, s_{|s|})$. Then, the HTR problem can be reformulated as finding the most likely sub-word sequence $\hat{s}$ given a feature vector sequence $x$ that represents a handwritten text image. Therefore, Equation (1) becomes:

$$\hat{s} = \operatorname*{arg\,max}_{s \in S} \Pr(x \mid s)\,\Pr(s) \tag{2}$$

where $\Pr(s)$ is approximated by a sub-word LM, whereas $\Pr(x \mid s)$ can be modeled by the same optical model.
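As a small illustration of the decision rule in Equation (2), the following Python sketch scores a handful of candidate sub-word sequences and picks the one with the highest product of optical likelihood and language-model prior, computed in the log domain. The candidate sequences and all probabilities are hypothetical toy values chosen for illustration, not figures from the paper, and an actual system searches a far larger hypothesis space rather than an explicit list.

```python
import math

# Hypothetical toy scores for one text-line image x (not values from the paper).
# "optical" stands for log Pr(x | s), "lm" for log Pr(s).
candidates = {
    ("c", "a", "s", "a"): {"optical": math.log(0.020), "lm": math.log(0.30)},
    ("c", "o", "s", "a"): {"optical": math.log(0.025), "lm": math.log(0.20)},
    ("c", "a", "s", "o"): {"optical": math.log(0.010), "lm": math.log(0.25)},
}


def decode(cands):
    """Return the sequence s maximizing log Pr(x | s) + log Pr(s)."""
    return max(cands, key=lambda s: cands[s]["optical"] + cands[s]["lm"])


def posteriors(cands):
    """Normalize the joint scores over the listed candidates only.

    The normalizer plays the role of Pr(x) restricted to this toy set.
    """
    joint = {s: math.exp(v["optical"] + v["lm"]) for s, v in cands.items()}
    evidence = sum(joint.values())
    return {s: p / evidence for s, p in joint.items()}


best = decode(candidates)
post = posteriors(candidates)
optical_only = max(candidates, key=lambda s: candidates[s]["optical"])

print("best by Pr(x|s)Pr(s):", "".join(best))
print("best by Pr(x|s) alone:", "".join(optical_only))
```

With these toy numbers the optical model alone would prefer a different sequence than the combined score does, which shows how the language-model term changes the decision. Dividing by the normalizer leaves the ranking unchanged, which is why Pr(x) can be dropped in Equation (1).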
It should be noted that RNN-based systems directly provide in their outputs posterior distributions of character labels, at each time step, i.e., $o_{tk}$ for $k = 1, \ldots, L$ and $t = 1, \ldots, T$, $T$ being the length of the observation sequence $x$ and $L$ the alphabet size. From these posteriors, the decoding can be constrained by a lexicon and a language model, in order to find the best output sequence $\hat{s}$. This can be done through Weighted Finite State Transducers (WFST) decoding (see Section 3.5), which can include several types of lexicon and language models (at word, hyphen or character levels).

Working at the sub-word level in HTR relaxes the restrictions imposed by the lexicon, allowing for a faster decoding, and given that the language model describes the relation between sub-word units, some OOV words can be decoded. Therefore, our proposal is to decode the handwritten text line images at the sub-word level and, then, from the obtained decoding output, reconstruct the words to build the final hypothesis.

First of all, the language model of sub-word units is trained using the transcription of the text lines of the training partition after a minimum preprocessing. This preprocessing consists of adding a new symbol (<SPACE>) for the separation between words and then splitting the words into sub-word sequences. In this way, the information of the separation between words is maintained. As an example, the following text line from the training set:
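The paper's own example line falls outside this page. Separately, the preprocessing and the later word reconstruction described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions: words are separated by whitespace, the default sub-word unit is the character, and the function names, the `split_word` hook and the sample line are hypothetical and do not come from the paper. Only the `<SPACE>` symbol is taken from the text.

```python
SPACE = "<SPACE>"  # word-separator symbol named in the text


def to_subwords(line, split_word=list):
    """Turn a transcription line into sub-word units with SPACE between words.

    split_word is a hypothetical hook; the default splits a word into
    characters. A hyphenation-based splitter could be passed in instead.
    """
    units = []
    for i, word in enumerate(line.split()):
        if i > 0:
            units.append(SPACE)
        units.extend(split_word(word))
    return units


def to_words(units):
    """Rebuild words by joining the units found between SPACE symbols."""
    words, current = [], []
    for unit in units:
        if unit == SPACE:
            if current:
                words.append("".join(current))
            current = []
        else:
            current.append(unit)
    if current:
        words.append("".join(current))
    return words


line = "en la casa"  # hypothetical sample, not a line from the corpus
units = to_subwords(line)
print(" ".join(units))
print(to_words(units))
```

Because the separator is kept as an explicit symbol in the sub-word sequence, the round trip from words to units and back loses no word-boundary information. Joining units by plain concatenation assumes that larger units such as hyphens are stored as bare strings without extra boundary markers.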
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science (Informatik)