Document Image Processing
J. Imaging 2018, 4, 15

The results obtained with the RNN system using character n-gram LMs are presented in Figure 16. As in the character-based HMM experiments, similar results are obtained for n ≥ 6, and the overall best result was obtained with a 10-gram character language model: a WER equal to 37.7% ± 0.5, a CER equal to 14.3% ± 0.3 and an OOV WAR equal to 37.8% ± 1.1.

[Figure 16 plot: Word Error Rate, Character Error Rate and OOV Word Accuracy Rate (0% to 100%) versus n-gram size (1 to 15); annotated best values: WER = 37.7%, CER = 14.3%, OOV WAR = 37.8%.]

Figure 16. Results obtained by the RNN character-based system using n-gram language models.

A summary of the best results obtained in the test experiments for the RNN system is presented in Table 4. As can be observed, the RNN approach generally performs better than the traditional HMM approach. Although the word-based RNN system shows a statistically significant relative deterioration of 19.6% with respect to the HMM system (43.9% ± 0.5) in terms of WER, it achieves a statistically significant relative improvement of 18.9% in terms of CER (21.2% ± 0.3). Moreover, 16.3% of the OOV words, which correspond to words followed by punctuation marks, are correctly recognized.

Table 4. Summary of the best results in terms of WER, CER and OOV WAR for the RNN system.

Measure    Word (2-gram)    Sub-Word (5-gram)    Character (10-gram)
WER        52.5% ± 0.8      38.6% ± 0.5          37.7% ± 0.5
CER        17.2% ± 0.3      17.3% ± 0.3          14.3% ± 0.3
OOV WAR    16.3% ± 0.9      27.4% ± 1.1          37.8% ± 1.1

The use of sub-word units offers better results than using words, yielding significant improvements in terms of WER and CER over the HMM system. In this case, a five-gram LM trained with hyphenated words gave statistically significant improvements at the WER level over a two-gram LM of full words. However, as for the HMM system, the overall best results are obtained with the character-based approach: a WER equal to 37.7% ± 0.5, a CER equal to 14.3% ± 0.3 and an OOV WAR equal to 37.8% ± 1.1.

4.4.2. Results for Deep Models Based on Convolutional Recurrent Neural Networks

Figure 17 presents the recognition results obtained for the word-based CRNN system.
As in the previous word-based systems, the recognized OOV words correspond to words attached to punctuation marks, which were correctly recognized after removing the space between them (see the example presented in Figure A2). The best result, obtained by using a three-gram LM, presents a WER equal to 17.9% ± 0.4, a CER equal to 4.0% ± 0.1 and an OOV WAR equal to 21.5% ± 1.0.

The results obtained using sub-word n-gram LMs are shown in Figure 18. The best result was obtained with a four-gram language model (a WER equal to 14.8% ± 0.3 and a CER equal to 3.4% ± 0.1).
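
The relative changes quoted in Section 4.4.1 for the word-based RNN system can be checked with a short calculation. The following is a minimal sketch, assuming that the values given in parentheses in the text (43.9% WER, 21.2% CER) are the HMM baseline figures and that relative change is computed with respect to that baseline; the function name `relative_change` is introduced here for illustration only.

```python
def relative_change(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline`, in percent.

    For error rates, a positive value is a deterioration and a
    negative value is an improvement.
    """
    return (new - baseline) / baseline * 100.0


# Error rates in percent, as quoted in the text and in Table 4.
hmm_wer, rnn_word_wer = 43.9, 52.5   # assumed HMM baseline vs. word-based RNN
hmm_cer, rnn_word_cer = 21.2, 17.2   # assumed HMM baseline vs. word-based RNN

wer_change = relative_change(hmm_wer, rnn_word_wer)
cer_change = relative_change(hmm_cer, rnn_word_cer)

print(f"WER relative change: {wer_change:+.1f}%")  # deterioration
print(f"CER relative change: {cer_change:+.1f}%")  # improvement
```

Under these assumptions, the computed values agree with the 19.6% relative deterioration in WER and the 18.9% relative improvement in CER reported above.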
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science