Document Image Processing

Page 139 of Document Image Processing

J. Imaging 2018, 4, 15

[Figure 8: plot of Word Error Rate, Character Error Rate and OOV Word Accuracy Rate (0% to 80%) against n-gram size (1 to 6); annotated values: WER = 43.2%, CER = 20.0%, OOV WAR = 9.3%.]

Figure 8. Results obtained by decoding at the HMM sub-word level by using n-gram language models with size n = {1, ..., 6}.

[Figure 9: plot of Word Error Rate, Character Error Rate and OOV Word Accuracy Rate (0% to 90%) against n-gram size (1 to 15); annotated values: WER = 39.8%, CER = 17.6%, OOV WAR = 18.3%.]

Figure 9. Results obtained by decoding at the HMM character level by using n-gram language models with size n = {1, ..., 15}.

Table 2. Overall best results on the Rodrigo test set in terms of WER, CER and OOV WAR for the HMM system.

Measure    Word (3-gram)    Sub-Word (4-gram)    Character (10-gram)
WER        43.9% ± 0.5      43.2% ± 0.5          39.8% ± 0.5
CER        21.2% ± 0.3      20.0% ± 0.3          17.6% ± 0.3
OOV WAR    2.3% ± 0.3       9.3% ± 0.7           18.3% ± 0.9

4.2. Study of the Relation between the Structure of the OOV Words and the Training Words

The character-based approach is able to recognize some OOV words given that the character-based LM learns the structure of the words contained in the training set. In order to verify this hypothesis, we measured the perplexity presented by the best character-based LM (10-gram) for decoding each one of the 4918 OOV words as their corresponding character sequences. Figure 10 presents the obtained perplexity per OOV word separated into two distributions, recognized and unrecognized OOV words. Table 3 summarizes the main features of these distributions. As expected, the recognized OOV words present lower perplexity than the unrecognized OOV words. The overlap of both distributions makes us think that there is still room for improvement given that more OOV words could be recognized.
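To make the per-word perplexity measurement concrete, the following is a minimal sketch of how a word can be scored as a character sequence under a character n-gram LM. It is not the authors' implementation: the function names, the add-one smoothing, and the toy training words are all assumptions made for illustration, and the text does not state which toolkit or smoothing method was actually used.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical boundary symbols


def train_char_ngram(words, n):
    """Count character n-grams and their (n-1)-character contexts."""
    ngrams, contexts, vocab = Counter(), Counter(), {EOS}
    for w in words:
        chars = [BOS] * (n - 1) + list(w) + [EOS]
        vocab.update(w)
        for i in range(n - 1, len(chars)):
            ctx = tuple(chars[i - n + 1:i])
            ngrams[ctx + (chars[i],)] += 1
            contexts[ctx] += 1
    return ngrams, contexts, vocab


def word_perplexity(word, model, n):
    """Perplexity of one word, read as a character sequence ending in EOS."""
    ngrams, contexts, vocab = model
    # Assumption: add-one smoothing, with one extra vocabulary slot
    # reserved for characters never seen in training.
    v = len(vocab) + 1
    chars = [BOS] * (n - 1) + list(word) + [EOS]
    log_prob = 0.0
    for i in range(n - 1, len(chars)):
        ctx = tuple(chars[i - n + 1:i])
        p = (ngrams[ctx + (chars[i],)] + 1) / (contexts[ctx] + v)
        log_prob += math.log(p)
    steps = len(word) + 1  # one prediction per character, plus EOS
    return math.exp(-log_prob / steps)


if __name__ == "__main__":
    # Toy training vocabulary, invented for this example.
    model = train_char_ngram(["casa", "cosa", "cama", "masa"], n=3)
    for w in ["casa", "coma", "zxqk"]:
        print(w, round(word_perplexity(w, model, n=3), 2))
```

In this toy setting, an unseen word that shares character patterns with the training words ("coma") receives a lower perplexity than one that shares none ("zxqk"), which is the effect the hypothesis above relies on. Splitting the OOV words by whether the recognizer got them right and comparing the two sets of perplexity values would give distributions of the kind described for Figure 10.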
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science