
Page 114 of Document Image Processing


J. Imaging 2018, 4, 43

Negative Rate Metric (NRM): NRM is defined from the negative rate of false negative (NRFN) (Equation (6)) and the negative rate of false positive (NRFP) (Equation (7)):

$$ \mathrm{NRFN} = \frac{FN}{FN + TP} \tag{6} $$

$$ \mathrm{NRFP} = \frac{FP}{FP + TN} \tag{7} $$

TN, defined as true negative, occurs when both the image pixel and ground truth are labeled as background. The definitions of TP, FN, and FP are the same as the ones given for the F-Measure.

$$ \mathrm{NRM} = \frac{\mathrm{NRFN} + \mathrm{NRFP}}{2} \tag{8} $$

A lower NRM indicates a better match.

4.2. Text Line Segmentation

4.2.1. Datasets

The palm leaf manuscript datasets for the text line segmentation task are presented in Table 2. The text line segmentation ground truth data for Balinese and Sundanese manuscripts have been generated by hand based on the binarized ground truth images [17]. For Khmer 1, a semi-automatic scheme is used [26,59]. A set of medial points for each text is generated automatically on the binarization ground truth of the page image. Then those points can be moved up or down with a tool to fit the skew and fluctuation of the real text lines. We also note touching components spreading over multiple lines and the locations where they can be separated. For Khmer 2 and 3, an ID of the line it belongs to is associated with each annotated character. The region of a text line is the union of the areas of the polygon boundaries of all annotated characters composing it [21,27].

Table 2. Palm leaf manuscript datasets for text line segmentation task.

Manuscripts | Pages              | Text Lines     | Dataset
Balinese 1  | 35 pages           | 140 text lines | Extracted from AMADI_LontarSet [17,26,40]
Balinese 2  | Bali-2.1: 47 pages | 181 text lines | Extracted from AMADI_LontarSet [17]
            | Bali-2.2: 49 pages | 182 text lines |
Khmer 1     | 43 pages           | 191 text lines | Extracted from EFEO [20,26,59]
Khmer 2     | 100 pages          | 476 text lines | Extracted from SleukRith Set [21,27]
Khmer 3     | 200 pages          | 971 text lines | Extracted from SleukRith Set [21]
Sundanese 1 | 12 pages           | 46 text lines  | Extracted from Sunda Dataset [26]
Sundanese 2 | 61 pages           | 242 text lines | Extracted from Sunda Dataset [22]

4.2.2. Evaluation Method

Following our previous work [26], we use the evaluation criteria and tool provided by ICDAR2013 Handwriting Segmentation Contest [61]. First, the one-to-one (o2o) match score is computed for a region pair based on the evaluator's acceptance threshold. In our experiments, we used 90% as the acceptance threshold. Let N be the count of ground truth elements, and M the count of result elements. With the o2o score, three metrics are calculated: detection rate (DR), recognition accuracy (RA), and performance metric (FM).

4.3. Isolated Character/Glyph Recognition

4.3.1. Datasets

The palm leaf manuscript datasets for isolated character/glyph recognition task are presented in Table 3. For the Balinese character dataset, Balinese philologists manually annotated the segment
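
As an illustration of the evaluation measures described earlier on this page, the following is a minimal Python sketch of the NRM of Equations (6) to (8) and of the three o2o-based metrics of Section 4.2.2. The function names `nrm` and `segmentation_scores` are hypothetical. The formulas DR = o2o/N, RA = o2o/M and FM = 2·DR·RA/(DR + RA) are not spelled out on this page; they are assumed here from the usual ICDAR 2013 Handwriting Segmentation Contest definitions, and this is not the contest's evaluation tool.

```python
def nrm(tp: int, fp: int, fn: int, tn: int) -> float:
    """Negative Rate Metric from pixel counts, following Equations (6)-(8)."""
    if fn + tp == 0 or fp + tn == 0:
        # Both rates need a non-empty denominator.
        raise ValueError("ground truth needs both text and background pixels")
    nr_fn = fn / (fn + tp)       # Equation (6)
    nr_fp = fp / (fp + tn)       # Equation (7)
    return (nr_fn + nr_fp) / 2   # Equation (8); lower is better


def segmentation_scores(o2o: int, n: int, m: int) -> tuple[float, float, float]:
    """Return (DR, RA, FM) from the one-to-one match count.

    Assumed definitions (not given on this page):
    DR = o2o / N, RA = o2o / M, FM = 2 * DR * RA / (DR + RA),
    where N counts ground truth elements and M counts result elements.
    """
    dr = o2o / n if n else 0.0
    ra = o2o / m if m else 0.0
    fm = 2 * dr * ra / (dr + ra) if dr + ra else 0.0
    return dr, ra, fm


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(nrm(tp=80, fp=10, fn=20, tn=90))
    print(segmentation_scores(o2o=9, n=10, m=12))
```

Here `o2o` is the number of region pairs whose match score reaches the acceptance threshold (90% in the experiments above); computing the match score itself from region overlaps is left out of this sketch.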
Document Image Processing
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Publisher: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science