Page 118 of Document Image Processing
J. Imaging 2018, 4, 43
4.4.2. Evaluation Method
The error rate is defined by edit distances between ground truth and recognizer output and is
computed using the provided OCRopy function ocropus-errs
(https://github.com/tmbdev/ocropy/blob/master/ocropus-errs) [56].
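As an illustration of an edit-distance-based error rate, the following is a minimal sketch in Python. It is not the ocropus-errs implementation: the function names `edit_distance` and `character_error_rate` are hypothetical, and normalizing by the total number of ground-truth characters is an assumption made here for illustration. Any text normalization that the actual tool may apply is omitted.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(pairs):
    """Error rate over (ground_truth, recognizer_output) pairs.

    Assumption: summed edit distances are divided by the total number
    of ground-truth characters.
    """
    pairs = list(pairs)
    errors = sum(edit_distance(gt, out) for gt, out in pairs)
    total = sum(len(gt) for gt, _ in pairs)
    if total == 0:
        raise ValueError("ground truth contains no characters")
    return errors / total
```

For example, `character_error_rate([("abcd", "abxd"), ("ef", "ef")])` counts one substitution over six ground-truth characters.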
5. Experimental Results and Discussion
In this section, the performance of each method for the DIA tasks on palm leaf manuscript
collections is presented.
5.1. Binarization
The experimental results for the binarization task are presented in Table 5. These results show
that the performance of all methods on each dataset is still quite low. Most of the methods achieve
an FM score below 50%. This means that palm leaf manuscripts are still an open challenge for the
binarization task. Different parameter values for the local adaptive binarization methods yield a
significant improvement in performance, but the results remain unsatisfactory. In these experiments,
the ICFHR G1 method was evaluated on the Khmer and Sundanese datasets using the model weights
pre-trained on the Balinese training set. Based on these experiments, Niblack's method gives the highest
FM score for Sundanese manuscripts (Figure 20), the ICFHR G1 method gives the highest FM score for
Khmer manuscripts (Figure 21), and ICFHR G2 gives the highest FM score for Balinese manuscripts
(Figure 22). However, visually, there are still many broken and unrecognizable characters/glyphs, and
noise is detected in the images.
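To make the FM score concrete, here is a minimal sketch of a pixel-level F-measure in Python. It assumes that FM is the usual harmonic mean of precision and recall over foreground (text) pixels, expressed as a percentage. The function name `f_measure` and the nested-list image representation are choices made for this illustration, not taken from the paper.

```python
def f_measure(pred, gt):
    """Pixel-level F-measure (in percent) between two binary images.

    pred, gt: equally sized nested lists where a truthy value marks a
    foreground (text) pixel. Assumption: FM = 2PR / (P + R).
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("images must have the same shape")

    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1      # text pixel correctly detected
            elif p and not g:
                fp += 1      # background wrongly marked as text
            elif g and not p:
                fn += 1      # text pixel missed
    if tp == 0:
        return 0.0           # no correct foreground pixel at all

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)
```

Under this definition, a score below 50% means that the binarized output and the ground truth disagree on a large share of the text pixels, whether through missed strokes or through noise marked as text.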
Figure 20. Binarization of Sundanese manuscript with Niblack's method.
Figure 21. Binarization of Khmer manuscript with ICFHR G1 method.
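For readers unfamiliar with local adaptive binarization, the sketch below illustrates Niblack's thresholding rule, T = m + k·s, where m and s are the mean and standard deviation of the gray levels in a window around each pixel. The window size and k are the kind of parameters whose values affect the results. The defaults used here (`window=15`, `k=-0.2`) are common illustrative choices, not the values used in the experiments, and the deliberately simple, unoptimized function `niblack_binarize` is hypothetical.

```python
from math import sqrt


def niblack_binarize(gray, window=15, k=-0.2):
    """Binarize a grayscale image with Niblack's local threshold.

    gray: nested list of gray levels (dark text on a lighter background).
    Returns a nested list of booleans, True for foreground (text).
    Windows are clipped at the image border (an assumption of this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    height = len(gray)
    width = len(gray[0]) if height else 0
    r = window // 2

    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the gray levels in the local window around (x, y)
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(height, y + r + 1))
                    for i in range(max(0, x - r), min(width, x + r + 1))]
            n = len(vals)
            mean = sum(vals) / n
            std = sqrt(sum((v - mean) ** 2 for v in vals) / n)
            threshold = mean + k * std
            # Pixels darker than the local threshold are treated as text
            row.append(gray[y][x] < threshold)
        out.append(row)
    return out
```

A practical implementation would compute the local statistics with integral images or a library routine rather than this quadratic loop.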
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Location: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science