Document Image Processing

Page - 106 - in Document Image Processing

J. Imaging 2018, 4, 43

carving method is evaluated for the text line segmentation task, compared to a recent text line segmentation method for palm leaf manuscripts [27]. For the isolated character/glyph recognition task, the evaluation is reported from the handcrafted feature extraction method, the neural network with unsupervised learning feature to the CNN based method. Finally, the RNN-LSTM based method is used to analyze the word recognition and transliteration task for palm leaf manuscripts.

3.1. Binarization

Binarization is widely applied as the first pre-processing step in document image analysis [34]. It converts gray image values into a binary representation for background and foreground, or, more specifically, text and non-text, which is then fed into further document processing tasks such as text line segmentation and optical character recognition. The performance of binarization techniques directly affects the performance of the recognition task [35]. Non-optimal binarization methods produce unrecognizable characters with noise [16]. Many binarization methods have been reported. These methods have been tested and evaluated on different types of document collections. Based on the choice of the thresholding value, binarization methods can generally be divided into two types, global binarization and local adaptive binarization [16]. Some surveys and comparative studies of the performance of several binarization methods have been reported [35,36]. A binarization method that performs well for one document collection may not necessarily achieve the same performance on another document collection [34]. For this reason, there is always a need to perform a comprehensive evaluation of the existing binarization methods for a new document collection that has different characteristics, for example historical archive documents [36].

In this work, we compared several alternative binarization algorithms for palm leaf manuscripts. We tested and evaluated some well-known standard binarization methods, and some binarization methods that are experimentally promising for historical archive documents, though not specifically for images of palm leaf manuscripts. We also tested the binarization methods from the Document Image Binarization Competition (DIBCO) [37,38], for example Howe's method [39], and the ones from the International Conference on Frontiers in Handwriting Recognition (ICFHR) competition (amadi.univ-lr.fr/ICFHR2016_Contest) [25,40].

3.1.1. Global Thresholding

Global thresholding is the simplest technique and the most conventional approach for binarization [34,41]. A single threshold value is calculated from the global characteristics of the image. This value should be properly chosen based on a heuristic technique or a statistical measurement to be able to give promising binarization results [36]. It is widely known that using a global threshold to process a batch of archive images with different illumination and noise variation is not a proper choice. The variation between images in the foreground and background colors on low-quality document images gives unsatisfactory results. It is difficult to choose one fixed threshold value that is adaptable for all images [36,42].

Otsu's method is a very popular global binarization technique [34,41]. Conceptually, Otsu's method tries to find an optimum global threshold on an image by minimizing the weighted sum of variances of the object and background pixels [34]. Otsu's method is implemented as a standard binarization technique in a built-in Matlab function called graythresh (https://fr.mathworks.com/help/images/ref/graythresh.html) [43].

3.1.2. Local Adaptive Binarization

To overcome the weakness of the global binarization technique, many local adaptive binarization techniques were proposed, for example Niblack's method [34,36,41,42,44], Sauvola's method [34,36,41,42,44,45], Wolf's method [42,44,46], the NICK method [44], and the Rais method [34]. The threshold value in local adaptive binarization technique is calculated in each smaller local image
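As a rough illustration of the Otsu criterion described in Section 3.1.1, the following Python sketch searches all 8-bit thresholds for the one that minimizes the weighted sum of the background and foreground variances. It is not the implementation evaluated in this work, nor Matlab's graythresh; the function names otsu_threshold and binarize are hypothetical, and the sketch assumes an 8-bit grayscale image with dark text on a lighter background.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the 8-bit threshold minimizing the weighted within-class variance.

    Assumption: `gray` is a NumPy array of dtype uint8 (values 0..255).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()          # normalized gray-level histogram
    levels = np.arange(256)

    best_t, best_score = 0, np.inf
    for t in range(1, 256):
        # Class 0: gray levels below t; class 1: gray levels t and above.
        w0 = prob[:t].sum()
        w1 = prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue                  # one class is empty, skip this split
        mean0 = (levels[:t] * prob[:t]).sum() / w0
        mean1 = (levels[t:] * prob[t:]).sum() / w1
        var0 = (((levels[:t] - mean0) ** 2) * prob[:t]).sum() / w0
        var1 = (((levels[t:] - mean1) ** 2) * prob[t:]).sum() / w1
        score = w0 * var0 + w1 * var1  # weighted sum of class variances
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def binarize(gray, threshold):
    """Return a boolean mask; True marks pixels darker than the threshold.

    Assumption: text is darker than the background.
    """
    return gray < threshold
```

A call such as `binarize(img, otsu_threshold(img))` applies one threshold to the whole image, which is exactly why this approach struggles with uneven illumination. Unlike graythresh, which returns a normalized level in [0, 1], this sketch returns an integer gray level.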
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Editor: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Size: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science