Page - 107 - in Document Image Processing
J. Imaging 2018, 4, 43
area, region, or window. Niblack’s method proposed a local thresholding computation based on the
local mean and local standard deviation of a rectangular local window for each pixel on the image.
The rectangular sliding local window will cover the neighborhood for each pixel. Using this concept,
Niblack’s method was reported to outperform many thresholding techniques and gave optimal results
for many document collections. However, there is still a drawback to this method. It was found that
Niblack’s method works optimally only on the text region, but is not well suited for large non-text
regions of an image. The absence of text in local areas forces Niblack’s method to detect noise as
text. The suitable window size should be chosen based on the character and stroke size, which may
vary for each image. Many other local adaptive binarization techniques were proposed to improve
the performance of the basic Niblack method. For example, Sauvola’s method is a modified version
of Niblack’s method. Sauvola’s method proposes a local binarization technique to deal with light
texture, large variations, and uneven illumination. The improvement over Niblack’s method is in
the use of adaptive contribution of standard deviation in determining the local threshold on the gray
values of text and non-text pixels. Sauvola’s method processes the image in N×N adjacent and
non-overlapping blocks separately.
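As a minimal sketch of the two thresholds discussed above, the following Python code applies the commonly cited forms of Niblack’s rule, T = m + k·s, and Sauvola’s rule, T = m·(1 + k·(s/R − 1)), where m and s are the local mean and standard deviation. The parameter values (k = −0.2 for Niblack, k = 0.5 and R = 128 for Sauvola), the 3×3 window, the border clipping, and all function names are assumptions of this sketch, not values taken from the text. For simplicity, both rules are evaluated with a per-pixel sliding window.

```python
import math


def local_stats(img, r, c, half):
    """Mean and standard deviation of the window centred on (r, c).

    The window is clipped at the image border (an assumption of this sketch).
    """
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, math.sqrt(var)


def niblack_threshold(mean, std, k=-0.2):
    # Niblack: T = m + k * s (k = -0.2 is an assumed, commonly used value)
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    # Sauvola: T = m * (1 + k * (s / R - 1)); k and R are assumed defaults
    return mean * (1.0 + k * (std / R - 1.0))


def binarize(img, threshold_fn, half=1):
    """Return a 0/1 image: 1 = foreground (dark text), 0 = background."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            m, s = local_stats(img, r, c, half)
            row.append(1 if img[r][c] < threshold_fn(m, s) else 0)
        out.append(row)
    return out
```

On a text-free patch of gray value 200 containing a single pixel of value 198, `binarize(img, niblack_threshold)` marks that pixel as foreground, while `binarize(img, sauvola_threshold)` leaves the patch empty. This is a small instance of the noise sensitivity described above.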
Wolf’s method tried to overcome the problem of Sauvola’s method when the gray values of text
and non-text pixels are close to each other by normalizing the contrast and the mean gray value of the
image to compute the local threshold. However, a sharp change in background gray values across the
image decreases the performance of Wolf’s method. Two other improvements to Niblack’s method
are the NICK method and the Rais method. The NICK method proposes a threshold computation derived
from the basic Niblack method, and the Rais method proposes an optimal window size for the
local binarization.
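The threshold rules of Wolf’s method and the NICK method can be sketched in the same way. The formulas below are the forms commonly stated in the binarization literature, not ones given in this text: Wolf’s threshold is T = (1 − k)·m + k·M + k·(s/R)·(m − M), where M is the minimum gray value of the whole image and R is the maximum local standard deviation over the image, and NICK’s threshold is T = m + k·sqrt((Σp² − m²)/NP), where the p are the NP pixel values in the window. The values of k and the function names are assumptions of this sketch.

```python
import math


def wolf_threshold(mean, std, min_gray, max_std, k=0.5):
    """Wolf-style threshold, normalised by image-wide statistics.

    min_gray: minimum gray value of the whole image (M).
    max_std:  maximum local standard deviation over the image (R).
    k = 0.5 is an assumed default.
    """
    # Guard against a completely flat image, where R would be zero.
    ratio = std / max_std if max_std > 0 else 0.0
    return (1.0 - k) * mean + k * min_gray + k * ratio * (mean - min_gray)


def nick_threshold(values, k=-0.2):
    """NICK-style threshold for one window, given its pixel values.

    k = -0.2 is an assumed value; the literature suggests a range
    of roughly -0.2 to -0.1.
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum(v * v for v in values)
    return mean + k * math.sqrt((sum_sq - mean * mean) / n)
```

Both functions return a gray-level threshold. A pixel darker than the threshold would be treated as text, as in a Niblack-style binarization.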
3.1.3. Training-Based Binarization
The top two proposed methods in the Binarization Challenge for the ICFHR 2016 Competition
on the Analysis of Handwritten Text in Images of Balinese Palm Leaf Manuscripts are training-based
binarization methods [25]. The best method in this competition employs a Fully Convolutional
Network (FCN). It takes a color sub-image as input and outputs the probability that each pixel in the
sub-image is part of the foreground. The FCN is pre-trained on normal handwritten document images
with automatically generated “ground truth” binarizations (using the method of Wolf et al. [46]).
The FCN is then fine-tuned using DIBCO and HDIBCO competition images and their corresponding
ground truth binarizations. Finally, the FCN is fine-tuned again on the provided Balinese palm leaf
images. Consequently, the pixel probabilities of foreground are efficiently predicted for the whole
image at once and thresholded at 0.5 to create a binarized output image.
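The final thresholding step can be illustrated with a few lines of Python. This sketch only covers turning a per-pixel foreground probability map into a binary image; the network itself is not reproduced here. The function name and the choice to count a probability of exactly 0.5 as foreground are assumptions.

```python
def binarize_probabilities(prob_map, threshold=0.5):
    """Turn per-pixel foreground probabilities into a 0/1 image.

    A pixel is marked foreground (1) when its probability is at least
    `threshold`; treating exactly 0.5 as foreground is an assumption.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]
```

For example, `binarize_probabilities([[0.9, 0.2], [0.5, 0.49]])` yields `[[1, 0], [1, 0]]`.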
The second-best method uses two neural network classifiers, C1 and C2, to classify each pixel as
background or not. Two binarized images, B1 and B2, are generated in this step. C1 is a rough classifier
that tries to detect all the foreground pixels, while probably making mistakes for some background
pixels. C2 is an accurate classifier that should not classify a background pixel as a foreground pixel but
probably misses some foreground pixels. In the second step, these two binary images are joined to get the final
classification result.
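The text does not say how B1 and B2 are joined. One plausible way to combine a high-recall image with a high-precision one, shown here purely as an assumption, is to keep every connected component of B1 that contains at least one foreground pixel of B2. The function name, the 8-connectivity, and the rule itself are hypothetical and are not the competition entry’s documented procedure.

```python
from collections import deque


def join_binarizations(b1, b2):
    """Keep each 8-connected component of b1 that is confirmed by b2.

    b1: rough, high-recall 0/1 image; b2: accurate, high-precision 0/1 image.
    Assumed joining rule; the source does not specify one.
    """
    rows, cols = len(b1), len(b1[0])
    out = [[0] * cols for _ in range(rows)]
    # Seeds are pixels that both classifiers mark as foreground.
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if b1[r][c] and b2[r][c])
    for r, c in queue:
        out[r][c] = 1
    # Grow the seeds through b1 with a breadth-first flood fill.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and b1[nr][nc] and not out[nr][nc]):
                    out[nr][nc] = 1
                    queue.append((nr, nc))
    return out
```

Under this rule, isolated B1 blobs with no support in B2 are discarded as likely background mistakes, while strokes that B2 detected only partially are recovered in full from B1.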
3.2. Text Line Segmentation
Text line segmentation is a crucial pre-processing step in most DIA pipelines. The task aims
at extracting and separating text regions into individual lines. Most line segmentation approaches
in the literature require that the input image be binarized. However, due to the degradation and
noise often found in historical documents such as palm leaf manuscripts, the binarization task is
not able to produce good enough results (see Section 5.1). In this paper, we investigate two line
segmentation methods that are independent of the binarization task. These approaches work directly
on color/grayscale images.
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science