J. Imaging 2018, 4, 57
recognition, retrieval and spotting [10]. Thus, the separation of text and non-text in handwritten
documents is comparatively more complex than in printed documents.
Mostly, the reported solutions to the problem of text and non-text separation are done either at the
region level [4] or at the connected component (CC) level [5,6]. Methods that implement text/non-text
separation at the region level initially perform region segmentation and then classify each segmented
region as either a text or graphics region. For classifying the segmented regions, researchers have mostly
used texture based features like Gray Level Co-occurrence Matrix (GLCM) [4,11], Run-length based
features [12,13] or white tiles based features [14]. However, region segmentation based methods are
very sensitive to the segmentation results. Poor segmentation can cause a significant degradation in the
classification result. On the other hand, as CC based methods work at the component level, they do not
suffer from such a problem. Methods that follow a CC based approach use shape-based features [5,6].
In general, methods reported in this literature for text/non-text separation in handwritten documents
have mostly followed the CC based approach [7,8]. It is worth mentioning here that, as historical
handwritten manuscripts suffer from various quality degradation issues, techniques like binarization
and CC extraction become very error prone. Thus, in some recent articles [15–18], researchers have
followed a pixel based approach, which avoids the binarization and CC extraction steps.
From the available research work on this topic, it can be observed that texture features
like GLCM (Gray Level Co-occurrence Matrix) [4,11], Run-length encoding based features [12,13], and
Black-and-white transitional matrix based features [19] have been commonly used by researchers to
solve the text/non-text separation problem for printed documents, as well as to separate handwritten
and printed text sections in documents [20]. In a recent work [8], a Rotation Invariant Uniform Local
Binary Pattern (RIULBP) operator has also been used successfully to separate the text and non-text
components in handwritten class-notes. Texture features have proven to be very useful in the field
of text/non-text separation due to the fact that text regions and graphics regions in most cases have
very different patterns, which can be exploited to differentiate between them. Motivated by this fact,
in the present work, we have attempted to evaluate the performance of different Local Binary Pattern
(LBP) based texture features to classify the components present in handwritten documents as text
or non-text.
The key contributions of our paper are as follows:
1. We have given a detailed analysis of how accurately features extracted by different variants of the
LBP operator from handwritten document images help in differentiating text components from
non-text ones, which is one of the most challenging research areas in the domain of document
image processing. For that purpose, we have considered five variants of LBP [21], namely, the
basic LBP [22], improved LBP [23], rotation invariant LBP [22], uniform LBP [22], and rotation
invariant and uniform LBP [22].
2. The contents of the dataset, used here for evaluation, have complex text and non-text components
as well as variations in terms of scripts, as we have considered both Bangla and English texts.
In addition to that, some of the documents have handwritten as well as printed texts.
3. We have also made a minor alteration to robust LBP [24] in order to develop robust and uniform
LBP. A method to determine the appropriate threshold value used in this variant of LBP for
handwritten documents has also been proposed.
2. Local Binary Patterns and Its Variants
LBP was first introduced by Ojala [25,26], as a computationally simple texture operator in a
monochrome texture image.
The generalized definition of LBP, given in [22], used $M$ sample points evenly placed on a circle
of radius $R$ with its center positioned at $(x_{cen}, y_{cen})$. The position $(x_p, y_p)$ of the neighboring point $p$,
where $p \in \{0, 1, \ldots, M-1\}$, is given by
$$(x_p, y_p) = \left(x_{cen} + R\cos(2\pi p/M),\; y_{cen} - R\sin(2\pi p/M)\right). \tag{1}$$
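
As an illustration of Equation (1), the sampling positions can be computed in a few lines of Python. This is only a minimal sketch and not the implementation used in this work; the function name `lbp_sample_points` and its argument names are chosen here for illustration.

```python
import math


def lbp_sample_points(x_cen, y_cen, radius, num_points):
    """Return the M sampling positions of Equation (1) as (x, y) tuples.

    Names are illustrative: radius corresponds to R, num_points to M.
    """
    points = []
    for p in range(num_points):
        angle = 2.0 * math.pi * p / num_points
        x_p = x_cen + radius * math.cos(angle)
        # The minus sign is taken directly from Equation (1).
        y_p = y_cen - radius * math.sin(angle)
        points.append((x_p, y_p))
    return points


if __name__ == "__main__":
    # Example: M = 8 neighbors on a circle of radius R = 1 around (10, 10).
    for x_p, y_p in lbp_sample_points(10.0, 10.0, 1.0, 8):
        print(f"({x_p:.3f}, {y_p:.3f})")
```

For most choices of $M$ and $R$, several of these positions do not coincide with pixel centers, so an implementation would generally need to interpolate the gray value at such points.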
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Place: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science