Page 46 of Document Image Processing

J. Imaging 2018, 4, 57

recognition, retrieval and spotting [10]. Thus, the separation of text and non-text in handwritten documents is comparatively more complex than in printed documents.

Most reported solutions to the problem of text and non-text separation work either at the region level [4] or at the connected component (CC) level [5,6]. Methods that implement text/non-text separation at the region level initially perform region segmentation and then classify each segmented region as either a text or a graphics region. For classifying the segmented regions, researchers have mostly used texture-based features such as the Gray Level Co-occurrence Matrix (GLCM) [4,11], run-length-based features [12,13], or white-tiles-based features [14]. However, region-segmentation-based methods are very sensitive to the segmentation results: poor segmentation can cause a significant degradation in the classification result. On the other hand, as CC-based methods work at the component level, they do not suffer from such a problem. Methods that follow a CC-based approach use shape-based features [5,6]. In general, methods reported in the literature for text/non-text separation in handwritten documents have mostly followed the CC-based approach [7,8]. It is worth mentioning here that, as historical handwritten manuscripts suffer from various quality degradation issues, techniques like binarization and CC extraction become very error prone. Thus, in some recent articles [15–18], researchers have followed a pixel-based approach, which avoids the binarization and CC extraction steps.

From the available research work on this topic, it can be observed that texture features such as GLCM [4,11], run-length-encoding-based features [12,13], and black-and-white transitional-matrix-based features [19] have been commonly used by researchers to solve the text/non-text separation problem for printed documents, as well as to separate handwritten and printed text sections in documents [20].
In a recent work [8], a Rotation Invariant Uniform Local Binary Pattern (RIULBP) operator has also been used successfully to separate the text and non-text components in handwritten class notes. Texture features have proven to be very useful in the field of text/non-text separation due to the fact that text regions and graphics regions in most cases have very different patterns, which can be exploited to differentiate between them. Motivated by this fact, in the present work, we have attempted to evaluate the performance of different Local Binary Pattern (LBP) based texture features to classify the components present in handwritten documents as text or non-text. The key contributions of our paper are as follows:

1. We have given a detailed analysis of how accurately features extracted by different variants of the LBP operator from handwritten document images help in differentiating text components from non-text ones, which is one of the most challenging research areas in the domain of document image processing. For that purpose, we have considered five variants of LBP [21], namely, the basic LBP [22], improved LBP [23], rotation invariant LBP [22], uniform LBP [22], and rotation invariant and uniform LBP [22].
2. The contents of the dataset used here for evaluation have complex text and non-text components as well as variations in terms of scripts, as we have considered both Bangla and English texts. In addition to that, some of the documents have handwritten as well as printed texts.
3. We have also made a minor alteration to robust LBP [24] in order to develop robust and uniform LBP. A method to determine the appropriate threshold value used in this variant of LBP for handwritten documents has also been proposed.

2. Local Binary Patterns and Its Variants

LBP was first introduced by Ojala [25,26] as a computationally simple texture operator in a monochrome texture image. The generalized definition of LBP, given in [22], used $M$ sample points evenly placed on a circle of radius $R$ with its center positioned at $(x_{cen}, y_{cen})$. The position $(x_p, y_p)$ of the neighboring point $p$, where $p \in \{0, 1, \ldots, M-1\}$, is given by

$$(x_p, y_p) = \left(x_{cen} + R\cos\left(\frac{2\pi p}{M}\right),\; y_{cen} - R\sin\left(\frac{2\pi p}{M}\right)\right). \tag{1}$$
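As an illustration of Equation (1), the following minimal Python sketch computes the coordinates of the M sample points around a center pixel. The function name `lbp_sample_points` and its parameter names are hypothetical and are not taken from the paper; the sketch covers only the sampling geometry, not the LBP code itself.

```python
import math


def lbp_sample_points(x_cen, y_cen, radius, num_points):
    """Return the (x_p, y_p) positions of Equation (1) for p = 0..M-1.

    Hypothetical helper for illustration only:
    radius is R and num_points is M in the notation of the text.
    """
    points = []
    for p in range(num_points):
        angle = 2.0 * math.pi * p / num_points
        x_p = x_cen + radius * math.cos(angle)
        # The minus sign follows Equation (1); with image rows growing
        # downward, increasing p moves counter-clockwise on screen.
        y_p = y_cen - radius * math.sin(angle)
        points.append((x_p, y_p))
    return points


if __name__ == "__main__":
    # Example: M = 8 neighbors on a circle of radius R = 1 around (10, 10).
    for p, (x, y) in enumerate(lbp_sample_points(10, 10, 1, 8)):
        print(p, round(x, 3), round(y, 3))
```

Most of the returned positions do not fall exactly on pixel centers (for example, the diagonal points when M = 8 and R = 1), so an implementation would typically estimate the gray values at those positions by interpolation; that step is an assumption here and is not described on this page.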

Title: Document Image Processing
Authors: Ergina Kavallieratou; Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer science