Document Image Processing
Page - 45 -

Journal of Imaging

Article

Text/Non-Text Separation from Handwritten Document Images Using LBP Based Features: An Empirical Study

Sourav Ghosh 1,*, Dibyadwati Lahiri 1,*, Showmik Bhowmik 1, Ergina Kavallieratou 2 and Ram Sarkar 1

1 Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal 700032, India; showmik.cse@gmail.com (S.B.); raamsarkar@gmail.com (R.S.)
2 Department of Information and Communication Systems Engineering, University of Aegean, Lesbos 81100, Greece; kavallieratou@aegean.gr
* Correspondence: souravghosh2197@gmail.com (S.G.); dibyadwati.lahiri@gmail.com (D.L.)

Received: 15 December 2017; Accepted: 6 April 2018; Published: 12 April 2018

Abstract: Isolating non-text components from the text components present in handwritten document images is an important but less explored research area. Addressing this issue, in this paper, we have presented an empirical study on the applicability of various Local Binary Pattern (LBP) based texture features for this problem. This paper also proposes a minor modification in one of the variants of the LBP operator to achieve better performance in the text/non-text classification problem. The feature descriptors are then evaluated on a database, made up of images from 104 handwritten laboratory copies and class notes of various engineering and science branches, using five well-known classifiers. Classification results reflect the effectiveness of LBP-based feature descriptors in text/non-text separation.

Keywords: text/non-text separation; local binary pattern; handwritten document; document image processing; texture-based features

1. Introduction

Documents, in the modern day, are required to be stored in digitized form to increase their longevity, portability and security. In order to achieve this purpose, the development of a complete Document Image Processing System (DIPS) has become an utmost need. Along with the other steps, any DIPS needs to identify the texts present in a document image separately from the non-text components like tables, diagrams, graphic designs before processing the text through an Optical Character Recognition (OCR) engine [1–3]. The reason for this is very obvious: OCR engines do not process non-text components. Researchers, to date, have reported many solutions to this problem for printed documents [4–6]. However, the same is not true for regular handwritten documents; a rather limited amount of work is available in this area, to the best of our knowledge, among which two significant ones are [7,8]. In document image processing, researchers mostly use OCR technology in order to work on word and/or character level to provide a viable solution for information content exploitation [9].

In general, handwritten documents are unstructured i.e., in most cases, these documents do not follow any specific layout, unlike the printed documents. Thus, the appearance of text and non-text in handwritten documents is very chaotic. For example, text components often overlap with the non-text components. Furthermore, the building blocks (i.e., characters) of the text in handwritten documents do not follow the standard shape and size usually found in its printed counterpart. One of the key difficulties in the graphics recognition domain is also to work on complex and composite symbol

J. Imaging 2018, 4, 57; www.mdpi.com/journal/jimaging
Document Image Processing
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science