Journal of Imaging

Article

Text/Non-Text Separation from Handwritten Document Images Using LBP Based Features: An Empirical Study

Sourav Ghosh 1,*, Dibyadwati Lahiri 1,*, Showmik Bhowmik 1, Ergina Kavallieratou 2 and Ram Sarkar 1

1 Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal 700032, India; showmik.cse@gmail.com (S.B.); raamsarkar@gmail.com (R.S.)
2 Department of Information and Communication Systems Engineering, University of Aegean, Lesbos 81100, Greece; kavallieratou@aegean.gr
* Correspondence: souravghosh2197@gmail.com (S.G.); dibyadwati.lahiri@gmail.com (D.L.)

Received: 15 December 2017; Accepted: 6 April 2018; Published: 12 April 2018

Abstract: Isolating non-text components from the text components present in handwritten document images is an important but less explored research area. Addressing this issue, in this paper, we have presented an empirical study on the applicability of various Local Binary Pattern (LBP) based texture features for this problem. This paper also proposes a minor modification in one of the variants of the LBP operator to achieve better performance in the text/non-text classification problem. The feature descriptors are then evaluated on a database, made up of images from 104 handwritten laboratory copies and class notes of various engineering and science branches, using five well-known classifiers. Classification results reflect the effectiveness of LBP-based feature descriptors in text/non-text separation.

Keywords: text/non-text separation; local binary pattern; handwritten document; document image processing; texture-based features

1. Introduction

Documents, in the modern day, are required to be stored in digitized form to increase their longevity, portability and security. In order to achieve this purpose, the development of a complete Document Image Processing System (DIPS) has become an utmost need. Along with the other steps, any DIPS needs to identify the texts present in a document image separately from the non-text components like tables, diagrams, graphic designs before processing the text through an Optical Character Recognition (OCR) engine [1–3]. The reason for this is very obvious: OCR engines do not process non-text components. Researchers, to date, have reported many solutions to this problem for printed documents [4–6]. However, the same is not true for regular handwritten documents; a rather limited amount of work is available in this area, to the best of our knowledge, among which two significant ones are [7,8]. In document image processing, researchers mostly use OCR technology in order to work on word and/or character level to provide a viable solution for information content exploitation [9].

In general, handwritten documents are unstructured i.e., in most cases, these documents do not follow any specific layout, unlike the printed documents. Thus, the appearance of text and non-text in handwritten documents is very chaotic. For example, text components often overlap with the non-text components. Furthermore, the building blocks (i.e., characters) of the text in handwritten documents do not follow the standard shape and size usually found in its printed counterpart. One of the key difficulties in the graphics recognition domain is also to work on complex and composite symbol

J. Imaging 2018, 4, 57; www.mdpi.com/journal/jimaging
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science