Page - 29 - in Document Image Processing
Journal of Imaging
Article
Slant Removal Technique for Historical Document Images
Ergina Kavallieratou 1,*, Laurence Likforman-Sulem 2 and Nikos Vasilopoulos 1
1 Department of Information and Communication Systems Engineering, University of the Aegean, Samos 83200, Greece; nvasilopoulos@aegean.gr
2 Institut Mines-Télécom/Télécom ParisTech, Université Paris-Saclay, 75013 Paris, France; laurence.likforman@telecom-paristech.fr
* Correspondence: kavallieratou@aegean.gr
Received: 14 May 2018; Accepted: 5 June 2018; Published: 12 June 2018
Abstract: Slanted text has been demonstrated to be a salient feature of handwriting. Its estimation is a necessary preprocessing task in many document image processing systems in order to improve the required training. This paper describes and evaluates a new technique for removing the slant from historical document pages that avoids the segmentation procedure into text lines and words. The proposed technique first relies on slant angle detection from an accurate selection of fragments. Then, a slant removal technique is applied. However, the presented slant removal technique may be combined with any other slant detection algorithm. Experimental results are provided for four document image databases: two historical document databases, the TrigraphSlant database (the only database dedicated to slant removal), and a printed database in order to check the precision of the proposed technique.
Keywords: slant removal; document image processing; document image page
1. Introduction
In handwriting, slant removal is a necessary component of the text normalization procedure in systems that perform recognition (e.g., optical character recognition (OCR) [1] or word-spotting [2]), in order to improve the training procedure (fewer samples, lower computational cost). Moreover, writer identification/verification systems also use slant estimation and/or detection [3]. After ideal slant removal processing, the text should appear with the vertical strokes parallel to the vertical axis of the page. Due to its importance, many researchers have already developed techniques for slant removal [4–17].
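As an illustration of what slant removal does geometrically, the following minimal sketch applies a horizontal shear to a small binary image so that strokes leaning by a known angle become upright. It is not the technique proposed in this paper; the function name `deslant`, the list-of-rows image representation, and the sign convention (positive angles lean to the right) are assumptions made only for this example.

```python
import math

def deslant(image, angle_deg):
    """Shear a binary image (list of rows of 0/1) so that strokes leaning
    angle_deg degrees to the right of vertical become upright.
    Hypothetical helper for illustration only."""
    height = len(image)
    if height == 0:
        return []
    width = len(image[0])
    t = math.tan(math.radians(angle_deg))
    # A pixel d rows above the bottom row is displaced by about d * tan(angle)
    # columns, so each row is shifted back by that amount.
    shifts = [round((height - 1 - y) * t) for y in range(height)]
    offset = max(shifts)                      # keeps all new columns >= 0
    new_width = width + offset - min(shifts)  # room for the sheared rows
    result = [[0] * new_width for _ in range(height)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                result[y][x - shifts[y] + offset] = 1
    return result

# A stroke leaning 45 degrees to the right collapses into a single column.
slanted = [[0, 0, 1],
           [0, 1, 0],
           [1, 0, 0]]
upright = deslant(slanted, 45)
```

The output image is widened rather than cropped, so no foreground pixel is lost by the shear.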
The available techniques may be divided into three categories:
1. Techniques that estimate the slant by averaging the angles of the near-vertical strokes [4–7].
2. Techniques that analyze projection histograms [8,9] and detect the slant based on a pre-defined criterion (e.g., a parameter maximization or minimization).
3. Techniques that are based on the statistics of chain-coded contours [10–12].
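The second category can be sketched as follows: shear the image by each candidate angle, compute the vertical projection histogram, and keep the angle that optimizes a criterion. The criterion used here (maximizing the sum of squared column counts), the candidate range of -45 to 45 degrees, and the names `column_profile` and `estimate_slant` are assumptions chosen for illustration; the cited works may use different criteria.

```python
import math

def column_profile(image, angle_deg):
    """Vertical projection histogram of a binary image (list of rows of 0/1)
    after shearing it by angle_deg. Returns {column: foreground count}."""
    height = len(image)
    t = math.tan(math.radians(angle_deg))
    counts = {}
    for y, row in enumerate(image):
        shift = round((height - 1 - y) * t)  # displacement of this row
        for x, pixel in enumerate(row):
            if pixel:
                counts[x - shift] = counts.get(x - shift, 0) + 1
    return counts

def estimate_slant(image, candidates=range(-45, 46)):
    """Return the candidate angle (degrees, positive = leaning right) whose
    sheared histogram is most sharply peaked. Ties go to the angle closest
    to zero. The sum-of-squares criterion is an assumption."""
    def key(angle):
        score = sum(c * c for c in column_profile(image, angle).values())
        return (score, -abs(angle))
    return max(candidates, key=key)

# A single stroke leaning 45 degrees to the right, 41 pixels tall.
stroke = [[1 if x == 40 - y else 0 for x in range(41)] for y in range(41)]
angle = estimate_slant(stroke)
```

When the shear matches the true slant, the pixels of each stroke fall into few columns and the squared counts grow, which is why a peaked histogram is used as the score in this sketch.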
Considering the application these techniques can handle, they can be further classified into:
1. Uniform slant estimation and removal techniques [4–12]: they deal with a uniform slant throughout the text.
2. Non-uniform slant correction techniques [13–16]: they handle the characters separately and deal with the existence of several slants simultaneously.
J. Imaging 2018, 4, 80; www.mdpi.com/journal/jimaging
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Editor: MDPI
- Location: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Size: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science