J. Imaging 2018, 4, 80
Recently, Brink et al. [3] categorized the proposed techniques into angle-frequency and
repeated-shearing approaches, which are described as follows:
1. Angle-frequency approach: Down-strokes are first located based on such criteria as the minimum
vertical extent or velocity. Next, the angle of the local ink direction is measured at these locations
and the resulting angles are agglomerated in a histogram. From this histogram, the slant angle is
determined. This is a one-step procedure.
2. Repeated-shearing approach: This method is based on the assumption that the projection of
dark pixels is maximized along an axis parallel to the slant angle. The basic principle is to
repeatedly shear images of individual text lines, varying the shear angle, and optimizing the
vertical projection of dark pixels. This approach is clearly more time consuming, but proves more
accurate, as indicated by its popularity.
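The repeated-shearing idea in item 2 can be illustrated with a minimal sketch. This is not the algorithm of [3] or [9]: the source does not say how the vertical projection is optimized, so the sum of squared column counts used here is an assumed criterion. The function names, the ±45° search range, the 1° step, and the synthetic test image are likewise hypothetical. The only dependency is NumPy.

```python
import numpy as np


def shear(ink, angle_deg):
    """Shear a binary ink mask horizontally about its bottom row.

    A positive angle moves upper rows to the left, which straightens
    strokes that lean to the right by that angle (measured from vertical).
    """
    h, w = ink.shape
    t = np.tan(np.radians(angle_deg))
    pad = int(np.ceil(abs(t) * (h - 1)))  # room for the largest row shift
    out = np.zeros((h, w + 2 * pad), dtype=ink.dtype)
    for y in range(h):
        shift = int(round(float(-t * (h - 1 - y))))  # bottom row stays put
        out[y, pad + shift:pad + shift + w] = ink[y]
    return out


def projection_score(ink):
    """Assumed criterion: sum of squared column counts.

    The total ink is constant under shearing, so this value grows
    when the ink piles up in fewer columns.
    """
    cols = ink.sum(axis=0).astype(np.float64)
    return float((cols ** 2).sum())


def detect_slant(ink, max_angle=45, step=1):
    """Try every shear angle in [-max_angle, max_angle]; keep the best."""
    angles = np.arange(-max_angle, max_angle + step, step)
    scores = [projection_score(shear(ink, a)) for a in angles]
    return float(angles[int(np.argmax(scores))])


def make_slanted_strokes(h=40, w=120, slant_deg=20, starts=(20, 50, 80)):
    """Hypothetical test image: straight strokes leaning by slant_deg."""
    ink = np.zeros((h, w), dtype=bool)
    t = np.tan(np.radians(slant_deg))
    for x0 in starts:
        for y in range(h):
            ink[y, x0 + int(round(float(t * (h - 1 - y))))] = True
    return ink


ink = make_slanted_strokes(slant_deg=20)
angle = detect_slant(ink)      # search over all candidate shear angles
deslanted = shear(ink, angle)  # slant removal with the selected angle
```

Each candidate angle costs a full pass over the image, which reflects why the text calls this approach more time consuming than the one-step histogram method. The sketch accepts any binary mask, so the same functions could in principle be applied to a single text line or to a whole page.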
The first category will be referred to here as "slant estimation" (one-step procedure), and the
second category as "slant detection", since this method searches among many angles for the most
common one. Slant estimation techniques are presented in [4–7], whereas a slant detection technique
is presented in [9]. According to Brink et al. [3], the slant detection techniques are the most popular
and give the most precise results. The technique described in [9] is also used in that paper, where extensive
experiments on slant are performed. Last but not least, in those experiments, the pages were
sheared entirely, since the alternative, line or word segmentation, is characterized as "less reliable and
breaks ink traces at region boundaries" [3]. The techniques proposed up to now require line or word
segmentation in order to be applied. In Figure 1, an example of the slant removal algorithm described
in [9] is presented. The image is from the IAM Handwriting Database (IAM-DB), and the application
of the algorithm requires image segmentation into text lines (Figure 1, horizontal stripes). For this
example, text line segmentation could succeed since the text lines are spaced far enough apart. This is not the case
for the document image shown in Figure 2 (17th century), which includes touching ascenders and
descenders and noise in the inter-line space. Since all existing algorithms perform slant removal at the
word or text-line level, a segmentation-free approach is desirable for difficult-to-segment documents.
Moreover, avoiding the text-line segmentation step is computationally less expensive.
Figure 1. An example of a slant removal application resulting from the detection algorithm described
in [9]. Text-line segmentation is required prior to slant estimation.
A preliminary approach has been described in [17], while in this paper the parameter setup is
considered and described in detail. Moreover, the approach is extensively evaluated on new databases.
The proposed technique is appropriate for slant detection and removal from document images with
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science