Page - 31 - in Document Image Processing
J. Imaging 2018, 4, 80
homogenous slant. It does not require page segmentation into text lines or words. This makes the
proposed technique appropriate for historical documents, especially formal ones, since they ensure
a uniform slant over the entire page. Moreover, the segmentation into text lines would create more
noise. Usually, formal historical documents are written by well-educated people with a standard
writing style and fixed slant. On the other hand, the proposed methodology is inappropriate for
document images containing unconstrained writing and several slant angles. Methodologies, such
as the one described in [6], are more appropriate in such cases. Thus, experiments are performed on
several databases:
• the Trigraph Slant database [18] (the only available database for slant estimation),
• two databases of historical documents (George Washington [19] and Barcelona historical,
handwritten marriages database BH2M [20]),
• a synthetic printed database where slants are fully determined.
The contribution of the current work consists of:
1. To the best of our knowledge, this is the first time that a slant removal technique is proposed,
able to be applied to the entire page, without requiring text line or word segmentation.
2. It does not generate the extra noise that line and/or word segmentation would leave
in the page after slant removal, since slant removal is accomplished by shifting the entire page
uniformly, which preserves text homogeneity. Most existing techniques apply to ideal databases, like
IAM-DB (Figure 1), which is made specifically for line and word segmentation. In the case of
historical documents (Figure 2), the final result would be full of dots and strokes because of
the segmentation.
3. Guidelines are given on how best to apply the technique to a document page, following the detailed results.
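
The uniform page shift mentioned in contribution 2 can be pictured as a horizontal shear of the whole image. The sketch below is an illustration only, not the authors' implementation: the function name `shear_page`, the sign convention (slant measured from the vertical, positive for strokes leaning right) and the nearest-pixel rounding are all assumptions made for this example. It presumes the slant angle has already been estimated by some other means.

```python
import math
import numpy as np

def shear_page(page: np.ndarray, slant_deg: float, background: int = 0) -> np.ndarray:
    """Deslant a whole page by shifting each pixel row horizontally.

    Hypothetical helper for illustration. `page` is a 2-D array (rows, columns);
    `slant_deg` is the assumed uniform slant from the vertical, positive when
    strokes lean to the right.
    """
    height, width = page.shape
    t = math.tan(math.radians(slant_deg))

    # Height of every row above the bottom row of the page.
    rises = (height - 1) - np.arange(height)
    # A right-leaning stroke drifts right by t * rise; undo that drift.
    shifts = -np.rint(t * rises).astype(int)
    # Make all offsets non-negative so rows fit in a widened canvas.
    offsets = shifts - shifts.min()

    out = np.full((height, width + int(offsets.max())), background, dtype=page.dtype)
    for y, off in enumerate(offsets):
        out[y, off:off + width] = page[y]
    return out

# Usage: a single stroke leaning 45 degrees to the right becomes vertical.
page = np.zeros((5, 10), dtype=np.uint8)
for y in range(5):
    page[y, 2 + (4 - y)] = 1
deslanted = shear_page(page, 45.0)
```

Because every row is moved as a unit, no line or word boundaries are needed, which is the property the list above emphasizes. Whole-pixel row shifts are the simplest choice; an interpolating shear would give smoother strokes at the cost of extra complexity.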
In Section 2.1, a short description of the elaborated slant detection algorithm [9] is presented.
The proposed technique is described in detail in Section 2.2, where the examined parameters
are analyzed. The experimental results are presented and analyzed in Section 3, while the
conclusions are discussed in Section 4.
Figure 2. A document from the Barcelona historical, handwritten marriages database (BH2M) [20].
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science