Page 31 of Document Image Processing
J. Imaging 2018, 4, 80
homogeneous slant. It does not require page segmentation into text lines or words. This makes the
proposed technique appropriate for historical documents, especially formal ones, since they ensure
a uniform slant over the entire page. Moreover, the segmentation into text lines would create more
noise. Usually, formal historical documents are written by well-educated people with a standard
writing style and fixed slant. On the other hand, the proposed methodology is inappropriate for
document images containing unconstrained writing and several slant angles. Methodologies, such
as the one described in [6], are more appropriate in such cases. Thus, experiments are performed on
several databases:
• the Trigraph Slant database [18] (the only available database for slant estimation),
• two databases of historical documents (George Washington [19] and the Barcelona historical,
handwritten marriages database BH2M [20]),
• a synthetic printed database where slants are fully determined.
The contributions of the current work are the following:
1. To the best of our knowledge, this is the first time that a slant removal technique is proposed
that can be applied to the entire page, without requiring text line or word segmentation.
2. It does not generate the extra noise that line and/or word segmentation would leave
in the page after slant removal, since slant removal is accomplished by shifting the entire page
uniformly and ensuring text homogeneity. Most of the existing techniques apply to ideal databases, like
IAM-DB (Figure 1), which is appropriately made for line and word segmentation. In the case of
historical documents (Figure 2), the final result would be full of dots and strokes because of
the segmentation.
3. Guidelines are given for how best to apply the technique to a document page, based on detailed results.
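The idea of removing slant by shifting the entire page uniformly can be illustrated with a horizontal shear. The following is a minimal sketch and not the authors' implementation. It assumes a binary page image stored as a 2-D NumPy array (1 = ink), a single slant angle measured from the vertical (positive = leaning right) that has already been estimated by some detector, and nearest-pixel row shifts. The function name `shear_page` and its parameters are hypothetical.

```python
import math

import numpy as np


def shear_page(page: np.ndarray, slant_deg: float) -> np.ndarray:
    """Cancel a uniform slant by shifting each row of the whole page.

    Assumptions (not from the paper): `page` is a 2-D binary array with
    1 = ink, and `slant_deg` is the page-wide slant from the vertical,
    positive when strokes lean to the right.
    """
    height, width = page.shape
    t = math.tan(math.radians(slant_deg))
    # Horizontal displacement of each row relative to the bottom row.
    offsets = np.rint((height - 1 - np.arange(height)) * t).astype(int)
    pad = int(np.abs(offsets).max())
    # Pad on both sides so no ink is lost when rows are shifted.
    out = np.zeros((height, width + 2 * pad), dtype=page.dtype)
    for y in range(height):
        # A right-leaning stroke sits further right near the top,
        # so the row is shifted back to the left by its offset.
        start = pad - offsets[y]
        out[y, start:start + width] = page[y]
    return out


# Usage: a single stroke leaning right at 45 degrees becomes vertical.
page = np.zeros((5, 8), dtype=np.uint8)
for y in range(5):
    page[y, 1 + (4 - y)] = 1
deslanted = shear_page(page, 45.0)
```

Because every row is shifted by an amount that depends only on its height in the page, no text line or word boundaries are needed, which is the property the contribution list emphasizes.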
In Section 2.1, a short description of the elaborated slant detection algorithm [9] is presented.
The proposed technique is described in detail in Section 2.2, where the parameters that are examined
in detail are analyzed. The experimental results are presented and analyzed in Section 3 while the
conclusions are discussed in Section 4.
Figure 2. A document from the Barcelona historical, handwritten marriages database (BH2M) [20].
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Place: Basel
- Date: 2018
- Language: English
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science