Page 176 of Document Image Processing
J. Imaging 2017, 3, 62
Figure 3. Ink degradation on an old document. (Left) original image. (Right) degraded image.
3.2. Phantom Character
The phantom character is a typical defect that appears on documents that have been manually
printed (using a wooden or metal character). After many uses, a printing character can become eroded. It is
thus possible that ink reaches the borders of the piece; the borders are then printed on the sheet of paper.
DocCreator provides an algorithm that reproduces such ink apparition around the characters. To be
as realistic as possible, we have manually extracted more than 30 phantom defects from real images.
These defects are then automatically put between characters following a patch-based algorithm.
The degradation algorithm works as follows: (1) the user provides an image and the percentage of
characters to degrade; (2) characters are extracted using a connected component algorithm; (3) a list
of characters is randomly selected; (4) for each selected character: (4.1) a phantom defect is randomly
selected from the manually extracted available defects; (4.2) the phantom defect is resized to fit
the character size; (4.3) to be realistic, the phantom defect is used only as a pattern; the pixels within
the pattern are transformed using a patch algorithm inspired by [48], where a zone from another
part of the document image is selected and copied within the patch.
See Figure 4 for an example.
Figure 4. Phantom character apparition. (Left) original image. (Right) degraded image.
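The steps of the phantom character algorithm can be illustrated with a minimal sketch. It is not DocCreator's implementation: the function names (`connected_components`, `resize_mask`, `add_phantoms`) are hypothetical, and the sketch assumes that the image is an already binarized grid of pixel values (0 for ink, 255 for paper), that each phantom defect is a binary mask, that the defect is placed immediately to the right of the character, and that the copied zone is the bounding box of another randomly chosen character. The patch algorithm of [48] is replaced here by a plain tiled copy.

```python
import random
from collections import deque


def connected_components(img, ink=0):
    """Step 2: bounding boxes (top, left, bottom, right) of 4-connected ink regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != ink or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] and img[ny][nx] == ink:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def resize_mask(mask, new_h, new_w):
    """Step 4.2: nearest-neighbour resize of a binary mask (list of lists of 0/1)."""
    h, w = len(mask), len(mask[0])
    return [[mask[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def add_phantoms(img, defects, percent, rng=None):
    """Return a degraded copy of img; percent is the share of characters to degrade."""
    rng = rng or random.Random()
    width = len(img[0])
    out = [row[:] for row in img]
    boxes = connected_components(img)                      # step 2
    chosen = rng.sample(boxes, round(len(boxes) * percent / 100))  # step 3
    for top, left, bottom, right in chosen:                # step 4
        mask = rng.choice(defects)                         # step 4.1
        box_h, box_w = bottom - top + 1, right - left + 1
        # Assumption: the defect spans the character height and half its width.
        mask = resize_mask(mask, box_h, max(1, box_w // 2))  # step 4.2
        # Step 4.3 (simplified): the mask is only a pattern; its pixels are
        # filled by tiling a zone taken from another part of the image.
        s_top, s_left, s_bottom, s_right = rng.choice(boxes)
        s_h, s_w = s_bottom - s_top + 1, s_right - s_left + 1
        for r, mask_row in enumerate(mask):
            for c, inside in enumerate(mask_row):
                x = right + 1 + c
                if inside and x < width:
                    out[top + r][x] = img[s_top + r % s_h][s_left + c % s_w]
    return out
```

Copying pixels from the document itself, rather than painting the mask in a flat color, is what makes the added ink inherit the texture of the surrounding print. A real implementation would also need to check that the gap between neighboring characters is wide enough for the defect.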
3.3. Paper Holes
Many old or recent document images contain holes. These holes have different shapes, sizes and
locations. DocCreator provides an algorithm that creates different kinds of holes in a document
image. This algorithm simply randomly applies holes extracted from real document images on a given
document image. See Figure 5 for examples.
Figure 5. Cont.
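The hole degradation described in Section 3.3 can be sketched in a few lines. This is an illustration under stated assumptions, not DocCreator's code: the function name `apply_holes` and its parameters are hypothetical, each extracted hole is assumed to be available as a binary mask, positions are drawn uniformly at random, and whatever lies behind the hole is modeled as a single `background` value.

```python
import random


def apply_holes(img, hole_masks, count, background=0, rng=None):
    """Return a copy of img with `count` randomly chosen and placed holes.

    img        -- list of rows of pixel values
    hole_masks -- binary masks (lists of rows of 0/1), assumed to have been
                  extracted from real document images
    background -- assumed value of what shows through the hole
    """
    rng = rng or random.Random()
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for _ in range(count):
        mask = rng.choice(hole_masks)
        mask_h, mask_w = len(mask), len(mask[0])
        # The hole may be clipped by the page border, as with torn edges,
        # but its bounding box always overlaps the image.
        top = rng.randrange(-mask_h + 1, h)
        left = rng.randrange(-mask_w + 1, w)
        for r in range(mask_h):
            for c in range(mask_w):
                y, x = top + r, left + c
                if mask[r][c] and 0 <= y < h and 0 <= x < w:
                    out[y][x] = background
    return out
```

Passing a seeded `random.Random` instance makes a degradation reproducible, which is useful when the same synthetic dataset has to be regenerated.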
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science