Page - 5 - in Document Image Processing
J. Imaging 2018, 4, 68
depends on the accurate registration of recto and verso sides of the document, which is a non-trivial
pre-processing step.
For a plausible restoration of documents with bleed-through, in addition to bleed-through
identification, finding a suitable replacement for the affected pixels is also essential. The restored image
generated in most of the above methods is either binary, pseudo-binary (uniform background and
varying foreground intensities), or textured (the bleed-through regions are replaced with an estimate
of the local mean background intensity or with a random pattern). An estimate of the local mean
background is used in [6,18], but such methods work well for manuscripts with a reasonably smooth
background while producing visible artifacts for documents with a highly textured background. In [7],
a random-fill inpainting method is suggested to replace the bleed-through pixels with background
pixels randomly selected from the neighbourhood. However, the random pixel selection produces
salt-and-pepper-like artifacts in regions with large bleed-through. In [6,16], as a preliminary step,
a “clean” background for the entire image is estimated, but this is usually a very laborious task.
In bleed-through removal, the desired restored image is the one where the foreground and background
texture is preserved as much as possible. Instead, most bleed-through removal methods
concentrate on foreground text preservation, neglecting the background texture. In order to enhance
the quality of the restored image, the identification of bleed-through pixels and the estimation of
a tenable replacement for them should be addressed with equal attention.
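The random-fill idea described above can be illustrated with a minimal sketch. It is not the implementation from [7]: the function name `random_fill`, the square window, and the `radius` and `seed` parameters are assumptions made only for this example. Each pixel flagged as bleed-through is replaced by a randomly chosen unflagged pixel from its neighbourhood.

```python
import numpy as np

def random_fill(image, mask, radius=5, seed=0):
    """Replace masked pixels with randomly chosen unmasked neighbours.

    image: 2-D array of gray levels.
    mask:  boolean array of the same shape, True where bleed-through was detected.
    radius, seed: illustrative parameters (assumptions, not taken from [7]).
    """
    rng = np.random.default_rng(seed)
    out = image.copy()
    h, w = image.shape
    for r, c in zip(*np.nonzero(mask)):
        # Clip the square neighbourhood to the image borders.
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        window = image[r0:r1, c0:c1]
        # Candidate replacements: neighbours not flagged as bleed-through.
        candidates = window[~mask[r0:r1, c0:c1]]
        if candidates.size:  # leave the pixel untouched if no clean neighbour exists
            out[r, c] = rng.choice(candidates)
    return out
```

Because every replacement is drawn independently, neighbouring filled pixels are uncorrelated, which is consistent with the salt-and-pepper-like artifacts noted above for large bleed-through regions.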
Image inpainting, which refers to filling in missing or corrupted regions in an image, is a well
studied and challenging topic in computer vision and image processing [19,20]. In image inpainting,
the goal is to find an estimate for those regions in order to reconstruct a visually pleasant and
consistent image [21]. Recently, sparse representation based image inpainting methods have been reported
with excellent results [22,23]. These methods find a sparse linear combination for each image patch
using an overcomplete dictionary, and then estimate the value of missing pixels in the patch. This linear
sparse representation is computed adaptively, by using a learned dictionary and sparse coefficients,
trained on the image at hand. A dictionary learning based method has been used for document
image resolution enhancement [24], denoising [25], and restoration [26]. In addition to sparsity,
non-local self-similarity is another significant property of natural images [27,28]. A number of non-local
regularization terms, exploiting the non-local self-similarity, are employed in solving inverse problems
[29,30]. Fusing image sparsity with non-local self-similarity produces better results in recently reported
image restoration techniques [31,32]. The underlying assumption in such methods is that similar
patches share the same dictionary atoms.
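Non-local self-similarity is usually exploited by grouping each patch with its most similar patches from the surrounding region. The sketch below shows one simple way to do that, assuming a sum-of-squared-differences similarity and an exhaustive search in a square window. The function name `similar_patches` and all parameter values are illustrative assumptions, not details taken from [27-32].

```python
import numpy as np

def similar_patches(image, top_left, patch=8, search=10, k=5):
    """Return the top-left corners of the k patches most similar to a reference.

    Similarity is the sum of squared differences (SSD), evaluated over a window
    of +/- `search` pixels around the reference patch. All parameter values are
    illustrative assumptions.
    """
    h, w = image.shape
    r, c = top_left
    ref = image[r:r + patch, c:c + patch].astype(float)
    scored = []
    for i in range(max(r - search, 0), min(r + search, h - patch) + 1):
        for j in range(max(c - search, 0), min(c + search, w - patch) + 1):
            if (i, j) == (r, c):
                continue  # skip the reference patch itself
            cand = image[i:i + patch, j:j + patch].astype(float)
            scored.append((float(np.sum((ref - cand) ** 2)), (i, j)))
    # Smallest SSD first, i.e. most similar patches first.
    scored.sort(key=lambda item: item[0])
    return [pos for _, pos in scored[:k]]
```

The returned group can then be coded jointly, on the assumption stated above that similar patches share the same dictionary atoms.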
In this paper, we present a two-step method to restore documents affected by bleed-through using
pre-registered recto and verso images. First, the bleed-through pattern is selectively identified on
both sides; then, sparse image inpainting is used to find a suitable fill-in for the bleed-through pixels.
In general, any off-the-shelf bleed-through identification method can be used in the first step. Here,
we adopt the algorithm described in [33], which is simple and very fast. Although efficient in locating the
bleed-through pattern, the method in [33] lacks a proper strategy to replace the unwanted bleed-through
pixels. The simple replacement with the predominant background gray level value causes unpleasant
imprints of the bleed-through pattern, visible in the restored image. An interpolation based inpainting
technique for such imprints is presented in [34], but the filled-in areas are mostly smooth. Here, we use
a sparse image representation based inpainting, with non-local similar patches, to find a befitting fill-in
for the bleed-through pixels. This sparse inpainting step, which constitutes the main contribution of
the paper, enhances the quality of the restored image and preserves well the natural paper texture and
the text stroke appearance. The optimization problem of sparse patch inpainting is formulated using
the non-local similar patches, to account for the neighbourhood consistency, and orthogonal matching
pursuit (OMP) is used to find the sparse approximation.
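To make the role of OMP concrete, the following sketch inpaints a single vectorised patch. It is a simplified illustration under stated assumptions, not the formulation of this paper: it codes one patch in isolation and omits the non-local similar patches, and the names `omp` and `inpaint_patch`, the tolerance `tol`, and the default sparsity level are hypothetical. The sparse code is fitted on the known pixels only, and the missing pixels are read off the reconstruction with the full dictionary.

```python
import numpy as np

def omp(D, y, n_nonzero, tol=1e-10):
    """Greedy orthogonal matching pursuit: y ~ D @ coef with at most n_nonzero atoms."""
    n_atoms = D.shape[1]
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0                 # avoid division by zero for empty atoms
    used = np.zeros(n_atoms, dtype=bool)
    coef = np.zeros(n_atoms)
    residual = y.astype(float)
    for _ in range(n_nonzero):
        # Pick the unused atom most correlated with the current residual.
        corr = np.abs(D.T @ residual) / norms
        corr[used] = 0.0
        k = int(np.argmax(corr))
        if corr[k] <= tol:                  # residual already explained
            break
        used[k] = True
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, used], y, rcond=None)
        coef[:] = 0.0
        coef[used] = sol
        residual = y - D @ coef
    return coef

def inpaint_patch(D, patch, known, n_nonzero=3):
    """Fill the unknown entries of a vectorised patch.

    D:     dictionary with one atom per column (rows index patch pixels).
    patch: 1-D array of pixel values; entries where known is False are ignored.
    known: boolean 1-D array, True for pixels not flagged as bleed-through.
    """
    # Sparse-code using only the rows (pixels) that are known.
    coef = omp(D[known], patch[known], n_nonzero)
    restored = patch.astype(float).copy()
    # Missing pixels come from the reconstruction with the full dictionary.
    restored[~known] = (D @ coef)[~known]
    return restored
```

In this sketch the dictionary `D` is simply an input; a learned dictionary, as discussed earlier, would be supplied by a separate training step.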
The rest of this paper is organized as follows. The next section briefly introduces sparse image
representation and dictionary learning. Section 3 presents the non-blind bleed-through identification
method. The proposed sparse image inpainting technique is described in Section 4. In Section 5,
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science