Page - 11 - in Document Image Processing

J. Imaging 2018, 4, 68

In this paper, we learned a dictionary D from the training set Y created from the overlapping patches of an image with bleed-through, using the method described in Section 2. For optimization, we used only complete patches from Y, i.e., the patches with no bleed-through pixels, selected from both background areas and foreground text. This choice of ‘clean patches’ speeds up the training process and excludes the ‘non-informative’ bleed-through pixels. After dictionary training, the sparse coefficients in Equation (10) are estimated using the OMP algorithm presented in [48]. The order in which the bleed-through patches are inpainted has a significant impact on the final restored image. Thus, similarly to [20], high priority is given to patches with structure information in the known part. This patch priority scheme enables a smooth transition of structure information from the known part to the unknown (bleed-through) part of the patch.

5. Experimental Results

In this section, we discuss the performance of our method in order to validate its effectiveness. We compared the proposed method with other state-of-the-art methods including [7,16]. For evaluation, we used images from the well-known database of ancient documents presented in [63,64]. This database contains 25 pairs of recto-verso images of ancient manuscripts affected by bleed-through, along with ground truth images. In the ground truth images, the foreground text is manually labeled. For the proposed method, the input images are first processed for bleed-through detection as discussed in Section 3. The dictionary training dataset Y is constructed by selecting the overlapping patches of size 8×8 with no bleed-through pixels from the input image. We learned an overcomplete dictionary D of size 64×256 from Y, with sparsity level m = 5 and α = 0.26. We used a discrete cosine transform (DCT) matrix as the initial dictionary. For each patch to be inpainted, the sparse coefficients are estimated using the learned dictionary and OMP. The sparse coefficients of each patch, denoted by x_j, where j indicates the number of the patch, are then used to estimate the fill-in values for the bleed-through areas.
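The patch-level pipeline described above (clean 8×8 training patches, a 64×256 DCT initial dictionary, and OMP with sparsity 5 fitted on the known pixels only) can be sketched roughly as follows. This is a minimal illustration and not the authors' MATLAB implementation: the function names are hypothetical, the OMP is a plain textbook variant rather than the specific algorithm of [48], the K-SVD update that turns the DCT matrix into the learned dictionary is omitted, and the patch-priority ordering and the parameter α are not modelled.

```python
import numpy as np

def overcomplete_dct_dictionary(patch_size=8, atoms_per_dim=16):
    """Separable overcomplete DCT dictionary, shape (patch_size**2, atoms_per_dim**2)."""
    n = np.arange(patch_size)[:, None]
    k = np.arange(atoms_per_dim)[None, :]
    d1 = np.cos(np.pi * n * k / atoms_per_dim)   # 1-D atoms, shape (8, 16)
    d1[:, 1:] -= d1[:, 1:].mean(axis=0)          # remove the mean of non-constant atoms
    d2 = np.kron(d1, d1)                         # 2-D atoms, shape (64, 256)
    return d2 / np.linalg.norm(d2, axis=0)       # unit-norm columns

def clean_patches(image, bleed_mask, patch_size=8, step=1):
    """Vectorised overlapping patches that contain no bleed-through pixel."""
    rows, cols = image.shape
    patches = []
    for r in range(0, rows - patch_size + 1, step):
        for c in range(0, cols - patch_size + 1, step):
            window = (slice(r, r + patch_size), slice(c, c + patch_size))
            if not bleed_mask[window].any():
                patches.append(image[window].ravel())
    return np.array(patches).T                   # one patch per column

def omp(D, y, sparsity, tol=1e-8):
    """Textbook orthogonal matching pursuit: greedy atom selection plus least squares."""
    residual = y.astype(float)
    support, solution = [], np.zeros(0)
    for _ in range(sparsity):
        if np.linalg.norm(residual) <= tol:
            break
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        solution, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ solution
    x = np.zeros(D.shape[1])
    x[support] = solution
    return x

def inpaint_patch(D, patch_vec, known, sparsity=5):
    """Fit the sparse code on known pixels only, then fill the unknown pixels from D @ x."""
    D_known = D[known]                           # keep only rows of known pixels
    norms = np.linalg.norm(D_known, axis=0)
    norms[norms == 0] = 1.0                      # guard against atoms that vanish on the known part
    x = omp(D_known / norms, patch_vec[known], sparsity) / norms
    restored = patch_vec.astype(float)
    restored[~known] = (D @ x)[~known]
    return restored
```

In this sketch `known` is a boolean vector of length 64 that is `False` at bleed-through pixels; known pixels are left untouched and only the masked pixels receive the reconstruction.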
In terms of computational complexity, the dictionary training step comparatively consumes more time. The K-SVD algorithm requires K singular value decompositions (SVDs), with a computational cost of O(K^4), where K represents the number of atoms. The proposed method is implemented in the MATLAB 2016a platform (The MathWorks, Inc., Natick, MA, USA) on a personal computer with a Core i5-6500 CPU at 3.20 GHz and 8 GB of RAM. It took about 2 min for dictionary learning, and 57 s for inpainting an image of 720×940 pixels.

In bleed-through restoration, the efficacy is generally evaluated qualitatively, as in real cases the original clean image is not available. A visual comparison of the proposed method with other state-of-the-art methods is presented in Figure 3. The reported results for [16] are obtained from the ancient manuscripts database available online (https://www.isos.dias.ie/). In the ground truth images, obtained from [7], foreground text and bleed-through are manually labeled. As can be seen, the proposed method (Figure 3e) produces comparatively better results considering the given ground truth image. It efficiently removes the bleed-through degradation, leaves the foreground text intact, and preserves the original look of the document. The non-parametric method of [16] (Figure 3c), although retaining foreground text and background texture, leaves clearly visible bleed-through imprints in some cases. The recent method presented in [7] (Figure 3d) produces better results, but some strokes of the foreground text are missing.

A bleed-through free colour image, obtained by using the proposed method, is shown in Figure 4. In the case of colour images, we applied the proposed inpainting method only in the luminance (luma) band, and a simple nearest-neighbour based pixel interpolation is used in the other two chrominance bands. The proposed method copes very well with bleed-through removal, and the dictionary-based inpainting preserves the original appearance of the document.
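The handling of the two chrominance bands can be illustrated with a small nearest-neighbour fill. This is only a sketch under assumptions: the text does not say which colour space or which nearest-neighbour variant was used, so the code assumes an H×W×3 array with the luma band in channel 0, copies the value of the closest non-bleed-through pixel (Euclidean distance, brute force), and uses hypothetical function names.

```python
import numpy as np

def nearest_neighbour_fill(channel, bleed_mask):
    """Replace each masked pixel with the value of the closest unmasked pixel."""
    known = np.argwhere(~bleed_mask)             # (row, col) of usable pixels
    filled = channel.copy()
    for r, c in np.argwhere(bleed_mask):
        dist2 = (known[:, 0] - r) ** 2 + (known[:, 1] - c) ** 2
        nr, nc = known[np.argmin(dist2)]         # first closest pixel wins on ties
        filled[r, c] = channel[nr, nc]
    return filled

def restore_chroma(ycc, bleed_mask):
    """Fill only the two chrominance bands; channel 0 (assumed luma) is left untouched."""
    out = ycc.copy()
    for band in (1, 2):
        out[..., band] = nearest_neighbour_fill(ycc[..., band], bleed_mask)
    return out
```

The luma band would be restored separately by the dictionary-based inpainting; the brute-force search is quadratic in the worst case and is meant for clarity, not speed.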
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science (Informatik)