Document Image Processing
Page - 8 -
J. Imaging 2018, 4, 68

and smoother corresponding bleed-through text. As a measure for “quantity of ink” having such properties, we use the concept of optical density, which is related to the intensity as follows:

$$d(i,j) = D(s(i,j)) = -\log\left(\frac{s(i,j)}{b}\right), \tag{5}$$

where $s(i,j)$ is the image intensity at pixel $(i,j)$, and $b$ represents the most frequent (or the average) intensity value of the background area in the image.

Thus, based on the physically-motivated assumptions above, we adopt a linear, non-stationary model in the optical densities to describe the superposition between background, foreground and bleed-through in the two observed recto and verso images:

$$\begin{aligned}
d_r^{obs}(i,j) &= d_r(i,j) + q_v(i,j)\, D\big(h_v(i,j) \otimes s_v(i,j)\big), \\
d_v^{obs}(i,j) &= d_v(i,j) + q_r(i,j)\, D\big(h_r(i,j) \otimes s_r(i,j)\big),
\end{aligned} \tag{6}$$

for each pixel $(i,j)$. In Equation (6), $d^{obs}$ is the observed optical density, and $d$ is the ideal optical density of the free-of-interferences image, with the subscripts $r$ and $v$ indicating the recto and verso side, respectively. $D$ is the operator that, when applied to the intensity, returns the optical density according to Equation (5), and $\otimes$ indicates convolution between the ideal intensity $s$ and a unit volume Point Spread Function (PSF), $h$, describing the smearing of ink penetrating the paper. At present, we assume stationary PSFs, empirically chosen as Gaussian functions, although a more reliable model for the phenomenon of the ink spreading should consider non-stationary operators. Finally, the space-variant quantities $q_r$ and $q_v$ have the physical meaning of attenuation levels of the density (or ink penetration percentage) from one side to the other.

The proposed algorithm locates the bleed-through pixels in each side as those whose optical density is lower than that of the corresponding pixels in the opposite side, i.e., of the foreground that has generated the bleed-through. Thus, on the basis of Equation (6), at each pixel, we first compute the following ratios:

$$\begin{aligned}
q_r(i,j) &= \frac{d_v^{obs}(i,j)}{D\big(h_r(i,j) \otimes s_r^{obs}(i,j)\big) + \epsilon}, \\
q_v(i,j) &= \frac{d_r^{obs}(i,j)}{D\big(h_v(i,j) \otimes s_v^{obs}(i,j)\big) + \epsilon}.
\end{aligned} \tag{7}$$

Since the equations above are intended to identify bleed-through pixels, they are derived from the model in Equation (6) assuming that the ideal optical density $d(i,j)$ is zero on the side at hand. As a consequence of this assumption, the opposite, ideal density should correspond to that of the foreground text, and then coincide with the density of the blurred observed intensity $s^{obs}$. Then, for all pixels, we keep the smaller of the two computed attenuation levels and set the other to zero. This allows for correctly discriminating the two instances of foreground on one side and bleed-through in the other, so that all pixels where $q_r > 0$ are classified as bleed-through in the verso side, whereas those where $q_v > 0$ are classified as bleed-through in the recto side.

However, it is apparent that, with the criterion above, we can obtain wrong positive attenuation levels, on one of the two sides, in correspondence of some background pixels and some occlusion pixels, i.e., where the two foreground texts superimpose on each other. This happens because, in the cases background–background and foreground–foreground, the two densities are almost the same, around zero in the first case and around the maximum density in the other, with small oscillations that make the value of the ratios unpredictable.

To correct this possible overestimation of the bleed-through pixels, we set the attenuation level to zero when the densities $d_r^{obs}$ and $d_v^{obs}$ are both low (or high, respectively) and close to each other. We experimentally verified that this procedure works well in most cases. On the other hand, even if some pixels remain misclassified as bleed-through, the sparse inpainting algorithm that we propose here is able to properly replace them with the original, correct values. As detailed in the next section,
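The detection step built on Equations (5) and (7) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names (`optical_density`, `gaussian_kernel`, `blur`, `attenuation_maps`) and the parameters `sigma`, `radius`, `eps` and `tol` are hypothetical, chosen only for this example. The sketch assumes that the verso image has already been mirrored and registered to the recto, that all intensities are strictly positive, that negative densities are clamped to zero, and that the "both low or both high and close to each other" test can be approximated by a single tolerance on the density difference.

```python
import math

def optical_density(s, b):
    """Equation (5): d = -log(s / b). Clamping at 0 is an assumption of this sketch."""
    return max(0.0, -math.log(s / b))

def gaussian_kernel(sigma, radius):
    """1-D Gaussian weights normalised to unit sum (stationary, unit-volume PSF)."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [x / total for x in w]

def blur(img, kernel):
    """Separable convolution of a 2-D list with a symmetric kernel, replicated borders."""
    r = len(kernel) // 2
    rows, cols = len(img), len(img[0])

    def clamp(x, n):
        return min(max(x, 0), n - 1)

    horiz = [[sum(kernel[k + r] * img[i][clamp(j + k, cols)] for k in range(-r, r + 1))
              for j in range(cols)] for i in range(rows)]
    return [[sum(kernel[k + r] * horiz[clamp(i + k, rows)][j] for k in range(-r, r + 1))
             for j in range(cols)] for i in range(rows)]

def attenuation_maps(s_r, s_v, b_r, b_v, sigma=1.0, radius=2, eps=1e-3, tol=0.1):
    """Return (q_r, q_v): q_r > 0 marks verso bleed-through, q_v > 0 recto bleed-through."""
    kernel = gaussian_kernel(sigma, radius)
    blur_r, blur_v = blur(s_r, kernel), blur(s_v, kernel)
    rows, cols = len(s_r), len(s_r[0])
    q_r = [[0.0] * cols for _ in range(rows)]
    q_v = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            d_r = optical_density(s_r[i][j], b_r)
            d_v = optical_density(s_v[i][j], b_v)
            # Correction step: nearly equal densities (background-background or
            # foreground-foreground) give unreliable ratios, so both stay at zero.
            if abs(d_r - d_v) < tol:
                continue
            # Equation (7): observed density over the density of the blurred opposite side.
            ratio_r = d_v / (optical_density(blur_r[i][j], b_r) + eps)
            ratio_v = d_r / (optical_density(blur_v[i][j], b_v) + eps)
            # Keep the smaller attenuation level; the other one stays at zero.
            if ratio_r < ratio_v:
                q_r[i][j] = ratio_r
            else:
                q_v[i][j] = ratio_v
    return q_r, q_v
```

Under these assumptions, calling `attenuation_maps(s_r, s_v, b_r, b_v)` on two registered grayscale images stored as nested lists yields two maps of the same size. A pixel that is dark ink on the recto and blank on the verso gets a zero numerator in `ratio_r`, so neither map flags it, which matches the intent of keeping only the smaller ratio.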
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Informatik