J. Imaging 2018, 4, 68
for $i = 1, \ldots, N$, where $\|\cdot\|_0$ is the 0-norm, which counts the non-zero elements in $x$, and the dictionary update stage for the $X$ obtained from the sparse coding stage

$$D = \arg\min_{D} \|Y - DX\|_F^2. \tag{3}$$
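For a fixed coefficient matrix $X$, problem (3) is an ordinary least-squares problem in $D$, so all atoms can be updated at once. The following is a minimal NumPy sketch of that idea, not the procedure of any specific cited method. It assumes `Y` has shape (n, N) and `X` has shape (K, N); the function name `update_dictionary_ls` is hypothetical.

```python
import numpy as np

def update_dictionary_ls(Y, X):
    """Solve D = argmin_D ||Y - D X||_F^2 for a fixed X (all atoms at once)."""
    # Transpose so the problem reads X^T D^T ~= Y^T, the form lstsq expects.
    D_t, *_ = np.linalg.lstsq(X.T, Y.T, rcond=None)
    return D_t.T  # shape (n, K)
```

In practice the atoms are often rescaled to unit norm after such an update; that step is omitted here because the text above does not specify it.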
Dictionary learning algorithms are often sensitive to the choice of $m$. The update step can either be sequential (one atom at a time) [51,52], or parallel (all atoms at once) [53,54]. A sequential update, although computationally somewhat more expensive, generally provides better performance than a parallel update, owing to the finer tuning of each dictionary atom. In sequential dictionary learning, the dictionary update minimization problem (3) is split into $K$ sequential minimizations, by optimizing the cost function (3) for each individual atom while keeping the remaining ones fixed. Most of the proposed algorithms have kept the two-stage optimization procedure, the difference appearing mainly in the dictionary update stage, with some exceptions that also differ in the sparse coding stage [43]. In the method proposed in [51], which has become a benchmark in dictionary learning, each column $d_k$ of $D$ and its corresponding row of coefficients $x_k^{\mathrm{row}}$ are updated based on a rank-1 matrix approximation of the error for all the signals when $d_k$ is removed
$$\{d_k, x_k^{\mathrm{row}}\} = \arg\min_{d_k,\, x_k^{\mathrm{row}}} \|Y - DX\|_F^2 = \arg\min_{d_k,\, x_k^{\mathrm{row}}} \|E_k - d_k x_k^{\mathrm{row}}\|_F^2, \tag{4}$$
where $E_k = Y - \sum_{i=1,\, i \neq k}^{K} d_i x_i^{\mathrm{row}}$. The singular value decomposition (SVD) of $E_k = U \Delta V^{\top}$ is used to find the closest rank-1 matrix approximation of $E_k$. The $d_k$ update is taken as the first column of $U$, and the $x_k^{\mathrm{row}}$ update is taken as the first column of $V$ multiplied by the first element of $\Delta$. To avoid the loss of sparsity in $x_k^{\mathrm{row}}$ that would be created by the direct application of the SVD on $E_k$, in [51] it was proposed to modify only the non-zero entries of $x_k^{\mathrm{row}}$ resulting from the sparse coding stage. This is achieved by taking into account only the signals $y_i$ that use the atom $d_k$ in Equation (4), or, equivalently, by taking, instead of the SVD of $E_k$, the SVD of $E_k^R = E_k I_{w_k}$, where $w_k = \{i \mid 1 \le i \le N;\ x_k^{\mathrm{row}}(i) \neq 0\}$, and $I_{w_k}$ is the $N \times |w_k|$ submatrix of the $N \times N$ identity matrix obtained by retaining only those columns whose index numbers are in $w_k$.
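The restricted rank-1 update described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the reference implementation of [51]: `Y` is (n, N), `D` is (n, K), `X` is (K, N), all floating-point arrays, and the function name `update_atom` is hypothetical. An atom that no signal uses is simply left unchanged here, which is a simplification.

```python
import numpy as np

def update_atom(Y, D, X, k):
    """Update atom k of D and the non-zero entries of row k of X in place."""
    # w_k: indices of the signals whose sparse code uses atom k.
    w = np.flatnonzero(X[k, :])
    if w.size == 0:
        return D, X  # unused atom: left as is in this sketch
    # Restricted error E_k^R: residual of those signals with atom k removed.
    E_R = Y[:, w] - D @ X[:, w] + np.outer(D[:, k], X[k, w])
    # Closest rank-1 approximation from the leading singular triplet.
    U, s, Vt = np.linalg.svd(E_R, full_matrices=False)
    D[:, k] = U[:, 0]            # first column of U
    X[k, w] = s[0] * Vt[0, :]    # first column of V times first singular value
    return D, X
```

Because only the columns indexed by $w_k$ enter the SVD, the zero entries of $x_k^{\mathrm{row}}$ stay zero, and the updated atom has unit norm by construction.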
3. Bleed-Through Identification
The algorithm used to recognise the pixels that belong to the bleed-through pattern makes use of both sides of the document, i.e., the recto and the verso images, and compares their intensities pixel by pixel. Hence, it is essential that two corresponding, opposite pixels refer to exactly the same piece of information. In other words, a pixel at location $(i, j)$ on one side, say a bleed-through pixel, must correspond on the opposite side to the foreground pixel that generated it, and vice versa. To ensure this matching, one of the two images needs to be reflected horizontally, and then the two images must be perfectly aligned [55].
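The horizontal reflection mentioned above is a one-line array operation. The sketch below assumes the scan is held as a NumPy array of shape (height, width) or (height, width, channels); the function name `mirror_verso` is hypothetical, and the subsequent alignment step (registration, see [55]) is deliberately not covered.

```python
import numpy as np

def mirror_verso(verso):
    """Reflect the verso image left-to-right so it can be overlaid on the recto."""
    # np.fliplr reverses the column order (axis 1) and leaves rows untouched.
    return np.fliplr(verso)
```

After mirroring, `recto[i, j]` and `mirror_verso(verso)[i, j]` refer to the same physical point only once the two images have also been registered.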
The way in which we perform the comparison between pairs of corresponding pixels is motivated by some considerations about the physical phenomenon. Through experience, we observed that, in the majority of the manuscripts examined, due to paper porosity, the seeped ink has also diffused through the paper fiber. Hence, in general, the bleed-through pattern is a smeared and lighter version of the opposite text that has generated it. Note that this assumption does not mean that, on the same side, bleed-through is lighter than the foreground text. In fact, on each side, the intensity of bleed-through is usually highly variable, i.e., strongly non-stationary, and can sometimes be as dark as the foreground text.
Other considerations can be made by reasoning in terms of "quantity of ink". Indeed, it is apparent that the quantity of ink should be zero in the background, i.e., the unwritten paper, no matter the color of the paper, maximum in the darker and sharper foreground text, and minimum in the lighter
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Place
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science