J. Imaging 2018, 4, 68
we comment on a set of experimental results, illustrating the performance of the proposed method and
its comparison with state-of-the-art methods. The concluding remarks are given in Section 6.
2. Sparse Image Representation
Recently, sparse representation has emerged as a powerful tool for the efficient representation and
processing of high-dimensional data. In particular, sparsity-based regularization has achieved
great success, offering solutions that outperform classical approaches in various image and signal
processing applications. Among others, we can mention inverse problems such as denoising [35,36],
reconstruction [22,37], classification [38], recognition [39,40], and compression [41,42]. The underlying
assumption of methods based on sparse representation is that signals such as audio and images are
naturally generated by a multivariate linear model, driven by a small number of basis vectors or regressors.
The basis set, called a dictionary, is either fixed and predefined (e.g., Fourier, Wavelet, Cosine)
or adaptively learned from a training set [43]. While the key constraint underlying all these methods
is that the observed signal is sparse, meaning explicitly that it can be adequately represented using
a small set of dictionary atoms, the particularity of those based on adaptive dictionaries is that the
dictionary is also learned, in order to find the one that best describes the observed signal.
Given a data set $Y = [y_1, y_2, \ldots, y_N] \in \mathbb{R}^{n \times N}$, its sparse representation consists of learning
an overcomplete dictionary, $D \in \mathbb{R}^{n \times K}$, $N > K > n$, and a sparse coefficient matrix, $X \in \mathbb{R}^{K \times N}$, with
fewer than $n$ non-zero elements, such that $y_i \approx D x_i$, by solving the optimization problem given as

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \|x_i\|_p \le m,$$

where the $x_i$ are the column vectors of $X$, $m$ is the desired sparsity level, and $\|\cdot\|_p$ is the $p$-norm,
with $0 \le p \le 1$.
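To make the notation concrete, the following is a minimal NumPy sketch that evaluates the objective and checks the sparsity constraint for the case p = 0 (number of non-zero entries per column). The function names, the matrix sizes, and the synthetic data are illustrative assumptions, not part of the original method.

```python
import numpy as np

def objective(Y, D, X):
    """Squared Frobenius reconstruction error ||Y - D X||_F^2."""
    return np.linalg.norm(Y - D @ X, ord="fro") ** 2

def satisfies_sparsity(X, m):
    """True if every column of X has at most m non-zero entries (p = 0 case)."""
    return bool(np.all(np.count_nonzero(X, axis=0) <= m))

# Illustrative sizes only, chosen so that N > K > n.
rng = np.random.default_rng(0)
n, K, N, m = 8, 16, 40, 3

# Random dictionary with unit-norm columns (atoms).
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)

# Coefficient matrix with exactly m non-zero entries per column.
X = np.zeros((K, N))
for i in range(N):
    support = rng.choice(K, size=m, replace=False)
    X[support, i] = rng.standard_normal(m)

# Noiseless data generated by the linear model Y = D X.
Y = D @ X

print(objective(Y, D, X), satisfies_sparsity(X, m))
```

In a learning setting only Y is known; D and X are the unknowns that the optimization problem seeks.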
Most of these methods consist of a two-stage optimization scheme: sparse coding and dictionary
update [43]. In the first stage, the sparsity constraint is used to produce a sparse linear approximation
of the observed data with respect to the current dictionary $D$. Finding the exact sparse approximation
is an NP-hard (non-deterministic polynomial-time hard) problem [44], but using approximate solutions
has proven to be a good compromise. Commonly used sparse approximation algorithms are Matching
Pursuit (MP) [45], Basis Pursuit (BP) [46], Focal Underdetermined System Solver (FOCUSS) [47],
and Orthogonal Matching Pursuit (OMP) [48]. In the second stage, based on the current sparse code,
the dictionary is updated to minimize a cost function. Different cost functions have been used for the
dictionary update; for example, the Frobenius norm with column normalization has been widely used.
Sparse representation methods iterate between the sparse coding stage and the dictionary update stage
until convergence. The performance of these methods strongly depends on the dictionary update
stage, since most of them share a similar sparse coding [43].
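The two-stage scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method proposed in the paper: sparse coding uses a plain greedy OMP, and the dictionary update is assumed to be a least-squares fit of the Frobenius-norm cost followed by column normalization. The function names, the initialization from random data columns, and the fixed iteration count (in place of a convergence test) are all hypothetical choices.

```python
import numpy as np

def omp(D, y, m):
    """Greedy sparse coding: approximate y with at most m atoms of D.

    Assumes the columns of D have unit norm.
    """
    x = np.zeros(D.shape[1])
    support = []
    coef = np.zeros(0)
    residual = y.astype(float)
    for _ in range(m):
        if np.linalg.norm(residual) < 1e-12:
            break
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x

def update_dictionary(Y, X, D_prev):
    """Assumed update: least-squares fit of Y ~ D X, then unit-norm columns."""
    D_new = np.linalg.lstsq(X.T, Y.T, rcond=None)[0].T
    norms = np.linalg.norm(D_new, axis=0)
    unused = norms < 1e-12            # atoms not used by any signal
    D_new[:, unused] = D_prev[:, unused]
    norms[unused] = 1.0
    return D_new / norms

def learn_dictionary(Y, K, m, n_iter=10, seed=0):
    """Alternate sparse coding and dictionary update for n_iter rounds."""
    rng = np.random.default_rng(seed)
    # Initialise with K distinct data columns (assumed non-zero), normalised.
    D = Y[:, rng.choice(Y.shape[1], size=K, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, m) for y in Y.T])   # stage 1
        D = update_dictionary(Y, X, D)                      # stage 2
    # Final sparse coding so that X matches the returned dictionary.
    X = np.column_stack([omp(D, y, m) for y in Y.T])
    return D, X
```

A call such as `D, X = learn_dictionary(Y, K=16, m=3)` returns a dictionary with unit-norm atoms and a coefficient matrix with at most `m` non-zero entries per column. Published algorithms differ mainly in how the update step is carried out, which is consistent with the remark that performance depends strongly on that stage.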
As for the dictionary that leads to a sparse decomposition, although working with pre-defined
dictionaries may be simple and fast, their performance might not be good for every task, due to their
global-adaptivity nature [49]. Instead, learned dictionaries are adaptive to both the signals and the
processing task at hand, thus resulting in far better performance [50].
For a given set of signals $Y$, dictionary learning algorithms generate a representation of signal $y_i$
as a sparse linear combination of the atoms $d_k$ for $k = 1, \ldots, K$,

$$\hat{y}_i = D x_i. \qquad (1)$$
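Written out atom by atom, Equation (1) reads as below. The notation $x_i[k]$ for the $k$-th entry of $x_i$ and $\operatorname{supp}(x_i)$ for the index set of its non-zero entries is introduced here for illustration only, and the bound on the support size assumes the sparsity constraint with $p = 0$.

```latex
\hat{y}_i = D x_i = \sum_{k=1}^{K} x_i[k]\, d_k
          = \sum_{k \in \operatorname{supp}(x_i)} x_i[k]\, d_k ,
\qquad |\operatorname{supp}(x_i)| \le m .
```

Only the atoms indexed by the support contribute, so each signal is described by at most $m$ of the $K$ atoms.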
Dictionary learning algorithms distinguish themselves from traditional model-based methods by
the fact that, in addition to $x_i$, they also train the dictionary $D$ to better fit the data set $Y$. The solution
is generated by iteratively alternating between the sparse coding stage,

$$\hat{x}_i = \arg\min_{x_i} \|y_i - D x_i\|^2 \quad \text{subject to} \quad \|x_i\|_0 \le m \qquad (2)$$
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science