Web-Books
in the Austria-Forum
Austria-Forum
Web-Books
Informatik
Document Image Processing
Page - 74 -
  • User
  • Version
    • full version
    • text only version
  • Language
    • Deutsch - German
    • English

Page - 74 - in Document Image Processing

Image of the Page - 74 -

Image of the Page - 74 - in Document Image Processing

Text of the Page - 74 -

J. Imaging 2018,4, 37 2.1.DPDQC:DesignofDQCUsingDynamicProgramming Given a set of classifiers for frequent classesWw = {w1,w2, . . . ,wN} and a query vectorXq, thequeryclassifierwq isdesignedasapiecewise fusionofparts (n-grams) fromtheavailableclassifiers fromWw. Let p be the number of portions to be selected for computing the query classifierwq. These portions are characterized by the sequence of indices a1, . . . ,ap+1. The classifier synthesis problemis formulatedas thatofpickinguptheoptimalsetofclassifiers{ci}andthesetofsegment indices{ai} such that{ai} formamonotonically increasingsequenceof indices. This involves the followingoptimization: max {ai},{ci} p ∑ i=1 ai+1 ∑ k=ai wkciX k q (1) wherewci correspondstotheweightvectorof thec th i classifierthatwechooseandtheinnersummation applies the indexk in therange (ai,ai+1) touse thekth componentwkci fromtheclassifier ci. The index i in theouter summation refers to the cutportions, and p is the total numberofportionsweneed toconsider. In [12],Malisiewiczetal. proposedthe ideaofexemplarSVEN(ESVM)whereaseparate (SVM) is learnedforeachexample.Almazanetal. [25]useESVMsforretrievingwordimages. ESVMsare inherentlyhighly tunedto its correspondingexample. Givenaquery, it canretrievehighlysimilar wordimages. Thisconstrains therecall,unlessonehas largevariationsof thequerywordavailable. AnotherdemeritofESVMis the largeoverall trainingtimesinceaseparateSVMneeds tobe trained foreachexemplar.Oneapproachtoreducingtrainingtimeis tomakethenegativeexamplemining stepofflineandselectingacommonsetofnegativeexamples [26].Gharbietal. [27]provideanother alternative for fast trainingofexemplar SVM inwhich thehyperplanebetweenasinglepositivepoint and a set of negative points can be seen as finding the tangent to themanifold of images at the positivepoint. Givenaqueryq, thesimilarvectorsinthedatasetareidentifiedbyadoptingtheESVMformulation proposedbyGharbietal. [27]whichyieldsanapproachequivalent toLinearDiscriminantAnalysis. It involves a fast computation of theweight vector by adopting a parametric representation of the negative examples approximated as aGaussianmodel on the complete set of trainingpoints. Thenormal to theGaussianat thequerypointq is computedusingthecovariancematrix toyield the weightvectorwq as follows: wq=Σ−1(μq−μ0) (2) whereΣandμ0 are thecovarianceandmeancomputedover theentiredataset. SinceΣandμ0 are commonforalldata,findingwq requiresfindingthemeanvectorμqof theclass towhichthequeryq belongs to. Letusdefinethesetofclassmeanvectors for the frequentclassesasWμ={μ1, . . . ,μN}. Themeanvectorμq for theclassof thequeryq is computedbymakinguseofappropriatecutportions fromthemeanvectorsof the frequentclasses.Optimizing(1) forvariable lengthcutportionsentails highcomputationalcomplexity. Therefore, insteadofmatchingvariable-lengthn-grams, themethod dividesXq into pnumberoffixedlengthportions. 1. Theclassmeanvectorsof themost frequent1000classesareconcatenated. 2. Now,eachquerycutportionXkq is searchedin theconcatenatedmeanvectorusingsubsequence dynamic timewarping[28] 3. Themostsimilar segment in theconcatenatedmeanvector is takenas thecorrespondingportion of thequeryclassmeanμkq. 4. Theconcatenationof thesequeryclassmeancutportionsμkq synthesizes thequeryclassmean μq=[μ1q, . . . ,μ p q]. SinceDTWiscomputationallyslow,applyingsubsequenceDTW, in thiscase, is computationally expensive. 74
back to the  book Document Image Processing"
Document Image Processing
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarizationl, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Informatik
Web-Books
Library
Privacy
Imprint
Austria-Forum
Austria-Forum
Web-Books
Document Image Processing