J. Imaging 2018, 4, 37
each cut portion is of length $d$. Similarly, divide the query $X_q$ into the same number $p$ of fixed-length portions.
(ii) For each class, compute the global principal alignments for each cut portion separately. These are the cut-specific principal alignments for the class. For the $i$th class and $j$th cut portion, the cut-specific principal alignments are computed from $\{x^j_{i1}, \ldots, x^j_{i|c_i|}\}$ and are denoted $G^j_i$. These alignments are computed for all the cut portions of each class.
(iii) The final step computes the cut-specific principal alignments for the given query $X_q$ as follows. For each cut portion of $X_q$, we compute the DTW distance (Euclidean distance over the cut-specific principal alignments) to the corresponding cut portions of all the class means, using their corresponding cut-specific principal alignments. The distance between the $j$th cut portion of $X_q$, i.e., $X^j_q$, and the $j$th cut portion of the $i$th class mean, i.e., $\mu^j_i$, is denoted as

$$\mathrm{Dis}^j_i = \sum_{\pi \in G^j_i} \mathrm{Euclid}_{\pi}\left(X^j_q, \mu^j_i\right) \tag{4}$$
For each cut portion of $X_q$, we compute the minimum-distance mean cut portion over all the class mean vectors. The corresponding cut-specific principal alignments of the closest matching mean cut portions are taken as the cut-specific principal alignments of the query cut portion. In addition, the corresponding class mean cut portion is taken as the matching cut portion for constructing the query mean. Let the $j$th cut portion of the query have the best match with the $j$th cut portion of the class with index $c$:

$$c = \arg\min_{i} \mathrm{Dis}^j_i \tag{5}$$
Here the minimum distance is computed over all the frequent classes. We thus have

$$G^j_{X_q} \leftarrow G^j_c \quad \text{and} \quad \mu^j_q \leftarrow \mu^j_c \tag{6}$$

Here $G^j_{X_q}$ denotes the cut-specific principal alignments for the $j$th cut portion of $X_q$.
Together, all these query mean cut portions give the query class mean. The query class mean $\mu_q$ is given as $\mu_q = (\mu^1_q, \mu^2_q, \ldots, \mu^p_q)$. This query class mean $\mu_q$ is then used as in Equation (2) to compute the LDA weight $w_q$ (the query classifier weight).
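The selection in Equations (4) to (6) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that each sequence is a NumPy array of shape (length, features), that an alignment $\pi$ is a list of index pairs (a, b), and that $\mathrm{Euclid}_\pi$ sums the Euclidean distances between the paired frames. The function names and data layout are hypothetical.

```python
import numpy as np

def euclid_over_alignment(x, y, path):
    """Sum of Euclidean distances between frames of x and y paired by `path`.

    Assumption: this is how Euclid_pi in Eq. (4) is evaluated.
    """
    return sum(float(np.linalg.norm(x[a] - y[b])) for a, b in path)

def cut_distance(xq_cut, mu_cut, alignments):
    """Eq. (4): accumulate the distance over every alignment in G_i^j."""
    return sum(euclid_over_alignment(xq_cut, mu_cut, path) for path in alignments)

def synthesize_query_mean(query_cuts, class_mean_cuts, class_alignments):
    """Eqs. (5)-(6): per cut portion, copy the closest class's mean and alignments.

    query_cuts[j]          : array (d, f), j-th cut portion of the query
    class_mean_cuts[i][j]  : array (d, f), j-th cut portion of class i's mean
    class_alignments[i][j] : list of alignments, each a list of (a, b) pairs
    """
    query_alignments, query_mean_cuts = [], []
    for j, xq_cut in enumerate(query_cuts):
        dists = [
            cut_distance(xq_cut, class_mean_cuts[i][j], class_alignments[i][j])
            for i in range(len(class_mean_cuts))
        ]
        c = int(np.argmin(dists))                         # Eq. (5)
        query_alignments.append(class_alignments[c][j])   # Eq. (6), alignments
        query_mean_cuts.append(class_mean_cuts[c][j])     # Eq. (6), mean portion
    # Concatenating the selected portions gives the query class mean mu_q.
    return query_alignments, np.concatenate(query_mean_cuts, axis=0)
```

Note that different cut portions of the same query may be matched to different classes, which is what lets a query from a rare class borrow alignments from several frequent classes.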
The query-specific (QS) DTW distance between the query $X_q$ and a sample $X$ from the data is given as

$$\mathrm{dtw}_{qs}(X_q, X) = \sum_{i=1}^{p} \mathrm{dtw}_{G^i_{X_q}}\left(X^i_q, X^i\right) \tag{7}$$

where $p$ is the number of cut portions.
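As a rough sketch of Equation (7), the distance can be accumulated one cut portion at a time. The code below assumes that both sequences are NumPy arrays of shape (p * d, features), that each alignment is a list of index pairs (a, b) local to a cut portion, and that the distance over a set of alignments is the sum of Euclidean distances along each alignment, mirroring Equation (4). These choices and the function names are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def dtw_over_alignments(x, y, alignments):
    """Distance between two cut portions restricted to the given alignments.

    Assumption: distances are summed over all alignments, as in Eq. (4).
    """
    return sum(
        float(np.linalg.norm(x[a] - y[b]))
        for path in alignments
        for a, b in path
    )

def qs_dtw(query, sample, query_alignments, d):
    """Eq. (7): sum the per-cut distances over the p cut portions.

    query_alignments[i] holds the cut-specific alignments of the i-th
    query cut portion; d is the fixed length of a cut portion.
    """
    total = 0.0
    for i, alignments in enumerate(query_alignments):
        xq_cut = query[i * d:(i + 1) * d]   # i-th cut portion of the query
        x_cut = sample[i * d:(i + 1) * d]   # i-th cut portion of the sample
        total += dtw_over_alignments(xq_cut, x_cut, alignments)
    return total
```

Because the alignments are fixed once per query, comparing the query against every sample requires no dynamic programming at search time, only a sum over precomputed index pairs.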
Figure 3 shows all the processing stages of the nearest neighbor DQC. To summarize, we generate query-specific principal alignments on the fly by selecting and concatenating the global principal alignments corresponding to the smaller n-grams (cut portions). Our strategy is to build cut-specific principal alignments for the most frequent classes; these are the word classes that will be queried more frequently. These cut-specific principal alignments are then used to synthesize the query-specific principal alignments (see Figure 4). The results demonstrate that our strategy gives good performance for queries from both the frequent word classes and rare word classes.
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science