Document Image Processing
Page - 73 -

J. Imaging 2018, 4, 37

there are similarities between the top alignments (least cost alignments) of different pairs of sequences. In [20], the authors explored these similarities by learning a small set of global principal alignments from the given data, which captures all the possible correlations in the data. These global principal alignments are then used to compute the DTW distance for the new test sequences. Since these methods [19,20] avoid the computation of optimal alignments, they are computationally efficient compared to the naive DTW distance. The fast approximate DTW distance can be used for efficient indexing in the DQC classifier. However, it gives sub-optimal results. For best results, it needs query specific global principal alignments. In this paper, we introduce a query specific DTW distance, which enables the direct design of global principal alignments for novel queries. Global principal alignments are computed for a set of frequent classes and seamlessly extended to rare and arbitrary queries, as and when required, without using language specific knowledge. This is a distinct advantage over an OCR engine, which is difficult to adapt to varied fonts and noisy images and would require language specific knowledge to generate possible hypotheses for out-of-vocabulary words. Moreover, an OCR engine can respond to a word image query only by first converting it into text, which is again prone to recognition errors.

In [21,22], deep learning frameworks are used for word spotting. In [23], an attribute-based learning model, PHOC, is presented for word spotting. In the training phase, each word image has to be given with its transcription. Both the word image feature vectors and their transcriptions are used to create the PHOC representation. An SVM is learned for each attribute in this representation. Our approach bears similarity with the PHOC representation based word spotting [23], in the sense that both approaches are designed for handling out-of-vocabulary queries. Our work takes advantage of a granular description at the n-gram (cut-portion) level.
This somewhat resembles the arrangement of characters used in the PHOC encoding. However, the training effort for PHOC is substantial: a large number of classifiers (604 classifiers) are trained, and training requires the complete data, which is huge for large datasets. In our work, the amount of training data is restricted to only the frequent classes, which is much less compared to PHOC. Further, PHOC requires labels in the form of transcriptions, whereas in our work the labels need not be transcriptions. In addition, PHOC is language dependent [24] and is very difficult to apply across different languages. The method proposed in this paper is language independent; it can be applied to any language.

The paper is organized as follows. The next section describes the Direct Query Classifier (DQC). Fast approximation of the DTW distance is discussed in Section 3. The query specific DTW distance is presented in Section 4. Experimental settings and results are discussed in Section 5, followed by concluding remarks in Section 6.

2. Direct Query Classifier (DQC)

In [18], Ranjan et al. proposed the Direct Query Classifier (DQC), which is a one-shot learning scheme for dynamically synthesizing classifiers for novel queries. The main idea is to compute an SVM classifier for the query class using the classifiers obtained from the frequent classes of the database. The number of possible words in a language could be very large, and it would be practically difficult to build a classifier for each of the words. However, all these words come from a small set of n-grams. The words corresponding to the frequent queries are expected to contain the n-grams that cover the full vocabulary. Exemplar SVM classifiers are computed for the frequent queries (word classes) and then appropriately concatenated to create novel classifiers for the rare queries. However, this process has its challenges: (i) variations due to the nature of the script and writing style, and (ii) classifiers for smaller n-grams could be noisy. The authors address these limitations by building the SVM classifiers for the most frequent queries and using classifier synthesis only for rare queries. This improves the overall performance.
They use Query Expansion (QE) for further improving the performance. An overview of the direct query classifier is given in the following sections.
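
The trade-off discussed earlier on this page, between the naive DTW distance and an approximation that reuses a fixed set of alignments, can be illustrated with a minimal Python sketch. It rests on several assumptions that are not from the paper: sequences are one-dimensional and share a common fixed length, the local cost is the absolute difference, and the learned global principal alignments of [20] are replaced by simply reusing the optimal paths of a few training pairs. The function names `dtw`, `alignment_cost`, and `approx_dtw` are hypothetical.

```python
from math import inf


def dtw(a, b):
    """Naive DTW between two 1-D sequences.

    Returns (least cost, optimal alignment) where the alignment is a
    list of (index in a, index in b) pairs.
    """
    n, m = len(a), len(b)
    # cost[i][j] = least cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover one least cost alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def alignment_cost(a, b, path):
    """Cost of aligning a and b along a fixed, precomputed path."""
    return sum(abs(a[i] - b[j]) for i, j in path)


def approx_dtw(a, b, alignments):
    """Approximate DTW: best cost over a small set of stored alignments.

    No dynamic programming is run for the new pair; only the stored
    paths are evaluated (simplified stand-in for principal alignments).
    """
    return min(alignment_cost(a, b, p) for p in alignments)


# Hypothetical training pairs; their optimal paths play the role of
# the stored alignments.
train_pairs = [([0, 1, 2, 3], [0, 0, 1, 3]),
               ([1, 2, 3, 3], [1, 1, 2, 3])]
alignments = [dtw(x, y)[1] for x, y in train_pairs]

query, candidate = [0, 2, 2, 3], [0, 1, 2, 3]
exact, _ = dtw(query, candidate)
approx = approx_dtw(query, candidate, alignments)
```

Under these assumptions every stored path is a valid warping path for the new pair, so `approx` can never fall below `exact`. How close it gets depends on how well the stored alignments suit the query, which is the motivation the text gives for query specific alignments.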
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science