Document Image Processing
Page - 72 - in Document Image Processing

J. Imaging 2018, 4, 37

advantage that it does not require prior learning due to its appearance-based matching. These techniques have been popularly used in document image retrieval.

Konidaris et al. [5] retrieve words from a large collection of printed historical documents. A search keyword typed by the user is converted into a synthetic word image which is used as a query image. Word matching is based on computing the L1 distance metric between the query feature and all the features in the database. Here the features are calculated using the density of the character pixels and the area that is formed from the projections of the upper and lower profile of the word. The ranked results are further improved by relevance feedback. Sankar and Jawahar [7] have suggested a framework of probabilistic reverse annotation for annotating a large collection of images. Word images were segmented from 500 Telugu books. Matching of the word images is done using the DTW approach [11]. Hierarchical agglomerative clustering was used to cluster the word images. Exemplars for the keywords are generated by rendering the word to form a keyword-image. Annotation involved identifying the closest word cluster to each keyword cluster. This involves estimating the probability that each cluster belongs to the keyword. Yalniz and Manmatha [4] have applied word spotting to scanned English and Telugu books. They are able to handle noise in the document text by the use of SIFT features extracted on salient corner points. Rath and Manmatha [11] used projection profile and word profile features in a DTW based matching technique.

Recognition free retrieval was attempted in the past for printed as well as handwritten document collections [4,7,12,13]. Since most of these methods were designed for smaller collections (few handwritten documents as in [12]), computational time was not a major concern. Methods that extended this to a larger collection [14–16] used mostly (approximate) nearest neighbor retrieval. For searching complex objects in large databases, SVMs have emerged as the most popular and accurate solution in the recent past [12].
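The L1-based matching attributed to Konidaris et al. [5] above amounts to exhaustive nearest neighbor ranking over feature vectors. The following is a minimal sketch of that idea only, not the implementation from [5]: the function names, the three-dimensional feature vectors, and the word identifiers are all hypothetical, and the feature extraction (pixel density and profile areas) is assumed to have happened already.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_l1(query: Sequence[float],
               database: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (word_id, distance) pairs sorted from closest to farthest."""
    scored = [(word_id, l1_distance(query, feat))
              for word_id, feat in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy feature vectors (hypothetical values, not taken from [5]).
database = {
    "word_a": [0.20, 0.50, 0.10],
    "word_b": [0.90, 0.10, 0.40],
    "word_c": [0.25, 0.45, 0.15],
}
query = [0.22, 0.48, 0.12]

ranking = rank_by_l1(query, database)
for word_id, dist in ranking:
    print(f"{word_id}: {dist:.2f}")
```

Every query is compared against every database entry, so the cost grows linearly with the collection size. That is why the methods for larger collections mentioned above turn to approximate nearest neighbor retrieval.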
For linear SVMs, both training and testing have become very fast with the introduction of efficient algorithms and excellent implementations [17]. However, there are two fundamental challenges in using a classifier based solution for word retrieval: (i) A classifier needs a good amount of annotated training data (both positive and negative) for training. Obtaining annotated data for every word in every style is practically impossible. (ii) One could train a set of classifiers for a given set of frequent queries. However, they are not applicable to rare queries.

In [18], Ranjan et al. proposed a one-shot classifier learning scheme (direct query classifier). The proposed one-shot learning scheme enables direct design of a classifier for novel queries, without having any access to the annotated training data, i.e., classifiers are trained for a set of frequent queries, and seamlessly extended to rare and arbitrary queries, as and when required. The authors hypothesize that word images, even if degraded, can be matched and retrieved effectively with a classifier based solution. A properly trained classifier can yield an accurate ranked list of words since the classifier looks at the word as a whole, and uses a larger context (say multiple examples) for matching. The results of this method are significant since (i) it does not use any language specific post-processing for improving the accuracy, and (ii) even for a language like English, where OCRs are fairly advanced and engineering solutions were perfected, the classifier based solution is as good as, if not superior to, the best available commercial OCRs.

In the direct query classifier (DQC) scheme [18], the authors used DTW distance for indexing the frequent mean vectors. Since the DTW distance is computationally slow, the authors do not use all the frequent mean vectors for indexing. For comparing two word images, DTW distance typically takes one second [3]. This limits the efficiency of DQC. To overcome this limitation, the authors used Euclidean distance for indexing.
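The trade-off between DTW and Euclidean distance described above can be made concrete with a small sketch. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, applied to one-dimensional toy sequences; it is an assumption for illustration and not the feature representation or DTW variant used in [18]. The function names and the sequences are hypothetical.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW: fills an (n+1) x (m+1) table, so cost is O(n * m)."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] reuses b[j-1]'s predecessor row
                                     cost[i][j - 1],      # b[j-1] reuses a[i-1]'s predecessor column
                                     cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Element-by-element comparison of equal-length vectors, cost O(n)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The same bump shape, shifted by one position (toy data).
a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]

print(dtw_distance(a, b))        # 0.0: warping absorbs the shift
print(euclidean_distance(a, b))  # 2.0: the shift is penalised
```

DTW treats the shifted profile as identical, while Euclidean distance does not. The price is the quadratic table that DTW has to fill for every pair of sequences, whereas Euclidean distance needs a single pass.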
The authors use the top 10 (closest in terms of Euclidean distance) frequent mean vectors for indexing. Since the DTW distance better captures the similarities compared to Euclidean distance for word image retrieval, this restricts the performance of DQC.

For speed-up, DTW distance has been previously approximated [19,20] using different techniques. In [20], the authors proposed a fast approximate DTW distance, in which the DTW distance is approximated as a sum of multiple weighted Euclidean distances. For a given set of sequences,
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science