Document Image Processing
Page 190


Text of page 190

J. Imaging 2018, 4, 32

2. Literature Review

Recently, several approaches have been proposed to detect and recognize text in videos and natural scene images [1,2,15,16]. All of the work mentioned so far is dedicated to Latin or Chinese text detection and recognition. Much of the progress made in this field of research is attributed to the availability of standard datasets. The most popular of these is the dataset of the ICDAR 2003 Robust Reading Competitions (RRC) [17], prepared for scene text localization, character segmentation (removing background pixels) and word recognition. This dataset includes 509 text images in real environments captured with hand-held devices. 258 images from the database are used for training and the remaining 251 images constitute the test set. Some examples are depicted in Figure 2a. This dataset was also used in the ICDAR 2005 Text Locating Competition [18]. Figure 3 shows the evolution of Latin text detection research between 2003 and 2013 [18–20], taking the ICDAR 2003 dataset as a benchmark. As can be observed, the method of Huang et al. [19] outperforms other approaches by a large margin. This method enhances the Stroke Width Transform (SWT) algorithm using color information and introduces Text Covariance Descriptors (TCDs). For the word-recognition task, the best accuracy of 93.1% was achieved by Jaderberg et al. [21] using their proposed Convolutional Neural Network (CNN) model.

The dataset in ICDAR 2011 RRC [22] was inherited from the benchmark used in the previous ICDAR competitions (i.e., 2003 and 2005) but has undergone extension and modification, since some ground truth information was missing and some word bounding boxes were imprecise. The final datasets consisted of 485 full images and 1564 cropped word images for the localization and word-recognition tasks, respectively. On this dataset, the text detection method of Liao et al. [23] obtains state-of-the-art performance with an F-score of 82%. This algorithm is based on a fully convolutional network (FCN) followed by a standard non-maximum suppression process.
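To make the non-maximum suppression step mentioned above concrete, the following is a minimal sketch of the common greedy, IoU-based variant on axis-aligned boxes. The text does not give the details of the procedure used by Liao et al. [23], so the box format (x1, y1, x2, y2), the function names and the 0.5 overlap threshold are illustrative assumptions, not a description of their implementation.

```python
from typing import List, Tuple

# Assumed box format (not from the source): (x1, y1, x2, y2), x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap any already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two overlap heavily, the third is apart.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept = nms(example_boxes, example_scores)
```

In this toy input the second box is suppressed because its overlap with the higher-scoring first box exceeds the threshold, while the distant third box is kept. Detectors for multi-oriented text typically need a polygon or rotated-box overlap instead of the axis-aligned one assumed here.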
Figure 2. Typical samples from the ICDAR 2003 (a), MSRA-TD500 (b), NEOCR (c) and KAIST (d) datasets.

In the 2013 edition of ICDAR RRC [24], a new database was proposed for video text detection, tracking and recognition. It contains 28 short video sequences. An updated version of this dataset was provided in ICDAR 2015 [25], including a training set of 25 videos and a test set of 24 videos.

The MSRA-TD500 dataset [26] targets multi-oriented scene text detection. This dataset includes 500 images (300 for training and 200 for testing) with horizontal and slanted/skewed text in complex natural scenes (see Figure 2b for examples). The method of Liu et al. [27] achieves state-of-the-art performance on this database with an F-score of 75%. This method uses the Maximally Stable Extremal Regions (MSER) technique as a text candidate extractor, followed by a set of heuristic rules and an AdaBoost classifier as a two-stage filtering process.

The Street View Text (SVT) dataset [28] is used for scene text detection, segmentation and recognition in outdoor images. It includes 350 full images with 904 word-level annotated bounding boxes. The method of Shi et al. [29] outperforms existing techniques with a recognition accuracy of 80.8%. This method is based on a Convolutional Recurrent Neural Network (CRNN), which integrates the advantages of both CNNs and Recurrent Neural Networks (RNNs). For the
Document Image Processing
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science