
Page 94 of Document Image Processing


J. Imaging 2018, 4, 41

Table 1. Various network architectures of the deep convolutional neural network used.

Network   Model Architecture
NA-1      64IN64-64C2-Relu-4P2-500FC-47OU
NA-2      64IN64-64C2-Relu-4P2-1000FC-47OU
NA-3      64IN64-32C2-Relu-4P2-32C2-Relu-4P2-1000FC-47OU
NA-4      64IN64-64C2-Relu-4P2-64C2-Relu-4P2-1000FC-47OU
NA-5      64IN64-32C2-Relu-4P2-32C2-Relu-4P2-32C2-Relu-4P2-1000FC-47OU
NA-6      64IN64-64C2-Relu-4P2-64C2-Relu-4P2-64C2-Relu-4P2-1000FC-47OU

The experiments were all executed on the ParamShavak supercomputer system, which has two multicore CPUs of 12 cores each along with two accelerator cards. The system has 64 GB of RAM and runs the CentOS 6.5 operating system. The deep neural network model was coded in Python using Keras, a high-level neural network API that uses the Theano Python library. The basic pre-processing tasks, such as background elimination, gray-normalization, and image resizing, were done in Matlab.

ISIDCHAR and V2DMDCHAR databases. ISIDCHAR [26] was prepared by researchers of the Indian Statistical Institute, Kolkata. They collected samples from persons of different age groups to capture the maximum variation in written characters. In addition, samples were collected from filled job forms and postcards, which makes this database realistic. The database consists of 36,172 grayscale images of 47 different Devanagari characters. Because the samples come from many writers, the database offers a variety of samples in each class, and the background of the samples is also highly uninformed. V2DMDCHAR [31] was prepared by Vikas J. Dongre and Vijay H. Mankar in 2012. This database has 20,305 samples of handwritten Devanagari characters.

4.1. Experimental Setup

The experiments were performed to investigate the effects of different network architectures, optimizers, and layer-wise training.
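
The shorthand in Table 1 can be unpacked mechanically. The Python sketch below splits each architecture string into layer tokens and counts the convolutional layers. The reading of each token is an assumption, since this page does not define the notation: `64IN64` is taken as a 64 x 64 input, `64C2` as a convolutional layer with 64 feature maps and a size parameter of 2, `4P2` as a pooling layer with two size parameters, `500FC` as a fully connected layer, and `47OU` as the output layer. The helper names `parse_architecture` and `count_conv_layers` are hypothetical.

```python
import re

# Architecture strings copied from Table 1.
ARCHITECTURES = {
    "NA-1": "64IN64-64C2-Relu-4P2-500FC-47OU",
    "NA-2": "64IN64-64C2-Relu-4P2-1000FC-47OU",
    "NA-3": "64IN64-32C2-Relu-4P2-32C2-Relu-4P2-1000FC-47OU",
    "NA-4": "64IN64-64C2-Relu-4P2-64C2-Relu-4P2-1000FC-47OU",
    "NA-5": "64IN64-32C2-Relu-4P2-32C2-Relu-4P2-32C2-Relu-4P2-1000FC-47OU",
    "NA-6": "64IN64-64C2-Relu-4P2-64C2-Relu-4P2-64C2-Relu-4P2-1000FC-47OU",
}

# Token patterns. The layer kind attached to each pattern is an assumed
# reading of the shorthand, not a definition taken from the paper.
TOKEN_PATTERNS = [
    ("input", re.compile(r"^(\d+)IN(\d+)$")),
    ("conv", re.compile(r"^(\d+)C(\d+)$")),
    ("pool", re.compile(r"^(\d+)P(\d+)$")),
    ("dense", re.compile(r"^(\d+)FC$")),
    ("output", re.compile(r"^(\d+)OU$")),
]


def parse_architecture(spec):
    """Split a Table 1 string into (kind, numbers) tuples, in order."""
    layers = []
    for token in spec.split("-"):
        if token.lower() == "relu":
            layers.append(("relu", ()))
            continue
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(token)
            if match:
                layers.append((kind, tuple(int(g) for g in match.groups())))
                break
        else:
            # No pattern matched this token.
            raise ValueError(f"unrecognized token: {token!r}")
    return layers


def count_conv_layers(spec):
    """Count the tokens read as convolutional layers."""
    return sum(1 for kind, _ in parse_architecture(spec) if kind == "conv")


if __name__ == "__main__":
    for name, spec in ARCHITECTURES.items():
        print(name, count_conv_layers(spec))
```

Under this reading, NA-1 and NA-2 have one convolutional layer, NA-3 and NA-4 have two, and NA-5 and NA-6 have three.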
The first phase of experiments was performed to find the best network architecture for the database, and the best-observed network architecture was then tested with six different optimizers to find the best optimizer. A total of 12 (6 + 6) different experiments were performed on the database. The second phase of experiments aimed to observe the effect of layer-wise training. Layer-wise training was performed only with the best network architecture and the best optimizer selected in the first phase.

Each optimizer had its own set of parameters. In our experiments, the optimizer parameters were kept at their default values or as suggested by the author. The rectified linear activation function was used in all experiments to mitigate the vanishing gradient problem. The sum of squares of the differences between target and observed values was calculated to estimate the loss of the deep network. Each network was trained for 100 epochs using mini-batches of size 200.

4.2. Results

The first phase of experiments was performed on ISIDCHAR to find the best deep network architecture. We recorded the recognition accuracy for each network architecture using the Adam optimizer during each of the 50 epochs. The results in terms of the maximum, minimum, mean, and standard deviation values of recognition accuracy are reported in Table 2.

The best recognition accuracy was obtained with the network architecture NA-6, and the lowest recognition accuracy was obtained with the network architecture NA-1. Figure 3 shows the recognition accuracy obtained at each epoch. The network NA-1 produced 85% recognition accuracy because it has only one convolutional layer. The networks NA-3 and NA-5 produced higher recognition accuracies of 91.53% and 93.24%, respectively, because these networks have more convolutional layers. This improvement indicates that adding convolutional layers to the deep convolutional neural network produced better results. In our experiments, we observed the enhancement in the recognition
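
The maximum, minimum, mean, and standard deviation of per-epoch recognition accuracy mentioned in Section 4.2 can be computed as in the minimal sketch below. The accuracy values are made-up placeholders, not results from the paper. The use of the population standard deviation is an assumption, since the page does not say which variant was reported. The name `summarize_accuracy` is hypothetical.

```python
import statistics


def summarize_accuracy(per_epoch_accuracy):
    """Return max, min, mean, and standard deviation of per-epoch accuracy."""
    if not per_epoch_accuracy:
        raise ValueError("need at least one epoch")
    return {
        "max": max(per_epoch_accuracy),
        "min": min(per_epoch_accuracy),
        "mean": statistics.fmean(per_epoch_accuracy),
        # Population standard deviation (assumed; the paper may use the
        # sample standard deviation instead).
        "std": statistics.pstdev(per_epoch_accuracy),
    }


# Hypothetical placeholder accuracies for four epochs, for illustration only.
example_accuracy = [0.80, 0.85, 0.90, 0.95]
summary = summarize_accuracy(example_accuracy)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value:.4f}")
```

In practice, the input list would hold one accuracy value per training epoch for a given architecture, and one such summary would be produced per architecture.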
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science