Document Image Processing
Page 91
J. Imaging 2018, 4, 41

found by various functions. The sum-of-squares function used to calculate the loss or error can be expressed as

j(w) = \sum_{n=1}^{N} (y_n - \hat{y}_n)^2 + \lambda \sum_{l=1}^{L} W_l^2    (3)

An L2 regularization term \lambda was applied during the computation of the loss to prevent large growth of the parameters during the minimization process.

The entire DCNN involves multiple convolutional, pooling, ReLU, fully connected and Softmax layers. These layers have different specifications when expressed in a particular network. In this paper, we used a special convention to express the DCNN architecture:

• xINy: An input layer, where x represents the width and height of the image and y represents the number of channels.
• xCy: A convolutional layer, where x represents the number of kernels and y represents the kernel size y*y.
• xPy: A pooling layer, where x represents the pooling size x*x and y represents the pooling stride.
• Relu: Represents a rectified linear unit.
• xDrop: A dropout layer, where x represents the dropout probability.
• xFC: A fully connected or dense layer, where x represents the number of neurons.
• xOU: An output layer, where x represents the number of classes or labels.

3.2. Different Adaptive Gradient Methods

Neural network training updates the weights in each iteration, and the final goal of training is to find the weights that give the minimum loss or error. One of the important parameters of a deep neural network is the learning rate, which determines how much the weights change. Selecting a value for the learning rate is a very challenging task: if the learning rate is set too low, the optimization can be very slow and the network will take a long time to reach the minimum loss or error. On the other hand, if the learning rate is set too high, the optimization can diverge and the network will not reach the minimum loss or error. This problem can be addressed by adaptive gradient methods, which help achieve faster training and better convergence. The Adagrad [27] (adaptive gradient) algorithm was introduced by Duchi in 2011. It automatically applies low and high updates for frequently and infrequently occurring features, respectively.
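The regularized sum-of-squares loss in Eq. (3) can be sketched in a few lines of NumPy; the function and variable names below are illustrative, not taken from the paper:

```python
import numpy as np

def sum_of_squares_loss(y, y_hat, weights, lam=0.01):
    """Sum-of-squares loss with L2 regularization, as in Eq. (3):
    j(w) = sum_n (y_n - y_hat_n)^2 + lam * sum_l ||W_l||^2."""
    # Data term: squared error summed over the N training samples.
    data_term = np.sum((y - y_hat) ** 2)
    # Regularization term: lam times the sum of squared weights over all L layers.
    reg_term = lam * sum(np.sum(W ** 2) for W in weights)
    return data_term + reg_term
```

For example, with targets [1, 2], predictions [1, 1], a single weight matrix [[1, 1]] and lam = 0.5, the data term is 1.0 and the regularization term is 1.0, giving a loss of 2.0.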
This method gives an improvement in convergence performance compared to standard stochastic gradient descent for sparse data. It can be expressed as

W_{t+1} = W_t - \frac{\alpha}{\sqrt{\sum_{\tau=1}^{t} Av_\tau^2 + \epsilon}} g_t    (4)

where Av_t is the previous adjustment gradient and \epsilon is used to avoid divide-by-zero problems. The Adagrad method divides the learning rate by the accumulated sum of squared gradients, which produces an ever smaller learning rate. This problem is solved by the Adadelta method [28], which accumulates only a few past gradients instead of the entire gradient history. The equation of the Adadelta method can be expressed as

W_{t+1} = W_t - \frac{\alpha}{\sqrt{E[Av]_t^2 + \epsilon}} g_t    (5)

where E[Av]_t^2 represents the average over past gradients. It depends on the current gradient and the previous average of the gradients. The problem of Adagrad was also addressed by Hinton [29] with a technique called RMSProp, which was designed for stochastic gradient descent. RMSProp is an updated version of Rprop, which did not work with mini-batches. Rprop is equivalent to using the gradient while also dividing by the size of the gradient. RMSProp keeps a moving average of the squared gradient for each weight and, further,
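A minimal NumPy sketch of the Adagrad update in Eq. (4) and the RMSProp moving-average variant described above; the function names, default hyperparameters, and the epsilon placement are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def adagrad_step(w, g, accum, lr=0.01, eps=1e-8):
    """One Adagrad step, Eq. (4): divide the learning rate by the
    root of the accumulated sum of ALL past squared gradients."""
    accum += g ** 2                      # grows monotonically -> lr shrinks over time
    w -= lr * g / (np.sqrt(accum) + eps)  # eps avoids divide-by-zero
    return w, accum

def rmsprop_step(w, g, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp step: an exponential moving average of squared
    gradients replaces Adagrad's unbounded sum, so the effective
    learning rate does not vanish."""
    avg_sq = decay * avg_sq + (1 - decay) * g ** 2
    w -= lr * g / (np.sqrt(avg_sq) + eps)
    return w, avg_sq
```

Starting from a zero accumulator, a single Adagrad step with g = 0.5 divides the update by roughly |g|, so the step size is close to the raw learning rate; after many steps the growing accumulator shrinks it, which is exactly the behavior Adadelta and RMSProp correct.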
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science