Document Image Processing
Page 90

J. Imaging 2018, 4, 41

kernel values for the model. Alternating convolutional and max-pooling layers do this job well. The other part of a DCNN is the fully connected layers. Each of them contains multiple neurons, like a layer of a simple neural network, takes the high-level features from the previous convolutional-pooling layer and computes the weights needed to classify the object properly.

Figure 1. The schematic diagram of the deep convolutional neural network (DCNN) architecture. [Labels recoverable from the figure: Optimizer, Rmsprop, Adam, Max.]

3.1. DCNN Notation

The deep convolutional neural network is a neural network specially designed for image processing. Most color images are represented in three dimensions, $h \times w \times c$, where $h$ is the height, $w$ is the width and $c$ is the number of channels of the image. However, the DCNN can only take an image whose height and width are equal. Before an image is fed to the DCNN, a normalization step therefore converts it from size $h \times w \times c$ to size $m \times m \times c$, where $m$ is both the height and the width of the image.

The DCNN takes the three-dimensional normalized image (matrix) $X$ directly as input and passes it to a convolutional layer that has $k$ kernels of size $n \times n \times p$, where $n < m$ and $p \le c$. The convolutional layer multiplies the neighbors of each element of $X$ by the weights provided by the kernel and generates $k$ different feature maps of size $l \times l$, where $l = m - n + 1$. The convolutional layer is often followed by an activation function. The rectified linear unit (ReLU) was selected as the activation function $f$:

$$Y_{kl} = f\left(\sum_{i=1}^{n} X_i * W_{kil} + B_{kl}\right) \tag{1}$$

where $k$ denotes the feature map layer, $Y$ is a map of size $l \times l$, $W_{kil}$ is a kernel weight of size $n \times n$, $B_{kl}$ represents the bias value and $*$ represents the 2D convolution.

The following pooling layer reduces the feature maps by applying a mean, max or min operation over a $p_l \times p_l$ local region of the feature map, where $p_l$ generally varies from 2 to 5. DCNNs have multiple consecutive convolutional layers, each followed by a pooling layer, and each convolutional layer introduces many unknown weights.
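The convolution, activation and pooling steps described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes a single-channel input, stride 1, no padding and non-overlapping pooling windows, and it slides the kernel without flipping it, which is what most deep learning libraries compute under the name convolution. The function names, the 5 x 5 input and the 2 x 2 kernel are hypothetical and chosen only for illustration.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide an n x n kernel over an m x m image (stride 1, no padding)."""
    m, n = len(image), len(kernel)
    out = m - n + 1  # side length l of the resulting feature map
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(n) for j in range(n)) + bias
            for c in range(out)
        ]
        for r in range(out)
    ]


def relu(feature_map):
    """Rectified linear unit applied element-wise."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, p):
    """Non-overlapping p x p max pooling; rows and columns that do not
    fill a complete window are dropped."""
    size = len(feature_map) // p
    return [
        [
            max(feature_map[r * p + i][c * p + j]
                for i in range(p) for j in range(p))
            for c in range(size)
        ]
        for r in range(size)
    ]


# Hypothetical 5 x 5 single-channel input and 2 x 2 kernel.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 1, 0],
    [3, 0, 1, 2, 1],
    [1, 2, 1, 0, 2],
    [0, 1, 3, 1, 1],
]
kernel = [[1, 0], [0, -1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4 x 4, since l = 5 - 2 + 1
pooled = max_pool(feature_map, 2)                # 2 x 2 after 2 x 2 pooling
```

With m = 5 and n = 2 the feature map has side length l = 4, and 2 x 2 pooling halves it again, which shows how each convolution-pooling pair shrinks the spatial size.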
The back-propagation algorithm, one of the well-known techniques used in simple neural networks to find weights automatically, is used to find the unknown weights during the training phase. Back-propagation updates the weights to minimize a loss or error $j(W)$ with an iterative gradient descent process that can be expressed as

$$W_{t+1} = W_t - \alpha \nabla E\,|j(W_t)| + \mu \nu_t \tag{2}$$

By updating the weights, the back-propagation algorithm follows a direction in which the cost function approaches its minimum loss or error. The value $\alpha$, called the learning rate, determines the step size, that is, how much the previous weight changes. Back-propagation can sometimes get stuck at a local minimum. This can be overcome with the momentum $\mu$, which accumulates a velocity vector $\nu$ in the direction in which the loss function keeps decreasing. The error or loss of a network can be
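The update rule in Equation (2) can be tried out on a one-dimensional toy problem. The sketch below is only an illustration under stated assumptions. The loss j(w) = (w - 3)^2 is hypothetical, the expectation over training samples is left out because there is a single deterministic loss, and the velocity is taken to be the previous step W_t - W_{t-1}, which the page does not define explicitly. The values of the learning rate and the momentum are also arbitrary.

```python
def loss_gradient(w):
    """Gradient of the hypothetical loss j(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def momentum_step(w, velocity, alpha, mu):
    """One update W_{t+1} = W_t - alpha * grad j(W_t) + mu * v_t.

    Assumption: v_t is the previous step W_t - W_{t-1}, so the function
    also returns the step just taken for use in the next call.
    """
    new_w = w - alpha * loss_gradient(w) + mu * velocity
    return new_w, new_w - w


w, velocity = 0.0, 0.0  # start away from the minimum, with no velocity
for _ in range(100):
    w, velocity = momentum_step(w, velocity, alpha=0.1, mu=0.5)
```

After the loop, `w` lies very close to 3, the minimizer of the toy loss. Setting `mu=0.0` reduces the step to plain gradient descent, which makes the effect of the momentum term easy to compare.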
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science