Document Image Processing
Page - 111 -
J. Imaging 2018, 4, 43

The network consists of three sets of convolution and max pooling pairs. All convolutional layers use a stride of one and are zero padded so that the output is the same size as the input. The output of each convolutional layer is activated using the ReLU function and followed by a max pooling of 2×2 blocks. The numbers of feature maps (of size 5×5) used in the three consecutive convolutional layers are 8, 16, and 32, respectively. The output of the last layer is flattened, and a fully-connected layer with 1024 neurons (also activated with ReLU) is added, followed by the last output layer (softmax activation) consisting of Nclass neurons, where Nclass is the number of character classes. Dropout with probability p = 0.5 is applied before the output layer to prevent overfitting. We trained the network using an Adam optimizer with a batch size of 100 and a learning rate of 0.0001.

Figure 10. Architecture of the CNN.

3.4. Word Recognition and Transliteration

In order to make the palm leaf manuscripts more accessible, readable, and understandable to a wider audience, an optical character recognition (OCR) system should be developed. In many DIA systems, word or text recognition is the final task in the processing pipeline. However, in Southeast Asian scripts, the change in the speech sound of a syllable is normally governed by certain phonological rules. In this case, an OCR system is not enough. Therefore, a transliteration system should also be developed to help transliterate the ancient scripts on these manuscripts. Transliteration is defined as the process of obtaining the phonetic translation of names across languages [54]. It involves rendering a language from one writing system to another. In [54], the problem is stated formally as a sequence labeling problem from one language alphabet to another. Transliteration will help us to index and to quickly and efficiently access the content of the manuscripts. In our previous work [29], a complete scheme for segmentation-based glyph recognition and transliteration specific to Balinese palm leaf manuscripts was proposed.
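The CNN described at the top of this page can be made concrete by tracing tensor shapes and parameter counts through its layers. The sketch below is only an illustration under stated assumptions: the input size (32×32 grayscale) and the class count (100) are hypothetical, since neither is given on this page; "5×5" is read as the convolution filter size; and the 2×2 max pooling is assumed to be non-overlapping. The function name `trace_cnn` is introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    out_shape: tuple
    params: int


def trace_cnn(height, width, channels, n_class):
    """Trace output shapes and trainable parameters of the described CNN."""
    layers = []
    h, w, c = height, width, channels
    for i, maps in enumerate((8, 16, 32), start=1):
        # 5x5 filters, stride 1, zero padded: spatial size is unchanged.
        # Each output map has 5*5*c weights plus one bias.
        params = (5 * 5 * c + 1) * maps
        c = maps
        layers.append(Layer(f"conv{i}+relu", (h, w, c), params))
        # Assumption: non-overlapping 2x2 max pooling halves each dimension.
        h, w = h // 2, w // 2
        layers.append(Layer(f"pool{i}", (h, w, c), 0))
    flat = h * w * c
    layers.append(Layer("flatten", (flat,), 0))
    layers.append(Layer("fc+relu", (1024,), (flat + 1) * 1024))
    layers.append(Layer("dropout(p=0.5)", (1024,), 0))
    layers.append(Layer("softmax", (n_class,), (1024 + 1) * n_class))
    return layers


if __name__ == "__main__":
    # Hypothetical input size and class count, chosen only for the example.
    for layer in trace_cnn(32, 32, 1, 100):
        print(f"{layer.name:16s} {str(layer.out_shape):14s} {layer.params}")
```

Under these assumptions, most of the trainable parameters sit in the 1024-neuron fully-connected layer, which is consistent with applying dropout just before the output layer.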
In this work, a segmentation-free method will be evaluated to recognize and transliterate the words from three different scripts of a palm leaf manuscript.

RNN/LSTM-Based Methods

Over the last decade, sequence-analysis-based methods using a Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) type of learning network have been very popular among researchers in text recognition. An RNN-LSTM-based method together with Connectionist Temporal Classification (CTC) works as a segmentation-free learning-based method to recognize the sequence of characters in a word or text without any handcrafted feature extraction method. The raw image pixels can be sent directly as the input to the learning network, and there is no requirement to segment the training data sequence. An RNN is basically an extended version of the basic feedforward neural network. In an RNN, the neurons in the hidden layer are connected to each other. An RNN offers very good context-aware processing to recognize patterns in a sequence or time series. One drawback of the RNN is the vanishing gradient problem. To deal with this problem, the LSTM architecture was introduced. The LSTM network adds multiplicative gates and additive feedback. Bidirectional LSTM is an LSTM
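As an illustration of why CTC removes the need to segment the sequence, the sketch below shows greedy (best-path) CTC decoding: the network emits one score vector per frame, and the decoder collapses repeated labels and drops a special blank label. This is a generic, commonly used decoding rule and not necessarily the one used by the authors; the function name `ctc_greedy_decode`, the toy alphabet, and the frame scores are all hypothetical.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: list of symbols indexed by label (the blank entry is unused).
    """
    # Pick the most likely label independently for every frame.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]

    decoded = []
    previous = None
    for label in best_path:
        # Emit a symbol only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)


if __name__ == "__main__":
    alphabet = ["-", "a", "b"]  # index 0 is the CTC blank
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, merged)
        [0.9, 0.05, 0.05],  # blank separates the two a's
        [0.1, 0.6, 0.3],    # a
        [0.1, 0.2, 0.7],    # b
        [0.2, 0.1, 0.7],    # b (repeat, merged)
        [0.8, 0.1, 0.1],    # blank
    ]
    print(ctc_greedy_decode(frames, alphabet))
```

The blank between the two runs of the same label is what lets the decoder output a doubled character, so no frame-level character boundaries have to be annotated in the training data.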
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Editor: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Size: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Informatik