Page - 129 - in Document Image Processing
J. Imaging 2018, 4, 15
true for Spanish documents from the 16th century as seen in Figure 1. Ancient texts also include rare
characters, grammatical forms, word spellings and named entities distinct from modern ones. Such
forms lead to Out-Of-Vocabulary (OOV) words, i.e., words that do not belong to the dictionary of
the HTR system. Improving HTR systems at both image and language levels is an important issue
for the recognition of such ancient historical documents. The main goal of this paper is to design
efficient HTR systems that process document images written in Spanish and that can cope with ancient
character forms and language.
Figure 1. Sample image of a Spanish document from the 16th century.
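To make the notion of OOV words concrete, the following minimal Python sketch measures how many tokens of a transcribed line fall outside a word-level dictionary. The function names (`find_oov`, `oov_rate`), the tiny lexicon and the sample tokens are illustrative assumptions and are not taken from the paper or its data; the old spellings "dixo" and "avia" merely stand in for the kind of historical forms discussed above.

```python
def find_oov(tokens, vocabulary):
    """Return the tokens that are absent from the vocabulary (case-insensitive)."""
    vocab = {word.lower() for word in vocabulary}
    return [token for token in tokens if token.lower() not in vocab]


def oov_rate(tokens, vocabulary):
    """Return the fraction of tokens that are Out-Of-Vocabulary."""
    if not tokens:
        return 0.0
    return len(find_oov(tokens, vocabulary)) / len(tokens)


# Hypothetical modern lexicon (illustrative only).
modern_lexicon = {"dijo", "que", "no", "había", "visto", "cosa"}

# Hypothetical line with historical spellings (illustrative only).
line_tokens = ["dixo", "que", "no", "avia", "visto", "cosa"]

oov_tokens = find_oov(line_tokens, modern_lexicon)
rate = oov_rate(line_tokens, modern_lexicon)

print("OOV tokens:", oov_tokens)
print("OOV rate: {:.2f}".format(rate))
```

A word-level system restricted to `modern_lexicon` cannot output the two old spellings at all, whatever the quality of its optical model.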
Several approaches have been proposed to build optical models for handwriting recognition.
Such approaches include Hidden Markov Models (HMMs) [1–4], Recurrent Neural Networks (RNNs)
such as Long Short-Term Memory (LSTMs) and their variants: Bi-directional LSTMs (BLSTMs) and
Multi-Dimensional LSTMs (MDLSTMs) [5]. HMMs enable embedded training and can be robust to
noise and linear distortions. However, RNNs and their variants are discriminative models that perform
better than HMMs in terms of accuracy. Nowadays, RNNs can be trained by using dedicated resources
such as Graphics Processing Units (GPUs) that considerably reduce training time. By using GPUs,
RNNs can be trained in an amount of time similar to that required to train HMMs with traditional Central
Processing Units (CPUs).
Usually, the inputs of HMMs and RNNs are sequences of handcrafted features or pixel columns.
However, deep learning approaches starting with convolutional layers as the first layers allow
extracting learning-based features instead of handcrafted ones [6–8].
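As an illustration of the "pixel columns" input mentioned above, the following minimal Python sketch turns a text-line image into a left-to-right sequence of column vectors, one per horizontal position. The function name `image_to_column_sequence` and the toy 3 x 4 image are assumptions made for this example; this is not the feature extraction used by the authors.

```python
def image_to_column_sequence(image):
    """Convert a 2-D grayscale image (a list of rows) into a list of pixel columns.

    Each element of the result holds the pixel values of one column, read from
    top to bottom, so the sequence length equals the image width.
    """
    if not image:
        return []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    return [[row[x] for row in image] for x in range(width)]


# Toy text-line image: 3 rows (height) by 4 columns (width), illustrative only.
line_image = [
    [0, 255, 255, 0],
    [0, 255, 0, 0],
    [255, 255, 0, 0],
]

frames = image_to_column_sequence(line_image)

print("sequence length:", len(frames))
print("vector size:", len(frames[0]))
```

A sequence model such as an HMM or an RNN would then consume `frames` one vector at a time; a convolutional front end would instead learn its own features from the two-dimensional image.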
Generally, in HTR systems, the optical models are associated with dictionaries (lexical models)
and Language Models (LMs), usually at the word level, in order to direct the recognition of real
words and plausible word sequences (see Figure 2). In order to build open vocabulary systems,
language models based on character units can be used [9]. Then, the dictionary is limited to the set
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Editor: MDPI
- Location: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Size: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science