Page 129 of Document Image Processing
J. Imaging 2018, 4, 15
true for Spanish documents from the 16th century, as seen in Figure 1. Ancient texts also include rare
characters, grammatical forms, word spellings and named entities distinct from modern ones. Such
forms lead to Out-Of-Vocabulary (OOV) words, i.e., words that do not belong to the dictionary of
the HTR system. Improving HTR systems at both the image and language levels is an important issue
for the recognition of such ancient historical documents. The main goal of this paper is to design
efficient HTR systems that process document images written in Spanish and that can cope with ancient
character forms and language.
Figure 1. Sample image of a Spanish document from the 16th century.
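The notion of an OOV word can be made concrete with a minimal sketch. The function name `oov_rate`, the tiny lexicon and the sample line are illustrative assumptions, not taken from the paper; the archaic spellings "dixo" and "avia" simply stand in for the kind of historical forms described above.

```python
def oov_rate(tokens, lexicon):
    """Return the fraction of tokens that are absent from the lexicon."""
    if not tokens:
        return 0.0
    known = {word.lower() for word in lexicon}
    missing = [tok for tok in tokens if tok.lower() not in known]
    return len(missing) / len(tokens)


# Hypothetical modern-Spanish lexicon (assumption, for illustration only).
lexicon = ["dijo", "que", "no", "había", "mucho"]

# A line using old spellings: "dixo" for "dijo", "avia" for "había".
line = "dixo que no avia mucho".split()

rate = oov_rate(line, lexicon)
print(f"OOV rate: {rate:.2f}")
```

A word-level system restricted to this lexicon could not output the two archaic forms at all, which is why the OOV rate is a useful first diagnostic for historical corpora.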
Several approaches have been proposed to build optical models for handwriting recognition.
Such approaches include Hidden Markov Models (HMMs) [1–4] and Recurrent Neural Networks (RNNs)
such as Long Short-Term Memory networks (LSTMs) and their variants: Bi-directional LSTMs (BLSTMs) and
Multi-Dimensional LSTMs (MDLSTMs) [5]. HMMs enable embedded training and can be robust to
noise and linear distortions. However, RNNs and their variants are discriminative models that perform
better than HMMs in terms of accuracy. Nowadays, RNNs can be trained by using dedicated resources
such as Graphics Processing Units (GPUs) that considerably reduce training time. By using GPUs,
RNNs can be trained in an amount of time similar to that required to train HMMs with traditional Central
Processing Units (CPUs).
Usually, the inputs of HMMs and RNNs are sequences of handcrafted features or pixel columns.
However, deep learning approaches starting with convolutional layers as the first layers allow
extracting learning-based features instead of handcrafted ones [6–8].
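The two classical input types mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `pixel_columns` and `ink_density`, the toy 3x4 binary image, and the choice of ink density as the handcrafted feature are all hypothetical and are not the features used in the paper.

```python
def pixel_columns(image):
    """Turn a row-major image (list of rows) into a left-to-right
    sequence of column vectors, one frame per pixel column."""
    if not image:
        return []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    return [list(column) for column in zip(*image)]


def ink_density(frames):
    """One simple handcrafted feature per frame: the share of ink pixels."""
    return [sum(frame) / len(frame) for frame in frames]


# Toy binary text-line image (1 = ink), 3 rows by 4 columns.
toy_line = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

frames = pixel_columns(toy_line)   # raw pixel columns as the input sequence
features = ink_density(frames)     # handcrafted alternative, one value per frame
print(frames)
print(features)
```

Either sequence could be fed frame by frame to an HMM or an RNN; a convolutional front end would instead learn its own per-frame representation from the image.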
Generally, in HTR systems, the optical models are associated with dictionaries (lexical models)
and Language Models (LMs), usually at the word level, in order to direct the recognition of real
words and plausible word sequences (see Figure 2). In order to build open vocabulary systems,
language models based on character units can be used [9]. Then, the dictionary is limited to the set
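To illustrate why character-level language models support open vocabularies, here is a minimal sketch of a character bigram model with add-one smoothing. It is an assumption-laden toy, not the model of [9]: the function names, the three training words and the smoothing scheme are all chosen for illustration.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # word-boundary markers (illustrative choice)


def train_char_bigrams(words):
    """Count character bigrams, including word-boundary markers."""
    bigrams, contexts, alphabet = Counter(), Counter(), {EOS}
    for word in words:
        chars = [BOS] + list(word) + [EOS]
        alphabet.update(word)
        for prev, nxt in zip(chars, chars[1:]):
            bigrams[(prev, nxt)] += 1
            contexts[prev] += 1
    return bigrams, contexts, alphabet


def log_prob(word, model):
    """Add-one smoothed log-probability of a word under the bigram model.

    Characters never seen in training still receive a small non-zero
    score, so the value is finite (though not strictly normalized)."""
    bigrams, contexts, alphabet = model
    size = len(alphabet)
    chars = [BOS] + list(word) + [EOS]
    return sum(
        math.log((bigrams[(prev, nxt)] + 1) / (contexts[prev] + size))
        for prev, nxt in zip(chars, chars[1:])
    )


# Hypothetical training words (modern spellings only).
model = train_char_bigrams(["dijo", "dicho", "hijo"])

for candidate in ["dijo", "dixo", "xxxx"]:
    print(candidate, round(log_prob(candidate, model), 3))
```

The archaic form "dixo" is absent from the training words, yet it receives a finite score that lies between the seen word "dijo" and the implausible string "xxxx"; a word-level model would have no entry for it at all.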
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Place: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science