Page 201 in Document Image Processing
J. Imaging 2018, 4, 32
Note that the size of the input block is set to 1×4 (not 2×4) for Protocols 6.1 and 3.
To fine-tune these parameters, we selected a set of 2000 labeled images from AcTiV-R, of which
190 are used as a validation set.
Table 6. Best parameters for training the network.
Parameters            Values
MDLSTM Size           2, 10 and 50
Feed-forward Size     6 and 20
Input Block Size      2×4
Hidden Block Sizes    1×4 and 1×4
Learn rate            10⁻⁴
Momentum              0.9
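
As a rough illustration of how the values in Table 6 might be used, the sketch below stores them in a configuration object and applies one gradient step with momentum. The field names, the `momentum_step` helper and the choice of the classical momentum update are assumptions made for this example; the text does not state which momentum variant or which toolkit configuration format was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from Table 6; the field names are illustrative only."""
    mdlstm_sizes: tuple = (2, 10, 50)
    feedforward_sizes: tuple = (6, 20)
    input_block: tuple = (2, 4)
    hidden_blocks: tuple = ((1, 4), (1, 4))
    learn_rate: float = 1e-4
    momentum: float = 0.9


def momentum_step(weights, grads, velocity, cfg):
    """One classical-momentum update (assumed variant, not confirmed by the text)."""
    # The velocity keeps a decaying sum of past gradient steps.
    new_velocity = [cfg.momentum * v - cfg.learn_rate * g
                    for v, g in zip(velocity, grads)]
    # The weights move along the updated velocity.
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


cfg = TrainingConfig()
weights, velocity = momentum_step([1.0], [2.0], [0.0], cfg)
```

With a momentum of 0.9, each step retains 90% of the previous update direction, which is why the small learn rate of 10⁻⁴ can still make steady progress.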
5.3. Experimental Results
Several experiments have been conducted using the AcTiV-D and AcTiV-R subsets. These
experiments can be divided into two categories. The first one concerns the comparison of our systems
with two recent methods. The second category aims at analyzing the effect of increasing the training
data on the accuracy of the LADI text detector.
5.3.1. Comparison with Other Methods
As proof of concept of the proposed benchmark, we compare our systems with two recent
methods. The first one was proposed by Gaddour et al. [52] to detect Arabic text in natural
scene images. The main steps involved are:
• Pixel-color clustering using k-means to form pairs of thresholds for each RGB channel.
• Creation of a binary map for each pair of thresholds.
• Extraction of CCs.
• Preliminary filtering according to an “area stability” criterion.
• Second filtering based on a set of statistical and geometric rules.
• Horizontal merging of the remaining components to form text lines.
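
Three of the steps listed above (binary map creation, CC extraction and horizontal merging) can be sketched in plain Python as follows. This is a minimal illustration, not the implementation of [52]: the k-means clustering and both filtering stages are omitted, and the function names, the 4-connectivity and the `max_gap` merging rule are assumptions chosen for this example.

```python
from collections import deque


def binary_map(channel, low, high):
    """Mark pixels whose value lies inside one (low, high) threshold pair."""
    return [[1 if low <= px <= high else 0 for px in row] for row in channel]


def connected_components(bmap):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    height, width = len(bmap), len(bmap[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not bmap[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and bmap[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def merge_horizontally(boxes, max_gap):
    """Merge boxes that overlap vertically and are at most max_gap pixels apart."""
    lines = []
    for x0, y0, x1, y1 in sorted(boxes):
        for i, (lx0, ly0, lx1, ly1) in enumerate(lines):
            overlaps_vertically = y0 <= ly1 and ly0 <= y1
            if overlaps_vertically and x0 - lx1 - 1 <= max_gap:
                lines[i] = (lx0, min(ly0, y0), max(lx1, x1), max(ly1, y1))
                break
        else:
            lines.append((x0, y0, x1, y1))
    return lines


# Toy single-channel image: bright pixels (200) are treated as text candidates.
channel = [
    [200, 200, 0, 200, 0, 0, 0, 0, 200],
    [200,   0, 0, 200, 0, 0, 0, 0, 200],
]
boxes = connected_components(binary_map(channel, 180, 255))
text_lines = merge_horizontally(boxes, max_gap=2)
```

In this toy input the first two components are one pixel apart and are merged into a single line, while the third lies four pixels away and stays separate.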
The second method was put forward by Iwata et al. [53] to recognize artificial Arabic text in video
frames. It operates as follows:
• Text line segmentation into words by thresholding gaps between CCs.
• Over-segmentation of characters into primitive segments.
• Character recognition using a 64-dimensional feature vector of chain code histogram and the
modified quadratic discriminant function.
• Word recognition by dynamic programming using the total likelihood of characters as
objective function.
• False word reduction by measuring the average of the character likelihoods in a word and
comparing it to a predefined threshold.
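
The first and last steps of this pipeline lend themselves to a short sketch. The code below is a hedged illustration only: the function names, the representation of a CC as a horizontal span `(x_start, x_end)`, and the rule that a word is kept when its mean likelihood is at or above the threshold are assumptions for this example, not details taken from [53].

```python
def split_into_words(cc_spans, gap_threshold):
    """Group CC spans into words; a gap wider than the threshold starts a new word."""
    words = []
    right_edge = None
    for start, end in sorted(cc_spans):
        if right_edge is not None and start - right_edge <= gap_threshold:
            # Small gap: the component belongs to the current word.
            words[-1].append((start, end))
            right_edge = max(right_edge, end)
        else:
            # Large gap (or first component): open a new word.
            words.append([(start, end)])
            right_edge = end
    return words


def keep_word(char_likelihoods, threshold):
    """Accept a word only if its mean character likelihood reaches the threshold."""
    mean_likelihood = sum(char_likelihoods) / len(char_likelihoods)
    return mean_likelihood >= threshold


# Hypothetical spans of four CCs on one text line.
words = split_into_words([(0, 4), (6, 9), (20, 25), (27, 30)], gap_threshold=3)
```

Here the gaps of 2 pixels stay inside a word, while the gap of 11 pixels separates the two words.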
The detection systems have been trained on the training-set 1 of Table 4. The evaluation has been
done on the test set for the detection and recognition tasks. Table 7 presents evaluation results of the
detection protocols in terms of precision, recall and F-measure. The best results are marked in bold.
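
For reference, the three metrics can be computed from match counts as in the sketch below. It uses the standard definitions, with the F-measure as the harmonic mean of precision and recall; how detections are matched to ground-truth boxes is not described in this passage, so the counts are simply taken as inputs and the function name is illustrative.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-measure) from detection match counts."""
    detected = true_positives + false_positives   # everything the system reported
    expected = true_positives + false_negatives   # everything in the ground truth
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / expected if expected else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical counts, not taken from Table 7.
precision, recall, f_measure = detection_scores(80, 20, 20)
```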
The LADI system scores best for all protocols, with an F-measure between 0.73 and 0.85 for the AllSD
protocol (p4.4) and the AljazeeraHD protocol (p1), respectively. In contrast to SysA, which represents
a fully heuristic-based method, the LADI system increased the F-measure by 11% for Protocol 1.
For Protocols 4.1, 4.2, 4.3 and 4.4 (SD channels), the results are higher, with a gain of, respectively,
11%, 17%, 14% and 24%. This reflects the effectiveness of using a machine-learning solution to filter the
results given by the SWT algorithm. The Gaddo system has a strong tendency toward fragmentation and
missed detections, as shown by its numerical results. Table 8 presents evaluation results of the
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science