Page 41 in Document Image Processing
J. Imaging 2018, 4, 80
experiments on handwritten documents, before and after slant removal. The recognition results
for the handwriting of our databases were a failure, due to having historical documents or/and
languages other than English. For the PrintDB database, Figure 13 shows the character error rate vs.
the artificial slant, as obtained by a commercial OCR system (Adobe Acrobat).
For a page slant of less than −19 or greater than 37 degrees, it is very difficult to find
a correspondence between the characters of the image and the OCR result. The corrected version is very
well handled (0 degrees, English). Moreover, the OCR software handles right-slanted characters better
than left-slanted characters. This is likely due to extra training for italics in the commercial software.
The computational cost is less than 5 s for a historical document image of size a little bigger than
A4 and resolution 600 dpi, on a computer with an Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz
2.40 GHz processor.
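
The text does not say how the artificial slant was applied to the PrintDB pages. A minimal sketch, assuming the slant is a horizontal shear of a binary image with nearest-pixel rounding; the function name `apply_slant` and the list-of-rows image format are illustrative assumptions, not the authors' implementation:

```python
import math


def apply_slant(image, angle_deg):
    """Shear a binary image (list of rows of 0/1) horizontally.

    Assumption: a positive angle pushes the top of the image to the right
    (right slant) while the bottom row stays fixed. Shifts are rounded to
    the nearest pixel; no interpolation is performed.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    t = math.tan(math.radians(angle_deg))
    # Horizontal shift of each row, measured from the bottom row.
    shifts = [round((height - 1 - y) * t) for y in range(height)]
    # Pad on the left so that negative shifts still land inside the canvas.
    left_pad = -min(shifts + [0])
    new_width = width + max(shifts + [0]) + left_pad
    out = [[0] * new_width for _ in range(height)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                out[y][x + shifts[y] + left_pad] = 1
    return out


# A vertical stroke, three pixels tall, slanted by 45 degrees.
stroke = [[1], [1], [1]]
slanted = apply_slant(stroke, 45)
```

Sweeping `angle_deg` over a range of values and running OCR on each output is one way to obtain a curve of the kind reported for Figure 13.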
Since the proposed system was built in order to help our free-segmentation word-spotting
system [25], we provide an example of word spotting task on handwritten documents. It is worth
mentioning that an improvement of the recall of at least 20% is observed (Figure 14) for the 20 document
images of George Washington DB and 100 queries. The improvement appears extremely high, which
may be a result of the query being a part of the same page or collection. However, for slanted
characters, the slant degree is not always fixed and in many cases, the slanted characters overlap with
other characters.
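
For readers unfamiliar with the metric, recall in a word-spotting setting can be sketched as the fraction of relevant word instances that the system retrieves for a query. The function below uses that standard definition; the identifiers in the example are hypothetical and are not taken from the George Washington DB experiment:

```python
def recall(retrieved, relevant):
    """Fraction of the relevant items that appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("the set of relevant items must not be empty")
    hits = len(relevant & set(retrieved))
    return hits / len(relevant)


# Hypothetical query: four true occurrences of the word, two of them found.
relevant_ids = ["w1", "w2", "w3", "w4"]
retrieved_ids = ["w1", "w2", "w9"]
query_recall = recall(retrieved_ids, relevant_ids)
```

Averaging this value over all queries, once on the original pages and once on the slant-corrected pages, would give the two figures whose difference is reported as the improvement.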
Figure 13. Character error rate vs. the degree of page slant as performed by a commercial OCR system.
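
The page does not define how the character error rate is computed. A minimal sketch, assuming the common definition (Levenshtein edit distance between the ground-truth text and the OCR output, divided by the ground-truth length); the function names are illustrative:

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning one string into the other."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character in a five-character word.
cer = character_error_rate("slant", "slunt")
```

Under this definition the rate can exceed 1 when the OCR output contains many spurious characters, so a plot bounded at 1 may imply clipping or a different normalisation.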
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Location: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science