Document Image Processing
Page - 66 -


J. Imaging 2018, 4, 6

Figure 5. An example of a rescoring lattice.

6. Experiment Results

To train the proposed holistic Arabic OCR system, we used a lexicon of around 356,000 words selected from the news domain with high coverage for the Arabic language. Using this lexicon, we generated a database of images for three fonts: Simplified Arabic, Traditional Arabic and Arabic Transparent, in 300 dpi with four different sizes.

To test the system, we used three different test datasets that represent different degrees of challenge:

1. Laser scanned text dataset: This dataset is composed of 1152 single words taken from newspaper articles and printed in three fonts and four different sizes in two types of quality: clean and first copy.
2. Recent computerized books dataset: A dataset composed of 10 scanned pages from different recent computerized books that contain 2730 words.
3. Old un-computerized books: This dataset consists of 10 scanned pages containing 2276 words from old books that are typewritten with fonts that are not well known.

Figure 6 illustrates some examples of the scanned images. In the first experiment, we evaluated our system using the laser scanned dataset. Initially, we evaluated the system on a single font. The system was trained on a single font with a single size but was tested on the same font with different sizes. We did not use the language model with this dataset as it consists of single words. Table 2 illustrates the Word Recognition Rate (WRR) results for this experiment.

Figure 6. Some samples of the scanned images.
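The page reports Word Recognition Rate (WRR) results but does not spell out the formula. The sketch below assumes the common definition for isolated-word tests: the fraction of test words whose recognized output exactly matches the ground truth. The function name `word_recognition_rate` and the sample words are hypothetical and are not taken from the paper.

```python
def word_recognition_rate(references, hypotheses):
    """Fraction of words recognized exactly (assumed WRR definition).

    references: ground-truth words, one per test image.
    hypotheses: recognizer outputs, in the same order.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    # Position-wise exact match suits a single-word dataset,
    # where each image yields exactly one output word.
    correct = sum(ref == hyp for ref, hyp in zip(references, hypotheses))
    return correct / len(references)


# Illustrative data only: four test words, one misrecognized.
truth = ["كتاب", "مدرسة", "قلم", "بيت"]
output = ["كتاب", "مدرسه", "قلم", "بيت"]
wrr = word_recognition_rate(truth, output)
print(f"WRR = {wrr:.2%}")
```

For the page-level book datasets, where word segmentation can insert or drop words, an alignment-based measure would be needed instead of this position-wise comparison.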
Document Image Processing
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Informatik