Document Image Processing
Page - 181 - in Document Image Processing

J. Imaging 2017, 3, 62

Figure 10. Examples of the nonlinear illumination model defect. (Left) original image. (Right) degraded image with the illumination defect applied on the left border.

4. Use of DocCreator for Performance Evaluation Tasks or Retraining

Here, we briefly describe how DocCreator was used by other researchers and the conclusions they drew.

4.1. Published Results Using DocCreator

4.1.1. Document Image Generation for Performance Evaluation

The segmentation system proposed by [36] is based on texture feature extraction without any a priori knowledge of the physical and logical document layout. To assess the noise robustness of their system, they used DocCreator and applied the character degradation model. From 25 simplified real document images, they generated a semi-synthetic database of 150 document images. This database is made up of several subsets with different degradation levels. The performance evaluations presented in [36] highlight that the texture descriptors are slightly perturbed by the degradations. When characters are highly disconnected (our algorithm has erased important character ink areas), a drop in segmentation performance was observed.

DocCreator was also used during the ICDAR contest on staff-line removal from musical scores. The 3D distortion and the character degradation models were used to generate an extended database from the 1000 images of the MUSCIMA database [13]. As a result, the extended database contains 6000 semi-synthetic grayscale images and 6000 semi-synthetic binary images. This database was used in the second edition of the music score competition at ICDAR 2013 [37]. Five participants submitted eight methods. Participants were given a training set of 4000 semi-synthetic images and then 2000 semi-synthetic images to test their methods on. Regarding the results on the 3D distortion set, the submitted methods seem less robust to global distortion than to the presence of small curves and folds. For more details about the participants, the methods and the contest protocol, refer to [37]. This database has already become a benchmark database for musical document image analysis and recognition, as stated in [53].
So far, the database has indeed been used for benchmarking in multiple scientific publications about musical document processing and recognition [38,53–56] and even in the more general field of machine learning [57].

4.1.2. Document Image Generation for Retraining Task

The IAM-HistDB [58] database contains 127 handwritten historical manuscript images together with their ground truth. This database consists of three sets: the Saint Gall set containing 60 images (1410 text lines) in Latin, the Parzival set containing 47 images (4477 text lines) in Medieval German, and the Washington set containing 20 images in English. The authors of [39] used the character degradation model to create two extended databases of the IAM-HistDB. The first one is composed of 17,661 images degraded with the ink model. The 1524 images from the second dataset have been
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Editor: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Size: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Informatik (Computer Science)