J. Imaging 2017, 3, 62
Figure 10. Examples of the nonlinear illumination model defect. (Left) original image. (Right) degraded
image with the illumination defect applied on the left border.
4. Use of DocCreator for Performance Evaluation Tasks or Retraining
Here, we briefly describe how DocCreator has been used by other researchers and the conclusions
they drew.
4.1. Published Results Using DocCreator
4.1.1. Document Image Generation for Performance Evaluation
The segmentation system proposed by [36] is based on texture feature extraction without any
a priori knowledge of the physical and logical document layout. To assess the noise robustness of their
system, they used DocCreator and applied the character degradation model. From 25 simplified real
document images, they generated a semi-synthetic database of 150 document images. This database is
made up of several subsets with different degradation levels. The performance evaluations
presented in [36] highlight that the texture descriptors are slightly perturbed by the degradations.
When characters are highly disconnected (our algorithm has erased important character ink areas),
a drop in segmentation performance was observed.
DocCreator was also used during the ICDAR contest: staff-line removal from musical scores.
The 3D distortion and the character degradation models were used to generate an extended
database from the 1000 images of the MUSCIMA database [13]. As a result, the extended database
contains 6000 semi-synthetic grayscale images and 6000 semi-synthetic binary images. This database
was used in the second edition of the music score competition at ICDAR 2013 [37]. Five participants
submitted eight methods. Participants were given a training set of 4000 semi-synthetic images and
then 2000 semi-synthetic images to test their methods on. Regarding the results on the 3D distortion
set, the submitted methods seem less robust to global distortion than to the presence of small curves
and folds. For more details about the participants, the methods and the contest protocol, refer to [37].
This database has already become a benchmark database for musical document image analysis and
recognition, as stated in [53]. So far, the database has indeed been used for benchmarking in multiple
scientific publications about musical document processing and recognition [38,53–56] and even in the
more general field of machine learning [57].
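The image counts reported for the staff-line removal contest can be checked with a short sketch. This is only an illustration of the arithmetic, not the contest's actual protocol: the text gives the totals (1000 originals, 6000 semi-synthetic images per modality, 4000 for training and 2000 for testing) but not how the images were named or assigned to each set. The identifier format, the random shuffling, the seed, and the `split_dataset` helper below are all assumptions introduced for this example.

```python
import random


def split_dataset(image_ids, n_train, seed=0):
    """Shuffle identifiers and split them into (train, test) lists.

    Hypothetical helper: the contest's real assignment rule is not
    described in the text, so a seeded random split is assumed here.
    """
    ids = list(image_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Counts taken from the text.
n_originals = 1000   # images in the original database
n_extended = 6000    # semi-synthetic images per modality (grayscale or binary)
n_train = 4000       # images given to participants for training

# 6000 / 1000 implies six degraded variants per original on average;
# the text does not say how the variants were distributed across models.
variants_per_original = n_extended // n_originals

# Made-up identifiers, one per (original, variant) pair.
image_ids = [
    f"img{orig:04d}_v{variant}"
    for orig in range(n_originals)
    for variant in range(variants_per_original)
]

train_ids, test_ids = split_dataset(image_ids, n_train)
print(len(train_ids), len(test_ids))
```

A seeded shuffle keeps the split reproducible. A real protocol might instead keep all variants of one original on the same side of the split to avoid leakage between training and test data; the text does not say which approach was used.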
4.1.2. Document Image Generation for Retraining Task
The IAM-HistDB [58] database contains 127 handwritten historical manuscript images together
with their ground truth. This database consists of three sets: the Saint Gall set containing 60 images
(1410 text lines) in Latin, the Parzival set containing 47 images (4477 text lines) in Medieval German,
and the Washington set containing 20 images in English. The authors of [39] used the character
degradation model to create two extended databases of the IAM-HistDB. The first one is composed
of 17,661 images degraded with the ink model. The 1524 images from the second dataset have been
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science