J. Imaging 2017, 3, 62
detailed in this paper show that semi-synthetic and synthetic documents created with DocCreator are
useful for performance evaluation, retraining tasks, or performance prediction. In future work, we plan
to improve the synthetic document creation so that the characters in the composed document do not
differ too much from one another. For example, we should investigate whether adding constraints to the
font extraction phase, or taking the context into account when adding new characters to the synthetic
document, leads to more realistic synthetic documents. We are also considering setting up a cognitive
experiment to evaluate the perceived realness of the degraded documents, or even of the created
synthetic documents. We also plan to investigate how generating highly diversified data can improve
the results of tasks based on deep learning methods.
DocCreator (source, Linux, Mac, and Windows packaged versions), all the databases used for the tests,
a video, and an extra database (31,000 synthetic images generated with William Shakespeare sonnet
text files) are available at [http://doc-creator.labri.fr/].
Author Contributions: Nicholas Journet, Muriel Visani and Boris Mansencal contributed in equal proportion to
the creation and tests of DocCreator (model degradation and synthetic document reconstruction); they also wrote
this article. Kieu Van-Cuong contributed to the degradation models, and Antoine Billy contributed to the synthetic
document reconstruction.
Conflicts of Interest: The authors declare no conflict of interest.
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Place
- Basel
- Date
- 2018
- Language
- English
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science