Page 172 of Document Image Processing

J. Imaging 2017, 3, 62

are reliably annotated, copyright-free, up-to-date or easily available to download. An alternative for researchers and digital curators is to create their own ground truth by manually annotating document images. In order to assist them in the tedious task of ground truth creation, multiple software tools have been proposed during the last two decades. As detailed in Table 1, some are fully manual stand-alone software (PinkPanther (1998) [17], TrueViz (2003) [18]), while others provide semi-automatic annotation modules (GEDI (2010) [19], Aletheia (2011) [20,21]). Some of the most recent solutions are based on an online collaborative platform (Transcriptorium (2014) [22], DIVADIAWI [23] (2015), [24] (2016), Recital manuscript platform [25] (2017)). Among non-open-source solutions, some have an academic licence: [20,26]. These tools assist the user in creating the ground truth associated with real documents, which are intrinsically limited in number because of acquisition procedures and copyright issues. Moreover, despite the use of such software, manual annotation remains a costly task that cannot always be performed by a non-specialist.

Another solution is available for getting (quickly and with lower human cost) large ground-truthed document image datasets. This solution, investigated since the beginning of the nineties [27], is to generate synthetic images with controlled ground truth. The authors of [28,29] propose two similar systems. They consist of using a text editor (e.g., Word-office, LaTeX, etc.) to automatically create multiple documents with varied contents (in terms of font, background, layout). Alternative approaches consist of re-arranging, in a new way, elements extracted from real images so as to generate (manually, semi-automatically or automatically) multiple semi-synthetic document images [12,30]. Recently, in particular with the advent of deep learning techniques which require huge masses of training data, the need for synthetic data generation seems to be ever-growing.
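
The idea of generating synthetic pages with controlled ground truth, described above, can be illustrated with a toy sketch. It is not the method of [28,29] or of DocCreator: a character grid stands in for an image, and the function name generate_page, its parameters, and the ground-truth field names are all hypothetical. The point is only that when the generator decides where each word goes, the annotation comes for free and is exact by construction.

```python
import random


def generate_page(words, width=40, height=12, seed=0):
    """Lay out words on a character grid (a stand-in for a page image).

    Returns (page_lines, ground_truth). All names here are illustrative
    assumptions, not part of any tool cited in the text.
    """
    rng = random.Random(seed)
    grid = [[" "] * width for _ in range(height)]
    ground_truth = []
    margin = rng.randint(0, 4)      # varied layout: random left margin
    line_step = rng.choice([1, 2])  # varied layout: single or double spacing
    row, col = 0, margin
    for word in words:
        if len(word) > width - margin:
            raise ValueError(f"word too long for page: {word!r}")
        if col + len(word) > width:  # wrap to the next text line
            row, col = row + line_step, margin
        if row >= height:
            break                    # page is full; remaining words are dropped
        grid[row][col:col + len(word)] = word
        # The generator knows the exact position, so the label is exact.
        ground_truth.append(
            {"text": word, "row": row, "col": col, "width": len(word)}
        )
        col += len(word) + 1
    return ["".join(r) for r in grid], ground_truth


# Usage: every ground-truth box, cropped from the page, gives back its label.
page, truth = generate_page("synthetic pages carry exact labels".split(), seed=3)
for box in truth:
    crop = page[box["row"]][box["col"]:box["col"] + box["width"]]
    assert crop == box["text"]
```

Changing the seed varies the margin and line spacing, so one word list yields several differently laid-out pages, each with a matching annotation. A real system would render fonts and backgrounds into pixel images and export the boxes in a format such as XML.
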
In [31], among the 60,000 character patches that were used to train a convolutional network for text recognition, only 3000 were real.

In this paper we present DocCreator, an open-source and multi-platform software that is able to create virtually unlimited amounts of different ground-truthed synthetic document images based on a small number of real images.

Table 1. Technical and functional characteristics of existing annotation software. Six features are presented: export format, source availability, desktop/online software, groundtruthing assistance (whether the software provides features that help the user to quickly create the groundtruth), collaborative/crowd-sourcing software, and year of distribution.

| Software | Export | Open-Source | Desktop/Online | Groundtruthing Assistance | Collaborative | Year |
|---|---|---|---|---|---|---|
| Software for manual groundtruth creation | | | | | | |
| PinkPanther [17] | ASCII | n/a | desktop | no | no | 1998 |
| TrueViz [18] | XML | yes | desktop | no | no | 2003 |
| PerfectDoc [32] | XML | yes | desktop | ? | no | 2005 |
| PixLabeler [33] | XML | no | desktop | no | no | 2009 |
| GEDI [19] | XML | yes | desktop | yes | no | 2010 |
| DAE [34] | no | yes | online | yes | yes | 2011 |
| Aletheia [20,26] | XML | no | online/desktop | yes | no | 2011 |
| Transcriptorium [22] | TEI-XML | no | online | yes | yes | 2014 |
| DIVADIAWI [23] | XML | n/a | online | yes | n/a | 2015 |
| Recital [25] | no | yes | online | yes | yes | 2017 |
| Algorithms for synthetic data augmentation | | | | | | |
| Baird et al. [27] | no | n/a | n/a | n/a | no | 1990 |
| Zhao et al. [28] | no | n/a | n/a | n/a | no | 2005 |
| Delalandre et al. [12] | no | n/a | n/a | n/a | no | 2010 |
| Yin et al. [30] | no | n/a | n/a | n/a | no | 2013 |
| Mas et al. [24] | no | n/a | n/a | n/a | yes | 2016 |
| Seuret et al. [35] | no | yes | n/a | n/a | no | 2015 |
| Software for semi-automatic groundtruth creation and data augmentation capabilities | | | | | | |
| DocCreator | XML | yes | online/desktop | yes | no | 2017 |

As illustrated in Figure 1, DocCreator can handle the creation of ground-truthed synthetic images from a limited set of real images. Various realistic degradation models can be applied on

Document Image Processing

Title: Document Image Processing
Authors: Ergina Kavallieratou; Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science