Document Image Processing

J. Imaging 2018, 4, 68

manuscripts. An example of bleed-through removal is shown in Figure 1. Earlier, physical restoration methods were applied to deal with bleed-through degradation, but unfortunately those methods were costly, invasive, and sometimes caused permanent, irreversible damage to the documents.

In recent years, digital preservation of the documental heritage has been the focus of intensive digitization and archiving campaigns, aimed at its distribution, accessibility and analysis. With digitization prevailing, in addition to conservation, the computing technologies applied to the digital images of these documents have quickly become a powerful and versatile tool to simplify their study and retrieval, and to facilitate new insights into the documents' contents. Digital image processing techniques can be applied to these electronic document versions to alter the document appearance in any way while leaving the original intact. Specifically, digital image processing techniques have been attempted for the virtual restoration of documents affected by bleed-through, with some impressive results. In addition to improving document readability, the removal of bleed-through degradation is also a critical preprocessing step in many tasks such as feature extraction, optical character recognition, segmentation, and automatic transcription.

Figure 1. An example of bleed-through removal.

Bleed-through removal is a challenging task, mainly due to the possible significant overlap between the original text and the bleed-through pattern, and the wide variation of its extent and intensity. In the literature, bleed-through removal is addressed as a classification problem, where the document image is subdivided into three components: background (the paper support), foreground (the main text), and bleed-through [1]. Broadly speaking, the existing methods in this domain can be divided into two main categories: blind or single-sided, and non-blind or double-sided.
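The three-class formulation above can be pictured with a minimal sketch that labels every pixel of a grayscale page as background, bleed-through, or foreground. The function name `label_by_intensity`, the two cut-off values, and the tiny sample array are assumptions made purely for illustration; they do not come from the article or from any of the cited methods, and a fixed cut-off is only a crude stand-in for a real classifier.

```python
import numpy as np

# Class codes for the three components named in the text.
BACKGROUND, BLEED_THROUGH, FOREGROUND = 0, 1, 2


def label_by_intensity(gray, t_fore=80, t_back=170):
    """Assign one of three labels to each pixel of an 8-bit grayscale image.

    t_fore and t_back are hypothetical cut-offs chosen for this example:
    pixels darker than t_fore are treated as main text, pixels at or above
    t_back as paper, and everything in between as bleed-through.
    """
    gray = np.asarray(gray)
    labels = np.full(gray.shape, BLEED_THROUGH, dtype=np.uint8)
    labels[gray < t_fore] = FOREGROUND    # darkest pixels: main text
    labels[gray >= t_back] = BACKGROUND   # brightest pixels: paper support
    return labels


# Synthetic 3x3 "page": bright paper, dark ink, and mid-gray show-through.
page = np.array([[200, 195, 60],
                 [130, 40, 210],
                 [90, 150, 185]], dtype=np.uint8)
labels = label_by_intensity(page)
print(labels)
```

The output is a label map with the same shape as the input, which a restoration step could then use to decide which pixels to repaint with background values.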
In blind methods, the image of a single side is used, whereas the non-blind methods require the information of both the recto and verso sides of the document.

Most of the earlier methods rely on the intensity information of the image and perform restoration based on the grayscale or color (red, green, blue) intensity distributions. The intensity-based methods involve thresholding [3]; however, intensity information alone is insufficient, as there is often a significant overlap between the foreground and bleed-through intensity profiles [4]. In addition, thresholding may also destroy other useful document features, such as stamps, annotations, or paper watermarks. Thus, intensity-based thresholding is not suitable when the aim is to preserve the original appearance of the document. To overcome these drawbacks, some methods incorporate spatial information by exploiting the neighbouring structure.

Among the blind methods, in [5], an independent component analysis (ICA) method is proposed to separate the foreground, background, and bleed-through layers from an RGB image. A dual-layer Markov random field (MRF) is suggested in [6], whereas, in [7], a conditional random field (CRF) method is proposed. A multichannel-based blind bleed-through removal is suggested in [8] using color decorrelation or color space transformations, whereas, in [9], a recursive unsupervised segmentation approach is applied to the data space first decorrelated by principal component analysis (PCA). In [10], bleed-through removal is addressed as a blind source separation problem, solved by using an MRF-based local smoothness model. Similarly, an expectation-maximization (EM)-based approach is suggested in [11].

As for the non-blind methods, a model-based approach using differences in the intensities of the recto and verso sides is outlined in [12]. The same model is extended in [13] using variational models with spatial smoothness in the wavelet domain. A non-blind ICA method is outlined in [14]. Other methods of this category are proposed in [15–17]. The performance of the non-blind methods
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science