Document Image Processing
Page 34

J. Imaging 2018, 4, 80

As previously mentioned, in the past, educated persons took special care when writing, resulting in a high degree of stability in the slant of their writing style. Thus, in order to detect the slant of the text in a historical document page, a few fragments of text are considered. Although one sample could theoretically be enough, several are necessary in practice to ensure coverage of pages with sparse text or special formatting, such as columns, arrays, etc.

Appropriate fragments are localized as follows: a page is scanned from left to right, top to bottom, using a window of size H×W (height × width), starting from the pixel position (skip, skip) in order to skip scanning or other noise. Skip can be general, e.g., 1/5 of the document width (as here), or be determined depending on the collection. All black pixels (black_pixels) inside the window are counted. The area inside the window is retained if Condition (3) is true, and the scanning stops when the required number M of fragments has been localized.

$$ \frac{\mathrm{black\_pixels}}{H \times W} > R \tag{3} $$

Condition (3) requires the text in the window to take up more than R = 0.10 of the area. The size of the window in these experiments, H×W, was selected as H = 2mb and W = 7mb, where mb is the main character body size in the page (height of the character body excluding ascenders and descenders).

In the current paper, the following metrics and parameters are set up:
1. The text ratio R in the window;
2. The amount M of the fragments in use;
3. The height H of the window;
4. The width W of the window.

However, the same techniques are considered for:
1. The main body height detection [22], since it does not require line or word segmentation;
2. The slant detection procedure.

Once the M fragments have been selected (Figure 5), the slant detection algorithm [9], described in Section 2, is applied and the slant angles are detected, one per fragment. The maximum and minimum slant angles are ignored as possible outliers, while slant, obtained from the remaining angles, is defined as the detected slant of the page.
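The fragment localization and the outlier handling described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `select_fragments` and `page_slant` are hypothetical, the window is assumed to advance in non-overlapping steps of one window size, mb is assumed to be already known, and averaging the remaining angles is an assumption (the text only states that the maximum and minimum are ignored).

```python
import numpy as np


def select_fragments(binary, mb, M, R=0.10, skip=None):
    """Return up to M windows whose black-pixel ratio exceeds R (Condition (3)).

    binary : 2-D array, True (nonzero) where a pixel is black.
    mb     : main character body height in pixels (assumed known).
    """
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    H, W = 2 * mb, 7 * mb              # window size used in the text
    if skip is None:
        skip = cols // 5               # 1/5 of the document width
    fragments = []
    # Top to bottom, and left to right within each row of windows.
    # Stepping by a full window is an assumption of this sketch.
    for top in range(skip, rows - H + 1, H):
        for left in range(skip, cols - W + 1, W):
            window = binary[top:top + H, left:left + W]
            black_pixels = int(window.sum())
            if black_pixels / (H * W) > R:
                fragments.append((top, left, window))
                if len(fragments) == M:
                    return fragments   # stop once M fragments are found
    return fragments


def page_slant(angles):
    """Drop one maximum and one minimum angle, then average the rest.

    Averaging is an assumption; the text only says the extremes are ignored.
    """
    if len(angles) < 3:
        raise ValueError("need at least three fragment angles")
    trimmed = sorted(angles)[1:-1]
    return sum(trimmed) / len(trimmed)
```

Each returned tuple holds the top-left corner of the window and the cropped fragment, which would then be passed to whichever slant detector is in use, one angle per fragment.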
The entire document page is then corrected according to the slant angle by shifting each pixel so that

$$ x_f = x_0 + \mathrm{round}\left[ y_0 \tan\left( \frac{\pi \, \mathit{slant}}{180} \right) \right] \tag{4} $$

$$ y_f = y_0 \tag{5} $$

where (x_0, y_0) defines the initial position of the pixel and (x_f, y_f) is the final pixel position.
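The pixel shift of Equations (4) and (5) amounts to a horizontal shear, which could be sketched as below. The function name `correct_slant` is hypothetical, and several details are assumptions not stated in the text: y_0 is taken as the row index counted from the top of the image, the slant is given in degrees, and the output canvas is widened so that shifted pixels are not clipped.

```python
import math

import numpy as np


def correct_slant(binary, slant_deg):
    """Shift every row horizontally following Eqs. (4) and (5)."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    t = math.tan(math.pi * slant_deg / 180.0)
    # Per-row shift round(y0 * tan(pi * slant / 180)); y0 is the row index
    # (assumed origin: top row). np.rint rounds exact halves to even.
    shifts = np.rint(np.arange(rows) * t).astype(int)
    # Widen the canvas so no pixel falls outside it (an assumption).
    offset = max(0, -int(shifts.min()))
    width = cols + offset + max(0, int(shifts.max()))
    out = np.zeros((rows, width), dtype=bool)
    for y0 in range(rows):
        start = offset + int(shifts[y0])
        out[y0, start:start + cols] = binary[y0]   # yf = y0: rows never move
    return out
```

Because all pixels in a row share the same y_0, shifting whole rows at once is equivalent to shifting each pixel individually.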
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science