Page 110 in Proceedings, OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
      Passes   Additions    Comparisons
Ref   4        12n·m        8n·m
Opt   2        5n·m + 2m    6n·m + 4m

Table 1. Comparing the number of operations for the reference and the optimized version.
factor is storage: both versions require additional storage of n·m. But the base version writes
four times to this area, whereas the optimized version writes only once. If this storage area
is accessible to the user, calculating the discrepancy norm in 2D always yields the corresponding
integral image for free.
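To make the byproduct concrete, the following is a minimal sketch of building the n × m integral image (summed-area table) mentioned above. Function name and memory layout are illustrative assumptions, not taken from the paper:

```c
#include <stdlib.h>

/* Build the integral image ii of an n x m image img (row-major):
 * ii[y][x] holds the sum of all pixels in the rectangle from (0,0)
 * to (x,y) inclusive. This is the table that the optimized 2D
 * discrepancy-norm computation produces as a side effect. */
static void integral_image(const double *img, double *ii, int n, int m)
{
    for (int y = 0; y < n; ++y) {
        double row_sum = 0.0;            /* running sum of current row */
        for (int x = 0; x < m; ++x) {
            row_sum += img[y * m + x];
            /* add the total of all rows above via the previous row of ii */
            ii[y * m + x] = row_sum + (y > 0 ? ii[(y - 1) * m + x] : 0.0);
        }
    }
}
```

The single pass over the image matches the two-pass operation count of the optimized version in Table 1, since each output cell needs one row accumulation and one addition from the row above.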
3. Parallelization
The proposed algorithm seems to be well-suited for parallelization. Computing and comparing
the other components of the discrepancy norm is highly independent. When it comes to
parallelization, modern computers offer various options. A common classification in this area
comes from [15]; it is based on the number of parallel instruction and data streams. A traditional
processor belongs to SISD, whereas multi-core or multiprocessor systems are MIMD. Instruction set
extensions like SSE and AVX, also referred to as vector units, belong to SIMD.
A similarity measure like the discrepancy norm will normally be applied many times. Pattern
matching requires evaluating the discrepancy norm at many different positions of a patch.
Therefore, SIMD is a promising approach: it is especially suitable for applying the same kind of
operation to several data values at once. Furthermore, SIMD merely means choosing certain special
instructions; at runtime, these have no overhead compared to normal SISD instructions. On the
other hand, making use of multiprocessing would lead to an overhead, because it involves spawning
threads, distributing data and synchronizing at the end. As shown by [16], using multi-core
processors is complex. On the one hand, that work succeeded in using multiple cores to improve
performance. On the other hand, the processor topology has an impact: the authors had to bind the
threads to cores sharing the same L2 cache in order to improve performance. Not fulfilling this
requirement results in a significant performance penalty.
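The source of the multiprocessing overhead can be illustrated with a minimal fork-join sketch using POSIX threads (an illustrative assumption; the paper does not prescribe a threading API). Every parallel evaluation pays for spawning threads, distributing data slices, and a final synchronization:

```c
#include <pthread.h>

#define NTHREADS 4

/* One slice of the input, handed to a worker thread. */
struct slice { const int *data; int len; long sum; };

static void *partial_sum(void *arg)
{
    struct slice *s = (struct slice *)arg;
    s->sum = 0;
    for (int i = 0; i < s->len; ++i)
        s->sum += s->data[i];
    return NULL;
}

/* Fork-join sum over n ints: spawn, distribute, synchronize. */
long parallel_sum(const int *data, int n)
{
    pthread_t tid[NTHREADS];
    struct slice sl[NTHREADS];
    int chunk = n / NTHREADS;

    for (int t = 0; t < NTHREADS; ++t) {
        sl[t].data = data + t * chunk;
        sl[t].len  = (t == NTHREADS - 1) ? n - t * chunk : chunk;
        pthread_create(&tid[t], NULL, partial_sum, &sl[t]);  /* spawn */
    }
    long total = 0;
    for (int t = 0; t < NTHREADS; ++t) {
        pthread_join(tid[t], NULL);      /* synchronize at the end */
        total += sl[t].sum;
    }
    return total;
}
```

The spawn and join calls are exactly the fixed cost that SIMD instructions avoid; binding each thread to a chosen core, as [16] required, would need an additional platform-specific affinity call on top of this.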
SIMD instructions operate on a dataset, or a so-called vector. For example, a traditional add
would perform a := a + b. The SIMD version of this instruction would perform the same operation,
but a would be a vector. Typical vector sizes of SIMD units range from 2 to 8 elements. Normally,
vector units have registers of a fixed size; depending on the size of the data type, they can
process a certain number of elements in one step. Vector units are not designed to operate
horizontally, which would mean combining elements within a vector register. We will concentrate
on the common SIMD extensions for the x86/x64 architecture. There are two extensions in this
area, SSE and AVX; both exist in different versions, with each new version extending the previous
one by adding new computing capabilities [17].
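As a concrete illustration of a := a + b on a vector, the following sketch uses SSE intrinsics (it assumes an x86/x64 compiler; function names are illustrative, not from the paper). One instruction adds four floats at once, and the SSE shuffle takes an immediate value indexing up to four elements:

```c
#include <immintrin.h>  /* SSE intrinsics; assumes an x86/x64 target */

/* out[i] = a[i] + b[i] for four floats in a single vector add */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}

/* Reverse four floats via a shuffle with a compile-time immediate:
 * _MM_SHUFFLE selects source lanes 3,2,1,0 in reversed order. */
void reverse4(const float *a, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    _mm_storeu_ps(out, _mm_shuffle_ps(va, va, _MM_SHUFFLE(0, 1, 2, 3)));
}
```

Note that both operations act lane-wise or by lane selection; there is no horizontal combination of elements within a register, matching the limitation discussed above.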
AVX doubled the vector size compared to SSE. Yet, in terms of data shuffling, the situation became
much more complex: vector registers and operations are split into lanes, so one AVX register
consists of two 128-bit lanes, which simplified implementing the architecture for the designers.
For vector operations like additions, this makes no difference. Nevertheless, for instance, the
SSE shuffle operation takes an immediate value that allows indexing of up to four elements. The
corresponding AVX
- Title: Proceedings
- Subtitle: OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
- Editors: Peter M. Roth, Kurt Niel
- Publisher: Verlag der Technischen Universität Graz
- Place: Wels
- Date: 2017
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-85125-527-0
- Dimensions: 21.0 x 29.7 cm
- Pages: 248
- Keywords: conference proceedings
- Categories: international, conference proceedings