Page - 14 - in Joint Austrian Computer Vision and Robotics Workshop 2020

$I \in \mathbb{R}^{rows \times cols \times channels}$ (1)

While research improved object recognition and classification with deep neural nets [15], parallel efforts focused on template matching for object recognition [2][5]. Template matching uses extracted example images to find objects in new images. This method often involves sliding-window based algorithms [7], which find the template in a rectangular subpart of the image. Template matching works well for frontal images, but fails if the viewpoint differs from the actual template [4]. The simplicity of this technique still inspired new research, which is why its performance has improved significantly over the last 10 years [6][11].

2.1. Pose Estimation

Building on top of the recognized object, it is possible to estimate the pose of the object relative to the camera. This process is called pose estimation, and it consists of three general categories. In the first category, the object's pose is stored alongside its feature vectors. Consequently, each observed orientation represents a separate detection, which results in automatically knowing the object's pose if the object is matched with a previously trained one.

The second category uses statistical techniques to align two given RGB-D images with each other. For this, Iterative Closest Point (ICP) [3] is the most commonly used algorithm, and many variants exist for different applications [12][14].

The third category tries to combine the pose estimation step with the recognition process itself. This makes sense since, as stated earlier, a different viewpoint can change the appearance of an object entirely. This category has been covered by recent research due to the emerging field of machine learning [18].

Unfortunately, all of the aforementioned methods need either vast amounts of training data or an accurate model of the object that has to be detected. In this work, a different approach is taken: principal component analysis (PCA) [1] is used for estimating the pose of a known object.
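The sliding-window template matching described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `match_template_ssd` is a hypothetical helper that scores every window of the image against the template with the sum of squared differences (SSD) and returns the best-matching location.

```python
import numpy as np

def match_template_ssd(image, template):
    """Locate `template` in `image` by exhaustive sliding-window search.
    Each window is scored with the sum of squared differences (SSD);
    returns the (row, col) of the best-matching top-left corner."""
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = np.sum((window - template) ** 2)
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Usage: embed a bright patch at a known offset and recover it.
img = np.zeros((40, 40))
img[10:15, 20:25] = 1.0
tmpl = img[10:15, 20:25].copy()
print(match_template_ssd(img, tmpl))  # → (10, 20)
```

The brute-force double loop makes the viewpoint limitation noted in the text concrete: the score only rewards pixel-wise agreement, so any rotation or perspective change of the object degrades the match.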
PCA's intended purpose is to extract principal components and reduce dimensionality between the input and the output space. Using PCA to estimate the pose of the object, the input needed by the algorithm can be reduced to only the template. The proposed process of pose estimation with PCA is shown in the next chapter.

Figure 1. Visualization of the grasping point. The figure shows the object that has to be grasped. The orange coordinate system shows the center of the grasping area.

3. METHODS

The objective of the proposed approach is the estimation of the pose of a known object. Before the pose of an object can be calculated, it first has to be located in an image. For this task, template matching was chosen due to its ease of implementation and use. After the object has been recognized in the depth image, principal component analysis is used to determine the orientation of the found subpart of the image in 3D space.

Figure 1 shows the target object of this work. The pose of the shape in the "grabbing area" has to be calculated so that it can be successfully grasped. For this, the normal vector of the surface facing the camera has to be found. Through the orientation of the vectors, the rotational components of the 6D pose can be determined. This task can be solved by computing the PCA for the points in the grabbing area. In this case, the principal component analysis yields 3 eigenvectors with their respective eigenvalues for the given 3D points. As can be seen by studying Figure 1, 2 of the 3 dimensions of the shape in the grabbing area differ from the other. The spans of values in the X and Y directions are comparatively large with respect to the depth dimension Z. This also applies to the respective variances. Using prior knowledge, the normal vector of the plane parallel to the camera origin (i.e. corresponding to the surface of the marked grabbing area) can be estimated using the eigenvec-
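The PCA step described in this section — an eigendecomposition of the grabbing-area points, with the smallest-variance direction taken as the surface normal — can be sketched as follows. This is a minimal sketch under the assumption that the grabbing-area points are available as an N×3 NumPy array; the function name is illustrative, not from the paper.

```python
import numpy as np

def estimate_surface_normal(points):
    """Estimate the normal of the dominant plane through a set of 3D
    points via PCA: the eigenvector of the covariance matrix with the
    smallest eigenvalue is the direction of least variance, i.e. the
    plane normal."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = np.cov(centered.T)                # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]                  # smallest-variance direction
    return normal / np.linalg.norm(normal)

# Usage: points sampled from the z = 0 plane with tiny depth noise,
# mimicking the flat grabbing-area surface facing the camera.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-1, 1, 200),
                       rng.uniform(-1, 1, 200),
                       rng.normal(0.0, 1e-3, 200)])
n = estimate_surface_normal(pts)
print(np.abs(n))  # close to [0, 0, 1]: the z-axis is the plane normal
```

This mirrors the reasoning in the text: the X and Y spans (and variances) of the grabbing area dominate, so the two largest eigenvectors span the surface while the third, with the smallest eigenvalue, points along the depth axis Z.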
Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik, Technik