Joint Austrian Computer Vision and Robotics Workshop 2020
Page 133
jects like chains or cables. Most of the previous work on object detection focuses on detecting rigid objects [3, 15, 6, 2]. Our goal is to extend this research to deformable objects as well. We train a CNN-based object detector to detect both rigid and deformable objects. This task requires a large amount of training images. Obtaining this data manually is time-consuming; we therefore propose a method for synthesizing the training data, which includes an RGB-based segmentation procedure that is able to handle deformable objects. We then use publicly available datasets as backgrounds for the synthetic images and apply augmentation techniques to increase the variability of the dataset.

Figure 2. Illustration of the mask-acquisition process. The top left image shows the original RGB image. The top right image shows the result of applying the k-means method to the original RGB image. The bottom left image shows the automatically selected contour and the area inside it colored in green. The bottom right image shows the final extracted object masks.

3.1. Data acquisition

Publicly available datasets which contain annotated objects are suitable for training a CNN to detect object classes. However, when it comes to detecting specific objects, a specialized dataset is required. We synthesize a dataset by capturing images of the objects and develop a method to segment them from the flat surface on top of which they were placed.

For recording the objects, a Microsoft Kinect camera mounted on a tripod is used. The camera is placed approximately 30 cm above the flat surface, facing the object at an angle of approximately 45 degrees. During the recording, both the camera and the flat surface are stationary. The flat surface should preferably be unicolored so that the object is clearly distinguishable from it. After the recording was initiated, the object was manipulated by hand in order to make it face the camera from all possible viewing angles.
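The recording procedure above can be sketched as a simple capture loop; this is only an illustration, not the authors' implementation. The callable `grab_frame` is a hypothetical stand-in for the Kinect's RGB stream, and the frame count and interval are assumed parameters:

```python
import time

def record_projections(grab_frame, n_frames=60, interval=0.5):
    """Record a fixed number of object views at a regular time
    interval while the object is turned by hand in front of the
    stationary camera. `grab_frame` is a hypothetical callable
    standing in for the Kinect's RGB stream."""
    frames = []
    for _ in range(n_frames):
        frames.append(grab_frame())  # one RGB projection of the object
        time.sleep(interval)         # wait before capturing the next view
    return frames

# Usage with a dummy frame source (no camera attached):
views = record_projections(lambda: "rgb-frame", n_frames=5, interval=0.0)
```

In practice, each returned frame would be handed to the segmentation pipeline described in the data-processing section.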
The point is to capture the object from as many unique perspectives as possible. The advantage of this method is that it is able to capture deformable objects by simply changing their shape while they are being recorded.

3.2. Data processing

In order to synthesize the images needed for training the network, object masks are required. Obtaining the masks of the object is possible by manually segmenting the object from the background or by using a segmentation method. Manually segmenting objects is inefficient; we therefore devise a simple method for object segmentation that is used for both rigid and deformable objects. For the segmentation of the object from the background, a combination of computer-vision-based methods is used. It consists of the following five steps:

1. Firstly, k-means clustering is applied to the image with a k value of 2. This method is successful at distinguishing the boundaries of interest. Additionally, it is computationally more efficient than the possible alternative of using Otsu's thresholding.

2. After the application of k-means, morphological operations such as image closing and erosion are applied to the image in order to connect possible discontinuities in the border of the object.

3. Next, contour detection is applied to the whole image, and the locations of the centers of gravity of the areas inside the detected contours are determined. A red circle is drawn on the image coming from the Kinect camera, which is shown on the screen, in which the center of the object should be placed in order to automatically start the capturing process.

4. The algorithm then determines whether the contour satisfies conditions in terms of its length and distance from the center of the image and, if that is the case, the recording is started. After the capturing process is initiated, a predetermined number of object projections is recorded at a regular time interval or per keyboard command. The number of projections recorded is
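The first segmentation step (2-means clustering on pixel colors) can be sketched as follows. This is a minimal self-contained illustration using a plain NumPy k-means, not the authors' code; the choice of seeding the two centers at the darkest and brightest pixel is an assumption made for the example:

```python
import numpy as np

def kmeans_mask(image, iters=10):
    """Segment an RGB image into background/foreground with 2-means
    clustering on pixel colors (step 1 of the pipeline sketch)."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Seed the two cluster centers at the darkest and brightest pixel
    # (an assumption for this sketch; k-means++ would also work).
    brightness = pixels.sum(axis=1)
    centres = np.stack([pixels[brightness.argmin()],
                        pixels[brightness.argmax()]])
    for _ in range(iters):
        # Assign every pixel to its nearest center.
        d = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its cluster.
        for k in range(2):
            if (labels == k).any():
                centres[k] = pixels[labels == k].mean(axis=0)
    # Cluster 1 (seeded at the brightest pixel) is taken as foreground.
    return labels.reshape(image.shape[:2]) == 1

# Synthetic test image: a bright square "object" on a dark surface.
img = np.zeros((40, 40, 3), dtype=np.uint8)
img[10:30, 10:30] = 200
mask = kmeans_mask(img)
```

The resulting binary mask would then be cleaned up with morphological closing and erosion before contour detection, as in steps 2 and 3.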
Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik, Technik