Page - 134 - in Joint Austrian Computer Vision and Robotics Workshop 2020



40, which is usually more than enough to capture the object from all different angles.

5. If the object of interest is deformable, the capturing process is paused to capture a deformed state and then restarted. Images of segmented objects are then stored on a hard drive for synthesizing training data.

Figure 2 illustrates the mask acquisition process.

3.3. Synthesizing training data

Figure 3. Examples of synthetic images that are used for training the YOLOv3 network

In order to synthesize the training images, we used a combination of Poisson image cloning [11] and pure pasting of the segmented objects onto different background images. As backgrounds for the synthetic images we used the Indoor Scene Recognition dataset [12] and the Describable Textures Dataset (DTD) [1]. We used ten different objects for the evaluation and generated 2500 synthetic images per object. To handle the blur that appears while the objects are moving, we artificially blurred 20% of the images by adding horizontal motion blur of between 5 and 15 pixels to the objects. As objects move closer to or further away from the camera, their relative size changes, so we introduce artificial scaling of the object, uniformly distributed between 50% and 125% of its original size. To tackle the occlusion problem, small patches of textures from the DTD are placed randomly on 10% of the synthetic images. These cover between 0% and 50% of the object surface. Additionally, we introduce multiple objects to the image and allow them to occlude each other up to a maximum IoU (Intersection over Union) of 40%. Figure 3 shows examples of the synthetic images that are used for training the YOLOv3 network.

4. Evaluation

To evaluate the method, we trained the CNN-based object detector YOLOv3 using the synthetic images. A total of ten objects were used, which differ greatly in their shape and deformability. We already know that YOLO performs very well when facing rigid objects.
Therefore our aim was to explore to what extent the shape of an object can be deformed. As examples of rigid objects we use a can, two different tea boxes, and a lemon juice bottle. Slightly deformable are headphones, scissors and a human hand model. The extremely deformable objects that we used are earphones, a power cable and a piece of chain.

Properties of the objects used for evaluation and their detection precision are presented in Table 1.

Figure 4. Successful and unsuccessful cases of detection of different deformable objects

Figure 5. Chain detection success cases

Figure 6. Chain detection failure cases

In order to evaluate the precision of the proposed method, two-minute videos of each object being manipulated were filmed and every 20th frame extracted
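The augmentation ranges quoted in Section 3.3 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sample_params`, `horizontal_motion_blur`, `iou`, `placement_allowed`) are hypothetical, the image is a plain 2-D grayscale list of lists, the blur is approximated by a horizontal box filter, and the 40% IoU limit is assumed to be checked on axis-aligned bounding boxes given as `(x1, y1, x2, y2)`. A real pipeline would more likely work on image arrays with an image-processing library.

```python
import random


def sample_params(rng):
    """Draw augmentation parameters for one synthetic image (ranges from the text)."""
    blur = rng.random() < 0.20      # 20% of the images receive motion blur
    occlude = rng.random() < 0.10   # 10% of the images receive texture patches
    return {
        "blur_px": rng.randint(5, 15) if blur else 0,   # blur length in pixels
        "scale": rng.uniform(0.50, 1.25),               # relative object size
        "occlusion_frac": rng.uniform(0.0, 0.5) if occlude else 0.0,
    }


def horizontal_motion_blur(image, length):
    """Box-filter each row of a 2-D image over `length` pixels.

    Assumption: border pixels average only over the neighbours that exist.
    """
    if length <= 1:
        return [list(row) for row in image]
    half = length // 2
    blurred_image = []
    for row in image:
        blurred_row = []
        for x in range(len(row)):
            window = row[max(0, x - half): x + length - half]
            blurred_row.append(sum(window) / len(window))
        blurred_image.append(blurred_row)
    return blurred_image


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def placement_allowed(new_box, placed_boxes, max_iou=0.40):
    """Accept a new object only if it overlaps every placed object by at most max_iou."""
    return all(iou(new_box, box) <= max_iou for box in placed_boxes)
```

In such a sketch, a generator loop would call `sample_params` once per synthetic image, apply the blur and scaling to the segmented object, and retry the paste position until `placement_allowed` returns `True`.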



Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik; Technik