Page 134 in Joint Austrian Computer Vision and Robotics Workshop 2020
40, which is usually more than enough to capture
the object from all different angles.
5. If the object of interest is deformable, the capturing
process is paused so that a deformed state can be
captured, and then restarted. Images of the segmented
objects are then stored on a hard drive for synthesizing
training data.
Figure 2 shows an illustration of the mask acquisition
process.
3.3. Synthesizing training data
Figure 3. Examples of synthetic images that are used for
training the YOLOv3 network
In order to synthesize the training images we used
a combination of Poisson image cloning [11] and
pure pasting of the segmented objects onto different
background images. As backgrounds for the synthetic
images we used the Indoor Scene Recognition
dataset [12] and the Describable Textures Dataset
(DTD) [1]. We used ten different objects for the
evaluation and generated 2500 synthetic images per
object. To handle the blur that appears while the
objects are moving, we artificially blurred 20% of the
images by adding horizontal motion blur of between 5
and 15 pixels to the objects. As objects move closer to
or further away from the camera their relative size
changes, so we introduce artificial scaling of the
object, uniformly distributed between 50% and 125% of
its original size. In order to tackle the occlusion
problem, small patches of textures from the DTD dataset
are placed randomly on 10% of the synthetic images.
These cover between 0% and 50% of the object
surface. Additionally, we introduce multiple objects to
the image and allow them to occlude each other up to a
maximum IoU (Intersection over Union) of 40%.
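The augmentation ranges above can be summarized in a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`sample_augmentation`, `iou`, `place_objects`, `horizontal_motion_blur`) are hypothetical, IoU is assumed to be measured on axis-aligned bounding boxes, the motion blur is approximated by a simple horizontal box filter on a grey-value image, and placement is done by rejection sampling. The Poisson cloning step itself is not shown.

```python
import random


def sample_augmentation(rng):
    """Draw one set of augmentation parameters from the ranges quoted in the text."""
    return {
        # 20% of the images get horizontal motion blur of 5 to 15 pixels
        "blur_px": rng.randint(5, 15) if rng.random() < 0.20 else 0,
        # object scale uniformly distributed between 50% and 125%
        "scale": rng.uniform(0.50, 1.25),
        # 10% of the images get texture patches covering 0% to 50% of the object
        "occlusion_frac": rng.uniform(0.0, 0.50) if rng.random() < 0.10 else 0.0,
    }


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def place_objects(sizes, canvas, rng, max_iou=0.40, tries=100):
    """Place (w, h) boxes at random positions on a (width, height) canvas.

    A candidate position is rejected if it overlaps any earlier box by more
    than max_iou; an object is skipped if no position is found in `tries` draws.
    """
    placed = []
    for w, h in sizes:
        for _ in range(tries):
            x = rng.randint(0, canvas[0] - w)
            y = rng.randint(0, canvas[1] - h)
            box = (x, y, x + w, y + h)
            if all(iou(box, other) <= max_iou for other in placed):
                placed.append(box)
                break
    return placed


def horizontal_motion_blur(rows, length):
    """Average each pixel with its horizontal neighbours in a window of `length` pixels.

    `rows` is a list of rows of grey values; the window is clipped at the borders.
    """
    if length <= 1:
        return [list(row) for row in rows]
    out = []
    for row in rows:
        n = len(row)
        blurred = []
        for x in range(n):
            lo = max(0, x - length // 2)
            hi = min(n, x - length // 2 + length)
            blurred.append(sum(row[lo:hi]) / (hi - lo))
        out.append(blurred)
    return out
```

A seeded generator such as `random.Random(0)` can be passed as `rng` so that a generated set of synthetic images is reproducible. Rejection sampling is used here only because it is the simplest way to enforce the 40% overlap limit; the paper does not state how the limit was enforced.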
Figure 3 shows examples of the synthetic images
that are used for training the YOLOv3 network.
4. Evaluation
To evaluate the method, we trained the CNN-based
object detector YOLOv3 using the synthetic images.
A total of ten objects were used, which differ greatly
in their shape and deformability. We already know
that YOLO performs very well when facing rigid
objects. Therefore our aim was to explore to what
extent the shape of an object can be deformed while
the object is still detected. As examples of rigid
objects we use a can, two different tea boxes, and a
lemon juice bottle. The slightly deformable objects are
headphones, scissors and a human hand model. The
extremely deformable objects that we used are
earphones, a power cable and a piece of chain.
The properties of the objects used for evaluation and
their detection precision are presented in Table 1.
Figure 4. Successful and unsuccessful cases of detection
of different deformable objects
Figure 5. Chain detection success cases
Figure 6. Chain detection failure cases
In order to evaluate the precision of the proposed
method, two-minute videos of each object being
manipulated were filmed and every 20th frame extracted
- Title: Joint Austrian Computer Vision and Robotics Workshop 2020
- Publisher: Graz University of Technology
- Location: Graz
- Date: 2020
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-85125-752-6
- Dimensions: 21.0 x 29.7 cm
- Pages: 188
- Categories: Computer Science, Technology