Figure 2. Sequence of recorded RGB images. The sequence starts in the top left with the full stack of objects and we record an RGB image after each object removal. We also record the corresponding depth image for every RGB frame.
proposal g is defined as

g = {x, y, θ, w, h},    (1)

where x and y describe the center of the grasp proposal, θ describes the angle of the rotated bounding box, and w and h describe the width and height of the predicted box.
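For illustration, the five grasp parameters could be grouped in a small container such as the following sketch; the class name and the pixel/radian units are our assumptions and are not stated in the paper:

```python
from dataclasses import dataclass

@dataclass
class GraspProposal:
    """Hypothetical container for the grasp parameterization g = {x, y, θ, w, h}."""
    x: float      # x-coordinate of the grasp center (pixels, assumed)
    y: float      # y-coordinate of the grasp center (pixels, assumed)
    theta: float  # rotation angle of the bounding box (radians, assumed)
    w: float      # box width, set manually to match the gripper
    h: float      # box height, derived from the object mask
```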
Initial depth segmentation. The main focus of our algorithm is to detect depth changes in the scene after a successful grasp was performed by a human expert. Therefore, we calculate the depth difference I∗ of two consecutive depth images as

I∗ = |I1 − I2|,    (2)

where I1 and I2 are the depth images previously normalized between 0 and 255. The output I∗ is a rough estimate of the segmentation mask of the removed object.
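A minimal sketch of this step might look as follows; the min–max normalization to [0, 255] is our assumption, since the paper does not specify how the normalization is performed:

```python
import numpy as np

def depth_difference(depth1, depth2):
    """Rough segmentation of the removed object via Eq. (2).

    depth1, depth2: two consecutive depth frames (any numeric dtype).
    Both frames are first normalized to [0, 255] as described in the text.
    """
    def normalize(d):
        d = d.astype(np.float32)
        return 255.0 * (d - d.min()) / (d.max() - d.min() + 1e-8)

    i1, i2 = normalize(depth1), normalize(depth2)
    return np.abs(i1 - i2)  # I* = |I1 - I2|
```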
Segmentation mask refinement. The intermediate segmentation is coarse and contains noise, mainly due to inaccurate sensor values and small movements of the objects. Therefore, further refinement of the segmentation mask is needed. We apply binary image morphology to remove the majority of noise and smooth the mask edges. A Gaussian filter is then applied for further noise reduction and to create the refined mask which is used for further processing. The Gaussian filter g_filter is defined as

g_filter(x, y, σ) = 1 / (2πσ²) · exp(−(x² + y²) / (2σ²)),    (3)

where x and y are the spatial dimensions of the intermediate mask I∗, and σ is the standard deviation of the Gaussian kernel. In our experiments, we set σ = 1, which means that it is equal for both axes.
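A possible implementation of the refinement step is sketched below; the concrete morphological operations (opening followed by closing) and the binarization thresholds are our assumptions, as the paper only states that binary morphology and a Gaussian filter with σ = 1 are used:

```python
import numpy as np
from scipy import ndimage

def refine_mask(i_star, diff_threshold=20, sigma=1.0):
    """Refine the coarse difference image I* into a clean binary mask.

    diff_threshold is an assumed binarization level for the depth
    difference; it is not specified in the paper.
    """
    mask = i_star > diff_threshold        # coarse binary mask from I*
    mask = ndimage.binary_opening(mask)   # remove small speckle noise
    mask = ndimage.binary_closing(mask)   # fill small holes in the mask
    # Gaussian smoothing (Eq. 3) followed by re-binarization of the result.
    smooth = ndimage.gaussian_filter(mask.astype(np.float32), sigma=sigma)
    return smooth > 0.5
```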
Automatic grasping point annotation. The refined segmentation mask is then used to calculate geometric features of the object. The skeleton of the object mask is calculated by using [8] to remove border pixels as long as the connectivity does not break. The resulting skeleton of the object is approximated with a line segment, which makes it more robust to outliers. Each point on this line segment can then be used as a possible center of a grasp proposal. The height h and the rotation angle θ of a grasp proposal are determined by calculating the intersection between a line, which is normal to the skeleton and passes through the center of a grasp proposal, and the edges of the mask. The bounding box width w is directly dependent on the used gripper and we set this parameter manually to suit our robotic gripper. All this information is then combined to generate the final grasping proposals (a code sketch of this procedure follows the list below). The proposals have certain characteristics:
1. The center of a bounding box is located at the spine of the object.
2. The height of each bounding box is bounded by the edges of the object mask.
3. The width of the bounding boxes can be set manually, because this parameter highly depends on the gripper characteristics.
4. The majority of the grasp proposals are generated near the center of mass, which is based on the assumption that these points more likely lead to a successful grasp.
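The following sketch outlines this annotation step under several assumptions on our part: skimage's skeletonize stands in for the thinning method cited as [8], the line segment is obtained from a principal-component (SVD) fit of the skeleton pixels, and the height h is measured by stepping along the normal until the normal line leaves the mask:

```python
import numpy as np
from skimage.morphology import skeletonize

def grasp_proposals(mask, gripper_width, num_proposals=5):
    """Generate (x, y, theta, w, h) grasp proposals from a binary object mask."""
    # Skeletonize the mask; skeletonize() stands in for the thinning in [8].
    skel_y, skel_x = np.nonzero(skeletonize(mask))
    pts = np.column_stack([skel_x, skel_y]).astype(np.float64)
    mean = pts.mean(axis=0)

    # Approximate the skeleton with a line segment via its principal direction;
    # this is more robust to outlier skeleton pixels than using them directly.
    _, _, vt = np.linalg.svd(pts - mean)
    direction, normal = vt[0], vt[1]
    theta = np.arctan2(direction[1], direction[0])

    # Sample candidate centers near the middle of the segment, matching the
    # observation that proposals cluster around the center of mass.
    proj = (pts - mean) @ direction
    ts = np.linspace(proj.min(), proj.max(), num_proposals + 2)[1:-1]

    proposals = []
    for t in ts:
        cx, cy = mean + t * direction
        # Step along the normal in both directions until leaving the mask;
        # the total distance approximates the intersection-based height h.
        h = 0
        for sign in (1.0, -1.0):
            step = 0
            while True:
                px = int(round(cx + sign * step * normal[0]))
                py = int(round(cy + sign * step * normal[1]))
                inside = 0 <= py < mask.shape[0] and 0 <= px < mask.shape[1]
                if not inside or not mask[py, px]:
                    break
                step += 1
            h += step
        proposals.append((cx, cy, theta, gripper_width, h))
    return proposals
```

The bounding box width w is passed in as gripper_width, mirroring the manually set gripper-dependent parameter described above.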
Results. Our automatic annotation pipeline allows us to generate a high number of grasping labels without any supervision by human annotators. Furthermore, because the data is recorded while an expert performs the grasping, we implicitly have supervision about which object should be removed from