Figure 2. Sequence of recorded RGB images. The sequence starts in the top left with the full stack of objects and we record an RGB image after each object removal. We also record the corresponding depth image for every RGB frame.
proposal g is defined as

g = {x, y, θ, w, h},    (1)

where x and y describe the center of the grasp proposal, θ describes the angle of the rotated bounding box, and w and h describe the width and height of the predicted box.
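For illustration, the five grasp parameters could be grouped in a small container such as the following sketch; the class name and the pixel/radian units are our assumptions and are not stated in the paper:

```python
from dataclasses import dataclass

@dataclass
class GraspProposal:
    """Hypothetical container for the grasp parameterization g = {x, y, θ, w, h}."""
    x: float      # x-coordinate of the grasp center (pixels, assumed)
    y: float      # y-coordinate of the grasp center (pixels, assumed)
    theta: float  # rotation angle of the bounding box (radians, assumed)
    w: float      # box width, set manually to match the gripper
    h: float      # box height, derived from the object mask
```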
Initial depth segmentation. The main focus of our algorithm is to detect depth changes in the scene after a successful grasp was performed by a human expert. Therefore, we calculate the depth difference I∗ of two consecutive depth images as

I∗ = |I1 − I2|,    (2)

where I1 and I2 are the depth images previously normalized between 0 and 255. The output I∗ is a rough estimate of the segmentation mask of the removed object.
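A minimal sketch of this step might look as follows; the min–max normalization to [0, 255] is our assumption, since the paper does not specify how the normalization is performed:

```python
import numpy as np

def depth_difference(depth1, depth2):
    """Rough segmentation of the removed object via Eq. (2).

    depth1, depth2: two consecutive depth frames (any numeric dtype).
    Both frames are first normalized to [0, 255] as described in the text.
    """
    def normalize(d):
        d = d.astype(np.float32)
        return 255.0 * (d - d.min()) / (d.max() - d.min() + 1e-8)

    i1, i2 = normalize(depth1), normalize(depth2)
    return np.abs(i1 - i2)  # I* = |I1 - I2|
```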
Segmentation mask refinement. The intermediate segmentation is coarse and contains noise, mainly due to inaccurate sensor values and small movements of the objects. Therefore, further refinement of the segmentation mask is needed. We apply binary image morphology to remove the majority of noise and smooth the mask edges. A Gaussian filter is then applied for further noise reduction and to create the refined mask which is used for further processing. The Gaussian filter g_filter is defined as

g_filter(x, y, σ) = 1 / (2πσ²) · exp(−(x² + y²) / (2σ²)),    (3)

where x and y are the spatial dimensions of the intermediate mask I∗, and σ is the standard deviation of the Gaussian kernel. In our experiments, we set σ = 1, which means that it is equal for both axes.
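A possible implementation of the refinement step is sketched below; the concrete morphological operations (opening followed by closing) and the binarization thresholds are our assumptions, as the paper only states that binary morphology and a Gaussian filter with σ = 1 are used:

```python
import numpy as np
from scipy import ndimage

def refine_mask(i_star, diff_threshold=20, sigma=1.0):
    """Refine the coarse difference image I* into a clean binary mask.

    diff_threshold is an assumed binarization level for the depth
    difference; it is not specified in the paper.
    """
    mask = i_star > diff_threshold        # coarse binary mask from I*
    mask = ndimage.binary_opening(mask)   # remove small speckle noise
    mask = ndimage.binary_closing(mask)   # fill small holes in the mask
    # Gaussian smoothing (Eq. 3) followed by re-binarization of the result.
    smooth = ndimage.gaussian_filter(mask.astype(np.float32), sigma=sigma)
    return smooth > 0.5
```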
Automatic grasping point annotation. The refined segmentation mask is then used to calculate geometric features of the object. The skeleton of the object mask is calculated by using [8] to remove border pixels as long as the connectivity does not break. The resulting skeleton of the object is approximated with a line segment, which makes it more robust to outliers. Each point on this line segment can then be used as a possible center of a grasp proposal. The height h and the rotation angle θ of a grasp proposal are determined by calculating the intersection between a line, which is normal to the skeleton and passes through the center of a grasp proposal, and the edges of the mask. The bounding box width w is directly dependent on the used gripper and we set this parameter manually to suit our robotic gripper. All this information is then combined to generate the final grasping proposals (a code sketch of this procedure follows the list below). The proposals have certain characteristics:
1. The center of a bounding box is located at the spine of the object.
2. The height of each bounding box is bounded by the edges of the object mask.
3. The width of the bounding boxes can be set manually, because this parameter highly depends on the gripper characteristics.
4. The majority of the grasp proposals are generated near the center of mass, which is based on the assumption that these points more likely lead to a successful grasp.
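The following sketch outlines this annotation step under several assumptions on our part: skimage's skeletonize stands in for the thinning method cited as [8], the line segment is obtained from a principal-component (SVD) fit of the skeleton pixels, and the height h is measured by stepping along the normal until the normal line leaves the mask:

```python
import numpy as np
from skimage.morphology import skeletonize

def grasp_proposals(mask, gripper_width, num_proposals=5):
    """Generate (x, y, theta, w, h) grasp proposals from a binary object mask."""
    # Skeletonize the mask; skeletonize() stands in for the thinning in [8].
    skel_y, skel_x = np.nonzero(skeletonize(mask))
    pts = np.column_stack([skel_x, skel_y]).astype(np.float64)
    mean = pts.mean(axis=0)

    # Approximate the skeleton with a line segment via its principal direction;
    # this is more robust to outlier skeleton pixels than using them directly.
    _, _, vt = np.linalg.svd(pts - mean)
    direction, normal = vt[0], vt[1]
    theta = np.arctan2(direction[1], direction[0])

    # Sample candidate centers near the middle of the segment, matching the
    # observation that proposals cluster around the center of mass.
    proj = (pts - mean) @ direction
    ts = np.linspace(proj.min(), proj.max(), num_proposals + 2)[1:-1]

    proposals = []
    for t in ts:
        cx, cy = mean + t * direction
        # Step along the normal in both directions until leaving the mask;
        # the total distance approximates the intersection-based height h.
        h = 0
        for sign in (1.0, -1.0):
            step = 0
            while True:
                px = int(round(cx + sign * step * normal[0]))
                py = int(round(cy + sign * step * normal[1]))
                inside = 0 <= py < mask.shape[0] and 0 <= px < mask.shape[1]
                if not inside or not mask[py, px]:
                    break
                step += 1
            h += step
        proposals.append((cx, cy, theta, gripper_width, h))
    return proposals
```

The bounding box width w is passed in as gripper_width, mirroring the manually set gripper-dependent parameter described above.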
Results. Our automatic annotation pipeline allows us to generate a high number of grasping labels without any supervision by human annotators. Furthermore, because the data is recorded while an expert performs the grasping, we implicitly have supervision about which object should be removed from