Joint Austrian Computer Vision and Robotics Workshop 2020
Page 9
ror (sum of squared distances), P3P based on [7], which requires only four of the six point pairs, and EPnP, mentioned earlier. All three variants were tried and tested. The estimated rotation and translation vectors, after using Rodrigues to transform the rotation vector into the rotation matrix $R$, form $T_{camera}^{logo}$ and therefore finally $T_{target,t}^{base}$. As Siegwart et al. [31, p. 81ff] write, the desired velocity can then easily be generated using the estimated parameters $k_\rho$ and $k_\alpha$ for a linear controller.

The entire pipeline can be quickly summarized as follows:

1. Create the visual target with arbitrary logos. Physically measure the logo corners and their position in relation to $T_{logo}$. Relate $T_{logo}$ to $T_{target}$ and, on the robot, $T_{camera}$ to $T_{base}$.
2. To avoid bias in data collection, implement a random walk in the vicinity of the logos, but constrain $\theta$ so that the camera faces the logos most of the time. Annotate bounding-box coordinates of the logos for selected images.
3. Train the Faster-RCNN object detector with this dataset. For docking, load the model and obtain bounding boxes using ROS image callbacks.
4. Use the inferred coordinates together with the measurements and the intrinsic parameters of the camera in SolvePnP to obtain $T_{target,t}^{base}$ at every timestep $t$ (sketched in code below).
5. Use a simple linear controller to generate ROS motion-control commands that guide the robot towards the docking target.

4. RESULTS AND DISCUSSION

During inference, processing a single image within the ROS pipeline takes the detector approximately 35 ms. On average, detection results in five bounding-box proposals; sorting by confidence and extracting the top three boxes gives six image coordinates close to the ground truth, typically within one to four pixels. Evaluating the mIoU gives 96.3% for thirteen test images. Object-detection results are therefore both accurate and confident. The PnP solver, the second major component of the framework, proved to be more troublesome, producing inaccurate results. All three implementations of the solvePnP algorithm express the translation vector $\vec{t}_{camera}^{\,logo}$ in the right-handed coordinate system $K_{logo}$. Preliminary results quickly showed that all methods are accurate in estimating the $y$ and $z$ translation, but struggle with the $x$ coordinate. To get a better understanding of the pose estimates, in particular the estimated translation vector, an extensive field study was conducted. The robot was steered towards six points, and the ground-truth translation and rotation were noted. These poses are described by $K_{idx}$ in Fig. 2, where idx ∈ {dock, amcl, left close, left far, right close, right far}. At each point, fifteen images were captured and supplied to the Faster-RCNN model, and the image coordinates obtained from the bounding boxes, specifically six points per image, were given to the PnP solvers. After the first results were analyzed, again showing large errors in $x$, changes were made in the hope of achieving more accurate pose results. In particular, the following major changes were made:

1. The upper-right logo was raised from the plastic board to remove the coplanarity of all six points. By removing the coplanarity, more information is available for estimating the camera pose [9].
2. Since solvePnP, unlike regular DLT, does not estimate intrinsics, they are a possible cause of error. The camera was recalibrated and the new parameters were used. The focal lengths and distortion coefficients differed slightly.
3. The autofocus of the camera was turned off. Captured images were nonetheless still sharp and the logos clearly visible.
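The following is a minimal sketch of the SolvePnP step exercised by this study, assuming OpenCV's Python bindings. All corner coordinates, pixel coordinates, and intrinsics are hypothetical placeholders, not values from the paper; the last two object points are raised off the $z = 0$ plane, mirroring change 1 above.

```python
import cv2
import numpy as np

# Six logo corners in the logo frame K_logo, in metres (hypothetical values).
# The last two points are raised off the board so the six points are not coplanar.
object_points = np.array([
    [0.00, 0.00, 0.00], [0.12, 0.00, 0.00],
    [0.00, 0.15, 0.00], [0.12, 0.15, 0.00],
    [0.00, 0.30, 0.04], [0.12, 0.30, 0.04],
], dtype=np.float64)

# Matching pixel coordinates derived from the top-three Faster-RCNN
# bounding boxes (hypothetical values).
image_points = np.array([
    [412.0, 410.0], [521.0, 411.0],
    [413.0, 300.0], [522.0, 301.0],
    [415.0, 190.0], [523.0, 191.0],
], dtype=np.float64)

K = np.array([[800.0,   0.0, 320.0],      # camera intrinsics from calibration
              [  0.0, 800.0, 240.0],      # (hypothetical values)
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)                        # distortion coefficients

# The iterative and EPnP variants accept all six correspondences.
for flag in (cv2.SOLVEPNP_ITERATIVE, cv2.SOLVEPNP_EPNP):
    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist,
                                  flags=flag)
    if ok:
        R, _ = cv2.Rodrigues(rvec)        # rotation vector -> rotation matrix R

# P3P requires exactly four point pairs, so only four of the six are used.
ok, rvec, tvec = cv2.solvePnP(object_points[:4], image_points[:4], K, dist,
                              flags=cv2.SOLVEPNP_P3P)
```

In OpenCV's convention, rvec and tvec map logo-frame points into the camera frame; inverting this pose and chaining it with the measured logo-to-target and camera-to-base transforms then yields $T_{target,t}^{base}$.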
Afterwards, the same study was undertaken, capturing sequences of fifteen images at six locations and using the detector followed by solvePnP to obtain camera-pose estimates again. The translation vectors were then saved and subsequently plotted to give a visual representation of the results. Figure 2 shows the results of this experiment in a 3D plot. The most notable observation is that the iterative algorithm flipped the signs on all three axes in almost all estimates. Its results are therefore point-symmetric about the origin, a known issue when using solvePnP. Also of note is that the large error in the $x$ direction still persists. This error occurs throughout all experiments and is not intuitive; the estimates in $x$ are strangely placed. All points lie on the negative (right) half-plane (with the exception of some iterative estimates), but the estimates for the locations with ground truth in the
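The sign-flipped, point-symmetric solutions described above can in principle be screened with a simple cheirality check: a valid pose must place the logo points in front of the camera, i.e. at positive depth in the camera frame. The sketch below is our own illustration, not a step reported in the paper; it reuses the hypothetical object_points, image_points, K, and dist from the previous listing.

```python
import cv2
import numpy as np

def points_in_front(rvec, tvec, object_points):
    """Cheirality check: True if every object point has positive depth
    once transformed into the camera frame by (rvec, tvec)."""
    R, _ = cv2.Rodrigues(rvec)
    pts_cam = object_points @ R.T + tvec.reshape(1, 3)
    return bool(np.all(pts_cam[:, 2] > 0.0))

# Estimates whose flipped translation puts the target behind the
# camera fail the check and can be discarded.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist,
                              flags=cv2.SOLVEPNP_ITERATIVE)
if ok and not points_in_front(rvec, tvec, object_points):
    pass  # reject this estimate or fall back to another solver flag
```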