Joint Austrian Computer Vision and Robotics Workshop 2020
ror (sum of squared distances), P3P based on [7], which requires only four of the six point pairs, and EPnP mentioned earlier. All three variants were tried and tested. The estimated rotation and translation vectors, after using Rodrigues to transform the rotation vector into the rotation matrix R, form T^logo_camera and therefore finally T^target_base,t. As Siegwart et al. [31, p. 81ff] write, the desired velocity can then easily be generated from the estimated pose using a linear controller with parameters k_ρ and k_α.
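The controller step can be sketched as follows. This is a minimal illustration of a Siegwart-style polar-coordinate controller, not the authors' exact implementation; the function name and the gain values k_rho and k_alpha are assumptions to be tuned on the robot:

```python
import numpy as np

def docking_velocities(dx, dy, theta, k_rho=0.5, k_alpha=1.5):
    """Generate (v, omega) from the estimated target pose, in the
    spirit of the linear controller in Siegwart et al. [31].
    (dx, dy): target position in the robot frame; theta: robot heading.
    k_rho, k_alpha: hypothetical gains, chosen here for illustration."""
    rho = np.hypot(dx, dy)                         # distance to target
    alpha = np.arctan2(dy, dx) - theta             # heading error
    alpha = (alpha + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    v = k_rho * rho          # forward velocity grows with distance
    omega = k_alpha * alpha  # turn rate grows with heading error
    return v, omega
```

The controller drives the distance ρ and the heading error α to zero simultaneously, which is what guides the robot onto the docking target.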
The entire pipeline can be quickly summarized as
follows:
1. Create the visual target with arbitrary logos. Physically measure the logo corners and their position in relation to T_logo. Relate T_logo to T_target and, on the robot, T_camera to T_base.
2. To avoid bias in data collection, implement a random walk in the logo vicinity but constrain θ to enable the camera to face the logo most of the time. Annotate bounding-box coordinates of the logos for selected images.
3. Train the Faster-RCNN object detector with this dataset. For docking, load the model and obtain bounding boxes using ROS image callbacks.
4. Use the inferred coordinates together with the measurements and the intrinsic parameters of the camera in solvePnP to obtain T^target_base,t at every timestep t.
5. Use a simple linear controller to generate ROS motion control commands that guide the robot towards the docking target.
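Step 4 boils down to converting the (rvec, tvec) output of solvePnP into a homogeneous transform and chaining it with the measured transforms. A minimal numpy sketch under those assumptions (the frame names follow the text; the helper names are ours, and rodrigues reimplements the forward direction of OpenCV's cv2.Rodrigues for self-containedness):

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector (axis * angle) -> 3x3 rotation matrix R,
    matching the forward direction of cv2.Rodrigues."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = np.asarray(rvec, float).ravel() / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' formula: R = I + sin(t) K + (1 - cos(t)) K^2
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def to_transform(rvec, tvec):
    """Stack R and t into a 4x4 homogeneous transform, e.g. T^logo_camera."""
    T = np.eye(4)
    T[:3, :3] = rodrigues(rvec)
    T[:3, 3] = np.asarray(tvec, float).ravel()
    return T

def target_from_base(T_base_camera, T_camera_logo, T_logo_target):
    """Chain measured and estimated transforms to T^target_base,t."""
    return T_base_camera @ T_camera_logo @ T_logo_target
```

Here T_base_camera and T_logo_target come from the physical measurements in step 1, while T_camera_logo is rebuilt from the solvePnP output at every timestep.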
4. RESULTS AND DISCUSSION
During inference, processing a single image within the ROS pipeline takes the detector approximately 35 ms. On average, detection yields five bounding-box proposals; sorting by confidence and extracting the top three boxes gives six image coordinates close to the ground truth, typically within one to four pixels. Evaluating the mIoU gives 96.3% for thirteen test images. Object detection results are therefore both accurate and confident. The PnP solver, the second major component of the framework, proved more troublesome, producing inaccurate results. All three implementations of the solvePnP algorithm express the translation vector ~t^logo_camera in the right-hand coordinate system K_logo. Preliminary results quickly showed that all methods are accurate in estimating the y and z translation but struggle with the x coordinate. To get a better understanding of the pose estimates, in particular the estimated translation vector, an extensive field study was conducted. The
robot was steered towards six points and the ground-truth translation and rotation were noted. These poses are described by K_idx in Fig. 2, where idx ∈ {dock, amcl, left close, left far, right close, right far}. At each point, fifteen images were captured and supplied to the Faster-RCNN model, and the image coordinates obtained from the bounding boxes, specifically six points per image, were given to the PnP solvers. After the first results were analyzed, again showing large errors in x, changes were made in hopes of achieving more accurate pose results. In particular, the following major changes were made:
1. The upper right logo was raised from the plastic board to remove the coplanarity of all six points. By removing the coplanarity, more information is available for estimating the camera pose [9].
2. Since solvePnP, unlike regular DLT, does not estimate intrinsics, they are a possible cause of error. The camera was recalibrated and the new parameters used. The focal lengths and distortion coefficients differed slightly.
3. The autofocus of the camera was turned off. Captured images were nonetheless still sharp and the logos clearly visible.
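The effect of change 1 can be verified numerically: six points are coplanar exactly when their centred coordinates span at most a two-dimensional subspace, i.e. have matrix rank at most two. A sketch with hypothetical corner coordinates (the actual board dimensions and the 5 cm offset are assumptions, as they are not given in the text):

```python
import numpy as np

def is_coplanar(points, tol=1e-9):
    """Points are coplanar iff their centred coordinates have
    matrix rank <= 2 (they span at most a plane)."""
    pts = np.asarray(points, float)
    centred = pts - pts.mean(axis=0)
    return np.linalg.matrix_rank(centred, tol=tol) <= 2

# Hypothetical logo corners (metres) in K_logo: raising the upper
# right logo by 5 cm breaks the coplanarity of the six points.
flat = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.3, 0.0],
        [0.1, 0.3, 0.0], [0.0, 0.6, 0.0], [0.1, 0.6, 0.0]]
raised = [p[:2] + [0.05] if p[1] == 0.6 else p for p in flat]
```

A non-coplanar point set removes the planar ambiguity of the PnP problem, which is why raising one logo gives the solver more information.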
Afterwards, the same study was undertaken, capturing sequences of fifteen images at the six locations and using the detector followed by solvePnP to obtain camera pose estimates again. The translation vectors were then saved and subsequently plotted to give a visual representation of the results. Figure 2 shows the results of this experiment in a 3D plot. The most notable observation is that the iterative algorithm flipped the signs of all three axes in almost all estimates. Its results are therefore point-symmetric about the origin, a known issue when using solvePnP. Also of note is that the large error in the x direction still persists. This error occurs throughout all experiments and is not intuitive; the estimates in x are strangely placed. All points lie on the negative (right) half-plane (with the exception of some iterative estimates), but the estimates for the locations with ground truth in the