Page - 99 - in Proceedings - OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics“
is used to predict the parameters of this motion. The advantage of a random forest is that it is a
collection of trees that learn and predict independently: even when some input data is affected by
occlusions, the remaining trees can still provide good predictions. In order to track objects in
different views, [18] trains a random forest for multiple views of the object, which leads to a high
computational effort. Moreover, the approach is not suitable for symmetrical objects, because the
multiple pose hypotheses are averaged, which leads to erroneous tracking.
An offline learning-based approach with known 3D object models, based on particle filters, is
proposed in [9]. In [20], the authors propose a learning-based approach inspired by [18] with
reduced computational cost and improved occlusion handling.
In the proposed approach, we make the following contributions: a) we argue that it is sufficient to
train only six random forests to learn the relation between object motion and the corresponding
change in 3D point cloud data, which in turn reduces the computational complexity; b) we handle
both symmetrical and non-symmetrical objects; and c) we present a framework that is capable of
tracking objects in the presence of partial occlusions. We also carry out a quantitative comparison
against the state of the art, using synthetic data (including ground truth) provided by [5].
3. Method
This section describes the proposed approach for localizing and tracking 3D objects with high
performance and accuracy. First, we describe the global localization algorithm RANGO, followed by
the local tracking algorithm. Then, we show how both components are combined into the full
tracking framework.
3.1. RANGO – RANdomized Global Object localization
RANGO is an algorithm for 3D object localization. It is based on a random sampling algorithm
(RANSAC) described in [1][3], with several performance and robustness improvements that allow a
very fast detection rate compared to the registration approach proposed in [7]. Its main
contribution is the replacement of the k-nearest-neighbor search for inlier detection with a
probabilistic grid-based approach. Thus the time complexity for the evaluation of a hypothesis
(acceptance function) is reduced from $O(n \log m)$, where $n$ is the number of model points and
$m$ denotes the number of points in the scene, to $O(n)$. Additionally, the evaluation of the
number of model points that fit the hypothesis is stopped early when the probability of finding a
good match is too low.
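
The early stop can be illustrated with a small sketch. The text does not spell out the probabilistic
criterion, so the example below substitutes a simpler deterministic bound: counting is aborted as
soon as the required number of inliers can no longer be reached, even if every remaining model point
were an inlier. The names count_inliers_early_stop, is_inlier and min_inliers are hypothetical and
do not come from the paper.

```python
def count_inliers_early_stop(model_points, is_inlier, min_inliers):
    """Count inliers of a hypothesis, aborting once min_inliers is out of reach.

    is_inlier is any predicate on a model point (assumed to cost O(1)).
    Returns (inlier_count, points_evaluated).
    """
    total = len(model_points)
    inliers = 0
    for evaluated, point in enumerate(model_points, start=1):
        if is_inlier(point):
            inliers += 1
        remaining = total - evaluated
        # Deterministic stand-in for the probabilistic test: even if all
        # remaining points were inliers, the threshold would be missed.
        if inliers + remaining < min_inliers:
            return inliers, evaluated
    return inliers, total
```

With a constant-time inlier test, a full evaluation touches each of the n model points once, and a
poor hypothesis is rejected after only a fraction of them.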
Sparse 3D Voxel Grid. Each 3D point of the scene is mapped to a voxel of a sparse, axis-aligned 3D
grid. Each voxel of this grid is defined by an $(x, y, z)$ tuple, where $x$, $y$, $z$ are the
(integer) coordinates of the voxel location. In RANGO this $(x, y, z)$ position is hashed into a
single 32-bit number, which is used as an index into a hash table. Due to hash collisions, two
different points may hash to the same voxel even though their positions are unrelated, but the
probability is low enough that it is not a problem for our use case. This 3D voxel grid is then
used for fast verification of candidate transformations.
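
A minimal sketch of such a hashed voxel grid is given below. The paper does not state the hash
function or the voxel size, so the three multiplier primes (a common choice for spatial hashing),
the 1 cm voxel edge, and all function names are assumptions made for illustration only.

```python
import math

VOXEL_SIZE = 0.01  # assumed voxel edge length (hypothetical value)
# Large odd primes often used for spatial hashing; an assumption,
# not taken from the paper.
P1, P2, P3 = 73856093, 19349663, 83492791


def voxel_coords(point, voxel_size=VOXEL_SIZE):
    """Quantize a 3D point to integer voxel coordinates (x, y, z)."""
    return tuple(math.floor(c / voxel_size) for c in point)


def voxel_hash(x, y, z):
    """Mix integer voxel coordinates into a single 32-bit number."""
    return ((x * P1) ^ (y * P2) ^ (z * P3)) & 0xFFFFFFFF


def build_grid(scene_points, voxel_size=VOXEL_SIZE):
    """Return the set of 32-bit hashes of all occupied voxels."""
    return {voxel_hash(*voxel_coords(p, voxel_size)) for p in scene_points}


def is_occupied(grid, point, voxel_size=VOXEL_SIZE):
    """Constant-time lookup; may report a false hit on a hash collision."""
    return voxel_hash(*voxel_coords(point, voxel_size)) in grid
```

Only the occupied voxels are stored, so memory grows with the number of scene points rather than
with the volume of the bounding box, and each lookup takes constant expected time.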
To evaluate a transformation matrix, we iterate over a set of sample points of the model and
transform them into the scene. Each transformed sample point is hashed into the 3D voxel grid
containing the scene points. If the hashed voxel is occupied by a point whose normal vector has an
orientation similar to that of the model point, we count it as an inlier. This verification method
has a complexity of $O(m)$, where $m$ here is the number of sample points. The verification is only
approximate, as it is possible to miss a neighboring point because we only look up the voxel the
sample point hashes to, ignoring neighboring
- Title
- Proceedings
- Subtitle
- OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics“
- Authors
- Peter M. Roth
- Kurt Niel
- Publisher
- Verlag der Technischen Universität Graz
- Location
- Wels
- Date
- 2017
- Language
- English
- License
- CC BY 4.0
- ISBN
- 978-3-85125-527-0
- Size
- 21.0 x 29.7 cm
- Pages
- 248
- Keywords
- Tagungsband
- Categories
- International
- Tagungsbände