Fig. 4. View of the robot in front of a machine through the light detection camera at the RoboCup 2016 in Leipzig, Germany.
1) AR Detection: To localize objects in a defined frame
of reference, it is first necessary to localize the robot itself.
For this, a laser scanner and the knowledge of the fixed outer
boundaries of the factory are used to infer the position of the
robot using an adaptive Monte Carlo localization approach
[11]. The particle filter also allows the confidence of the current location estimate to be inferred.
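As an illustration of this step, the following minimal Python sketch shows how a pose estimate and a scalar confidence can be derived from a weighted particle set. The Particle structure and the spread-based confidence measure are assumptions for illustration, not the authors' implementation.

# Sketch: pose and confidence from an AMCL particle set (illustrative only).
from dataclasses import dataclass
import math

@dataclass
class Particle:
    x: float       # position in the fixed field frame [m]
    y: float
    theta: float   # heading [rad]
    weight: float  # normalized importance weight

def pose_estimate(particles):
    # Weighted mean position; headings are averaged via unit vectors
    # to handle the wrap-around at +/- pi.
    x = sum(p.weight * p.x for p in particles)
    y = sum(p.weight * p.y for p in particles)
    c = sum(p.weight * math.cos(p.theta) for p in particles)
    s = sum(p.weight * math.sin(p.theta) for p in particles)
    return x, y, math.atan2(s, c)

def confidence(particles):
    # Assumed confidence measure: tightly clustered particles give values
    # near 1, widely spread ones values near 0.
    mx, my, _ = pose_estimate(particles)
    var = sum(p.weight * ((p.x - mx) ** 2 + (p.y - my) ** 2) for p in particles)
    return 1.0 / (1.0 + var)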
With the known location of the robot and the known
position of a camera mounted on the robot, it is possible
to infer the position and orientation of observed augmented reality (AR) tags. For this, the open-source AR-tag tracking
library Alvar is used. These tags have a defined size, which allows the distance to a tag to be derived, and are mounted at the input and the output of each machine. Each machine has a defined tag ID for the input as well as for the output. Using this knowledge, the position of the machine can be calculated once at least one of the tags has been seen. The accuracy and reliability of these measurements are further improved using a moving average filter, which refines the estimate of the machine position over several measurements. This raw data about the locations of the machines is used by the higher layers, as described in Section IV-B, to
determine which zone the machine is in. The information
about the occupied zone is then reported to the referee box
to earn points during the exploration phase.
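The moving-average step could be realized as in the following Python sketch; the MachinePositionFilter class, the window size, and the flat (x, y) representation are illustrative assumptions, and the transform from robot pose and tag pose to the field frame is omitted.

# Sketch: smoothing machine positions over several tag observations.
from collections import defaultdict, deque

class MachinePositionFilter:
    def __init__(self, window=10):
        # One ring buffer of recent observations per machine (window size assumed).
        self._obs = defaultdict(lambda: deque(maxlen=window))

    def add_observation(self, machine_id, x, y):
        # x, y: machine position in the field frame, derived from the known
        # robot pose, the camera mounting, and the observed AR-tag pose.
        self._obs[machine_id].append((x, y))

    def estimate(self, machine_id):
        # Moving average over the most recent observations, or None if unseen.
        pts = self._obs[machine_id]
        if not pts:
            return None
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)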
2) Light Detection: To fully identify a machine, not only the AR tag but also the machine's position and orientation, as well as the shown light pattern, need to be reported. The light pattern uniquely identifies the machine. For this, the robot moves to a point in front of the machine and captures an image of it; an example can be seen in Figure 4. These views have random backgrounds with arbitrary components, colors, and structures in them. Therefore, detecting the lights by means of blob detection is difficult to configure and unreliable. Instead, one can exploit the fixed structure of the
traffic lights. All of them have the same geometry regardless of the shown light pattern: they have a fixed ratio between width and height, are divided into three sections, and are always upright.

Fig. 5. Traffic light cropped by the histogram of oriented gradients detector.
This knowledge could be exploited by applying various manually crafted and tuned rules to determine the position of the traffic light in the image. Instead of such manually created rules, our approach uses machine learning, which makes the method more reliable, easier to configure, and adaptable to new environments with little effort. We use the static features of the structure to train a histogram of oriented gradients (HOG) detector as described in [12]. This detector exploits the fact that the mentioned static features manifest in a static gradient pattern.
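Such a detector can be trained from labeled image patches, for example as in the following Python sketch. scikit-image and scikit-learn are used here as stand-ins, since the paper does not name an implementation; the window size, HOG parameters, and file paths are assumptions.

# Sketch: training a HOG + linear SVM detector for the traffic-light structure.
import glob
import numpy as np
from skimage import io, color, transform
from skimage.feature import hog
from sklearn.svm import LinearSVC

WINDOW = (96, 32)  # assumed height x width; the lights are upright, taller than wide

def hog_features(image):
    # Resize a grayscale patch to the fixed window and compute its HOG descriptor.
    patch = transform.resize(image, WINDOW, anti_aliasing=True)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def load_patches(pattern):
    # Assumes RGB patches on disk; the paths are hypothetical.
    return [color.rgb2gray(io.imread(f)) for f in sorted(glob.glob(pattern))]

positives = load_patches('patches/light/*.png')       # patches showing a traffic light
negatives = load_patches('patches/background/*.png')  # random background crops
X = np.array([hog_features(p) for p in positives + negatives])
y = np.array([1] * len(positives) + [0] * len(negatives))
svm = LinearSVC(C=0.01).fit(X, y)  # linear classifier over HOG descriptors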
Using the results of the HOG detector, a region of interest
(ROI) can be extracted. The result of this cropping can be
seen in Figure 5. Here the cropped traffic light is shown for
all possible light combinations. The HOG detector has the
advantage of yielding almost no false-positive detections, i.e., if an ROI is found, it contains a traffic light with high probability.
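Using the model from the sketch above, the ROI extraction can be illustrated with a simple single-scale sliding window; a practical detector would additionally scan over several scales.

# Sketch: crop the best-scoring window as the traffic-light ROI.
def detect_roi(image_gray, svm, step=8):
    h, w = WINDOW
    best, best_score = None, 0.0  # 0.0 = SVM decision boundary as threshold
    for top in range(0, image_gray.shape[0] - h, step):
        for left in range(0, image_gray.shape[1] - w, step):
            patch = image_gray[top:top + h, left:left + w]
            score = svm.decision_function([hog_features(patch)])[0]
            if score > best_score:
                best, best_score = (top, left), score
    if best is None:
        return None  # no traffic light found in this image
    top, left = best
    return image_gray[top:top + h, left:left + w]  # the cropped traffic light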
To report the type of the shown light pattern, a mapping from the traffic light image (which lights are on and which are off) to a representative number is needed. For this, the lighting condition is encoded in a binary fashion, i.e., the state is calculated as:
\[
\text{state} = s(\text{green}) \cdot 2^{0} + s(\text{yellow}) \cdot 2^{1} + s(\text{red}) \cdot 2^{2} \tag{1}
\]
with
\[
s(x) =
\begin{cases}
1, & \text{if } x \text{ is on} \\
0, & \text{otherwise.}
\end{cases} \tag{2}
\]
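In Python, this encoding can be transcribed directly from Eqs. (1) and (2); for example, red and green on with yellow off yields state 5.

# Direct transcription of Eqs. (1)-(2): green = bit 0, yellow = bit 1, red = bit 2.
def light_state(green_on, yellow_on, red_on):
    return (1 if green_on else 0) + (2 if yellow_on else 0) + (4 if red_on else 0)

assert light_state(True, False, True) == 5  # red and green on, yellow off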
With this mapping, a feed-forward artificial neural network can be trained. We used the scaled conjugate gradient descent algorithm described in [13] to train the network. With this
trained network, it is possible to map a newly seen image
to a vector of probabilities describing the likelihood of each
class as described in [14].
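As an illustration of this classification stage, the following sketch trains a small feed-forward network on flattened ROI crops. scikit-learn's MLPClassifier with the L-BFGS solver serves as a stand-in here, since the scaled conjugate gradient method of [13] is not available in that library; the network size is an assumption.

# Sketch: feed-forward network mapping ROI crops to the eight light states.
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_light_classifier(crops, states):
    # crops: equally sized grayscale ROI images with values in [0, 1];
    # states: integer labels 0-7 computed via Eq. (1).
    X = np.array([c.ravel() for c in crops])
    clf = MLPClassifier(hidden_layer_sizes=(32,), solver='lbfgs', max_iter=500)
    return clf.fit(X, np.array(states))

def classify(clf, crop):
    # Probability vector over the light states, as described in [14].
    return clf.predict_proba([crop.ravel()])[0]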
The gathered information can then be used by the higher
layers to build up a knowledge base about the environment
as described in Section IV-B.
The chain of a HOG detector and a neural network was chosen because neither approach needs much computing power during execution (significant computation is required only once, at training time) and because it avoids the manual tuning of several parameters. The neural network further increases reliability, as it can be trained to be resilient to different lighting situations.
B. Scheduling Algorithm
The robots have no information about their environment
at the start of the game. Therefore, they have to use sensors