Fig. 4. View of the robot in front of a machine through the light detection camera at the RoboCup 2016 in Leipzig, Germany.
1) AR Detection: To localize objects in a defined frame
of reference, it is first necessary to localize the robot itself.
For this, a laser scanner and the knowledge of the fixed outer
boundaries of the factory are used to infer the position of the
robot using an adaptive Monte Carlo localization approach
[11]. The particle filter also allows the confidence of the current location estimate to be inferred.
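As an illustration of this step, the following minimal Python sketch shows how a pose estimate and a scalar confidence can be derived from a weighted particle set. The Particle structure and the spread-based confidence measure are assumptions for illustration, not the authors' implementation.

# Sketch: pose and confidence from an AMCL particle set (illustrative only).
from dataclasses import dataclass
import math

@dataclass
class Particle:
    x: float       # position in the fixed field frame [m]
    y: float
    theta: float   # heading [rad]
    weight: float  # normalized importance weight

def pose_estimate(particles):
    # Weighted mean position; headings are averaged via unit vectors
    # to handle the wrap-around at +/- pi.
    x = sum(p.weight * p.x for p in particles)
    y = sum(p.weight * p.y for p in particles)
    c = sum(p.weight * math.cos(p.theta) for p in particles)
    s = sum(p.weight * math.sin(p.theta) for p in particles)
    return x, y, math.atan2(s, c)

def confidence(particles):
    # Assumed confidence measure: tightly clustered particles give values
    # near 1, widely spread ones values near 0.
    mx, my, _ = pose_estimate(particles)
    var = sum(p.weight * ((p.x - mx) ** 2 + (p.y - my) ** 2) for p in particles)
    return 1.0 / (1.0 + var)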
With the known location of the robot and the known
position of a camera mounted on the robot, it is possible
to infer the position and orientation of observed augmented reality (AR) tags. For this, the open-source AR-tag tracking
library Alvar is used. These tags have a defined size, which allows the distance to a tag to be derived, and are mounted at the input and the output of each machine. Each machine has a defined tag ID for the input as well as for the output. Using this knowledge, the position of the machine can be calculated once at least one of the tags has been seen. The accuracy and reliability of these measurements are further improved using a moving average filter, which refines the estimate of the machine position over several measurements. This raw data about the locations of the machines is used by the higher layers, as described in Section IV-B, to
determine which zone the machine is in. The information
about the occupied zone is then reported to the referee box
to earn points during the exploration phase.
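The moving-average step could be realized as in the following Python sketch; the MachinePositionFilter class, the window size, and the flat (x, y) representation are illustrative assumptions, and the transform from robot pose and tag pose to the field frame is omitted.

# Sketch: smoothing machine positions over several tag observations.
from collections import defaultdict, deque

class MachinePositionFilter:
    def __init__(self, window=10):
        # One ring buffer of recent observations per machine (window size assumed).
        self._obs = defaultdict(lambda: deque(maxlen=window))

    def add_observation(self, machine_id, x, y):
        # x, y: machine position in the field frame, derived from the known
        # robot pose, the camera mounting, and the observed AR-tag pose.
        self._obs[machine_id].append((x, y))

    def estimate(self, machine_id):
        # Moving average over the most recent observations, or None if unseen.
        pts = self._obs[machine_id]
        if not pts:
            return None
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)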
2) Light Detection: To fully identify a machine, not only the AR tag but also the machine's position and orientation, as well as the shown light pattern, need to be reported. The light pattern uniquely identifies the machine. For this, the robot moves to a point in front of the machine and captures an image of it; an example can be seen in Figure 4. These views have random backgrounds with arbitrary components, colors, and structures in them. Therefore, detecting the lights by means of blob detection is difficult to configure and unreliable. Instead, one can exploit the fixed structure of the
traffic lights. All of them have the same geometry regardless of the shown light pattern: they have a fixed ratio between width and height, are divided into three sections, and are always upright.

Fig. 5. Traffic light cropped by the histogram of oriented gradients detector.
This knowledge could be exploited by applying various manually crafted and tuned rules to determine the position of the traffic light in the image. Instead of such manually created rules, our approach uses machine learning, which makes the method more reliable, easier to configure, and adaptable to new environments with little effort. We use the static features of the structure to train a histogram of oriented gradients (HOG) detector as described in [12]. This detector exploits the fact that the mentioned static features manifest in a static gradient pattern.
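Such a detector can be trained from labeled image patches, for example as in the following Python sketch. scikit-image and scikit-learn are used here as stand-ins, since the paper does not name an implementation; the window size, HOG parameters, and file paths are assumptions.

# Sketch: training a HOG + linear SVM detector for the traffic-light structure.
import glob
import numpy as np
from skimage import io, color, transform
from skimage.feature import hog
from sklearn.svm import LinearSVC

WINDOW = (96, 32)  # assumed height x width; the lights are upright, taller than wide

def hog_features(image):
    # Resize a grayscale patch to the fixed window and compute its HOG descriptor.
    patch = transform.resize(image, WINDOW, anti_aliasing=True)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def load_patches(pattern):
    # Assumes RGB patches on disk; the paths are hypothetical.
    return [color.rgb2gray(io.imread(f)) for f in sorted(glob.glob(pattern))]

positives = load_patches('patches/light/*.png')       # patches showing a traffic light
negatives = load_patches('patches/background/*.png')  # random background crops
X = np.array([hog_features(p) for p in positives + negatives])
y = np.array([1] * len(positives) + [0] * len(negatives))
svm = LinearSVC(C=0.01).fit(X, y)  # linear classifier over HOG descriptors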
Using the results of the HOG detector, a region of interest
(ROI) can be extracted. The result of this cropping can be
seen in Figure 5. Here the cropped traffic light is shown for
all possible light combinations. The HOG detector has the
advantage of yielding almost no false-positive detections, i.e., if an ROI is found, it contains a traffic light with high probability.
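Using the model from the sketch above, the ROI extraction can be illustrated with a simple single-scale sliding window; a practical detector would additionally scan over several scales.

# Sketch: crop the best-scoring window as the traffic-light ROI.
def detect_roi(image_gray, svm, step=8):
    h, w = WINDOW
    best, best_score = None, 0.0  # 0.0 = SVM decision boundary as threshold
    for top in range(0, image_gray.shape[0] - h, step):
        for left in range(0, image_gray.shape[1] - w, step):
            patch = image_gray[top:top + h, left:left + w]
            score = svm.decision_function([hog_features(patch)])[0]
            if score > best_score:
                best, best_score = (top, left), score
    if best is None:
        return None  # no traffic light found in this image
    top, left = best
    return image_gray[top:top + h, left:left + w]  # the cropped traffic light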
To report the type of the shown light pattern, a mapping from the traffic light image (which lights are on and which are off) to a representative number is needed. For this, the lighting condition is encoded in a binary fashion, i.e., the state is calculated as:
\[
\text{state} = s(\text{green}) \cdot 2^{0} + s(\text{yellow}) \cdot 2^{1} + s(\text{red}) \cdot 2^{2} \tag{1}
\]
with
\[
s(x) =
\begin{cases}
1, & \text{if } x \text{ is on} \\
0, & \text{otherwise.}
\end{cases} \tag{2}
\]
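In Python, this encoding can be transcribed directly from Eqs. (1) and (2); for example, red and green on with yellow off yields state 5.

# Direct transcription of Eqs. (1)-(2): green = bit 0, yellow = bit 1, red = bit 2.
def light_state(green_on, yellow_on, red_on):
    return (1 if green_on else 0) + (2 if yellow_on else 0) + (4 if red_on else 0)

assert light_state(True, False, True) == 5  # red and green on, yellow off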
With this mapping, a feed-forward artificial neural network can be trained. We used the scaled conjugate gradient descent algorithm described in [13] to train the network. With this
trained network, it is possible to map a newly seen image
to a vector of probabilities describing the likelihood of each
class as described in [14].
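As an illustration of this classification stage, the following sketch trains a small feed-forward network on flattened ROI crops. scikit-learn's MLPClassifier with the L-BFGS solver serves as a stand-in here, since the scaled conjugate gradient method of [13] is not available in that library; the network size is an assumption.

# Sketch: feed-forward network mapping ROI crops to the eight light states.
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_light_classifier(crops, states):
    # crops: equally sized grayscale ROI images with values in [0, 1];
    # states: integer labels 0-7 computed via Eq. (1).
    X = np.array([c.ravel() for c in crops])
    clf = MLPClassifier(hidden_layer_sizes=(32,), solver='lbfgs', max_iter=500)
    return clf.fit(X, np.array(states))

def classify(clf, crop):
    # Probability vector over the light states, as described in [14].
    return clf.predict_proba([crop.ravel()])[0]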
The gathered information can then be used by the higher
layers to build up a knowledge base about the environment
as described in Section IV-B.
The chain of a HOG detector and a neural network was chosen because neither approach needs much computing power during execution (significant computation is required only once, at training time) and because it avoids the manual tuning of several parameters. The neural network further increases reliability, as it can be trained to be resilient to different lighting situations.
B. Scheduling Algorithm
The robots have no information about their environment
at the start of the game. Therefore, they have to use sensors