RS2 algorithm

euler.fd.cvut.cz
The Road Sign Recognition System - RS²

Project Abstract

On-board systems for smart vehicles have recently undergone extensive development. Such a system is usually called a Driver Support System; it is designed to help a human driver guide the vehicle by monitoring the current traffic situation and providing a valuable source of information. It is meant to help the driver avoid potentially risky situations in today's heavy traffic. Because visual information plays a vital role for the human driver, the traffic situation should be interpreted by computer vision methods. Recognition of traffic devices such as road signs or traffic lights is necessary for such interpretation. We present here the Road Sign Recognition System, which is designed to detect and classify ideogram-based road signs in images of traffic scenes acquired from a moving car. The system uses a promising edge-based detection method and a tree structure of statistical classifiers that employ a priori information. Localizing the area of interest in the image considerably shortens the time needed for sign detection.

The System Description

The Road Sign Recognition System (RS²) is designed to be a general framework for the recognition of ideogram-based road signs. Therefore, the sign detection and classification stages are separated, and the system can be decomposed into several basic blocks. The preprocessing block uses basic image processing to prepare the input image for further processing. The detection block searches for geometrical shapes that could correspond to road signs, and the classification block makes the final decision about the road sign type (or rejects the candidate).

The RS² is intended for use in the real-time environment of a smart vehicle's on-board system. Therefore, all algorithms must be fast and reliable enough to reach system response times and error rates similar to those of human drivers.

Because a lot of a priori information is known about the position of traffic signs beside the road, it can be used to select the area of interest - the part of the image where signs occur with high probability. The time-consuming detection algorithm then processes only a small part of the traffic scene.

In the following paragraphs, we briefly explain all RS² subsystems.

The Area of Interest

Road signs typically occur in well-defined areas of the traffic scene. Determining the area of interest, where the detection algorithm is applied, may considerably speed up the whole system. On the other hand, to be an effective addition to the road sign recognition system, the area must cover all road signs along the road with high probability. The area of interest is deduced from information about the road shape and curvature.

Determining the road surface from the white border lines may suffer from low robustness when the paint has deteriorated or the scene is badly illuminated. Therefore, we use a texture-based road surface search algorithm instead. The idea is to compute texture homogeneity starting from the lower central part of the image and proceeding upwards in levels. Each level is explored from the center sideways to obtain the road surface boundary points. A large amount of a priori knowledge is still needed to construct the area of interest from this information.
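
The level-by-level scan can be illustrated with a minimal sketch. It is not the RS² implementation: the homogeneity measure (gray-level variance of small square blocks), the block size, the threshold and the function names are all assumptions chosen for illustration.

```python
from statistics import pvariance

def block_variance(image, row, col, size):
    """Gray-level variance of the size x size block with top-left corner (row, col)."""
    pixels = [image[r][c]
              for r in range(row, row + size)
              for c in range(col, col + size)]
    return pvariance(pixels)

def road_boundary_points(image, block=2, max_variance=25.0):
    """Scan levels bottom-up; in each level walk from the center sideways
    while the blocks stay homogeneous (assumed to mean "road surface").
    Returns one (row, left_col, right_col) tuple per level."""
    height, width = len(image), len(image[0])
    center = (width // 2 // block) * block      # center column aligned to the block grid
    boundaries = []
    for row in range(height - block, -1, -block):
        if block_variance(image, row, center, block) > max_variance:
            break                               # center is no longer road-like: stop climbing
        left = center
        while (left - block >= 0
               and block_variance(image, row, left - block, block) <= max_variance):
            left -= block
        right = center
        while (right + block <= width - block
               and block_variance(image, row, right + block, block) <= max_variance):
            right += block
        boundaries.append((row, left, right + block - 1))
    return boundaries
```

The image is assumed to be a list of rows of gray values. A scan that stops at the very first level could serve as the failure signal that triggers the exhaustive search.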

At present, we are able to construct the coarse road surface shape from the border points. If the algorithm fails (e.g. because of a crossroad or several vehicles ahead), the problem is recognized and reported back to RS². In such a case, an exhaustive search of the traffic scene is performed, so that no signs should be missed.

In 1997, Pavel Paclik developed a prototype of the texture-based segmentation method in Matlab. The topic was further extended by Bohumil Kovar in his master's thesis (1998).

The Road Sign Detection

The goal of the detection subsystem is to search the traffic scene image for geometrical shapes that may correspond to road signs. The algorithm's input is the traffic scene image and its output is the list of candidate regions. The HSFM (Hierarchical Spatial Feature Matching) algorithm is used for this purpose. It processes the gray-level input image with the Sobel operator and so extracts the edge information. The main idea (published e.g. by Seitz) is that local orientations (edges) are of crucial importance for shape description (and hence detection).

Fig 2: The input image, the source for the shape detection (left), and the thresholded result of applying two Sobel operators (right)
The Sobel operator is applied in the horizontal and vertical directions, and the edge magnitude is thresholded to reject small disturbances. The result then contains edge information in all directions. To find a particular geometrical shape, the corresponding binary channels have to be extracted. Each binary channel covers a predefined range of edge directions. When searching for a given shape (e.g. a diamond), lines of two orientations have to be used (see Fig 3).
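
As a rough illustration of this step, the sketch below computes the two Sobel responses, thresholds the gradient magnitude and keeps only pixels whose gradient direction lies in a given angular range. It is a plain-Python sketch under assumptions, not the HSFM code: the function names, the angle convention (gradient direction in degrees, which is perpendicular to the edge itself) and the threshold are hypothetical.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # responds to horizontal intensity change
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # responds to vertical intensity change

def sobel_gradients(image):
    """Return (gx, gy) arrays; border pixels are left at zero."""
    h, w = len(image), len(image[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            for i in range(3):
                for j in range(3):
                    p = image[r + i - 1][c + j - 1]
                    gx[r][c] += SOBEL_X[i][j] * p
                    gy[r][c] += SOBEL_Y[i][j] * p
    return gx, gy

def binary_channel(image, angle_min, angle_max, magnitude_threshold):
    """Binary map of pixels with a strong gradient whose direction,
    in degrees within [0, 360), lies in [angle_min, angle_max).
    Ranges that wrap around 360 degrees are not handled in this sketch."""
    gx, gy = sobel_gradients(image)
    h, w = len(image), len(image[0])
    channel = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if math.hypot(gx[r][c], gy[r][c]) < magnitude_threshold:
                continue                          # weak response: treated as a disturbance
            angle = math.degrees(math.atan2(gy[r][c], gx[r][c])) % 360.0
            if angle_min <= angle < angle_max:
                channel[r][c] = 1
    return channel
```

For a shape such as a diamond, one such channel would be extracted for each of the two edge orientations involved.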

Fig 3: Binary channels corresponding to the diamond edges.
The method is hierarchical in that the line segments found are bound together using a priori information about corners (their positions and orientations). The binary channels are searched using templates that define the shape and structure sought in the image.

The original HSFM ideas described in the articles by Szeitz were revised and implemented in a very general way (a C++ object model on PC) by Tomas Zikmund in his master's thesis in 1996. The algorithm was tested on real images and found very promising for real-time implementation. Thereafter, Vit Libal rewrote the algorithm in C and in the assembler of the TMS320C80 DSP processor from Texas Instruments. The implementation design takes advantage of the parallel environment. The number of adjustable parameters is optimized by employing Bayesian estimation methods. His PhD thesis is going to be published soon.
The result is a robust tool for the detection of geometrical shapes described by templates (which may be rather complex).

The Road Sign Classification

The classification subsystem is called by the RS² process manager to evaluate image regions found by the detection subsystem. Each region may contain a learned road sign type or an unknown object. Therefore, the classifier output is either a sign code or a rejection. Only ideogram-based road signs, which contain simple symbols or texts, are taken into account during classification - not complex information signs or directional traffic signs.

Statistical classification based on the feature concept is used. The unknown image (pattern) is represented by several numerical characteristics (features) which are similar for signs of the same class and different for signs of other classes. Feature vectors are then considered to be points in the so-called feature space. The occurrence of patterns from a particular class is governed by the underlying probability density function (pdf). Classifying an unknown image then consists in estimating these pdfs from the training data and computing the a posteriori probability using the Bayes rule.
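
The Bayes rule step can be sketched as follows, assuming the class-conditional pdf values for the unknown pattern have already been estimated. The function names, the class labels and the rejection rule (a minimum posterior threshold) are assumptions made for illustration; the text above only states that the output is either a sign code or a rejection.

```python
def bayes_posteriors(likelihoods, priors):
    """Posterior P(class | x) from class-conditional densities p(x | class)
    and class priors P(class), both given as dicts keyed by class label."""
    evidence = sum(likelihoods[c] * priors[c] for c in likelihoods)
    if evidence == 0.0:
        return {c: 0.0 for c in likelihoods}      # pattern is far from every class
    return {c: likelihoods[c] * priors[c] / evidence for c in likelihoods}

def classify_or_reject(likelihoods, priors, min_posterior=0.8):
    """Return the most probable class label, or None (rejection) when
    even the best posterior stays below min_posterior (assumed rule)."""
    posteriors = bayes_posteriors(likelihoods, priors)
    best = max(posteriors, key=posteriors.get)
    return best if posteriors[best] >= min_posterior else None
```

For example, with equal priors and densities 0.6 and 0.2 for two classes, the posteriors are about 0.75 and 0.25, so the pattern is accepted with a threshold of 0.7 and rejected with a threshold of 0.8.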

In the case of road signs, several points specific to this field of pattern recognition should be stressed.

Road signs have been designed to be recognized quickly and without error by the human driver. Therefore, distinct color and shape combinations are used to minimize the risk of misclassification.
Well-defined prototypes of road signs exist in the corresponding traffic device standards. However, real road signs often differ considerably from these prototypes (see the images of the European warning sign "Children" in Fig 4).
Road sign images are 2D images of 3D objects acquired under changing illumination conditions. The camera is mounted in a moving car and is therefore exposed to vibrations. The image quality also suffers from the fast motion.

Fig 4. Differences between European warning signs "Children"
The classifier of the RS² is designed according to the following requirements:

The classifier must be able to learn a large number of classes (several tens).
Real road signs often form several clusters in the feature space - some kind of nonparametric method or mixture model should be used because of the required complexity.
The classification time must be below roughly 50 ms (a real-time requirement; because the detection module takes longer to respond, the classifier must be fast).
The learning algorithm should be fast enough that a vehicle may learn new traffic signs directly on the road.
As much a priori knowledge about road signs as possible should be employed in the classifier design.

The a priori information allows us to decompose the road sign recognition problem into several smaller ones and to solve them separately. The classifier is then a decision tree with elementary classifiers at its nodes rather than one monolithic block. This approach has several advantages:

Faster system response
The system does not need to decide between all learned road signs - only subsets of similar signs are taken into account in each node classifier.
Better classification results
The nodes of the decision tree (elementary classifiers) deal with a smaller number of classes than a single classifier would have to. As a result, fewer features are required for fine classification. Moreover, the impact of the curse of dimensionality (the fast growth of the computing power required as the number of features, i.e. dimensions, increases) is reduced.
Partial results
Some classifier output exists even when a pattern is rejected at a lower level of the decision tree: the result obtained up to the rejection is reported. The detection block often reports a road sign at a distance that does not allow accurate classification because of the lack of image detail. In such a case, the sign is labeled at least with its coarse meaning. As the sign comes closer, the classifier is able to assign a more accurate type estimate.
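
The tree of elementary classifiers and the reporting of partial results can be sketched as below. The Node class, the two node classifiers, the feature names and the 20-pixel size limit are hypothetical and only illustrate the idea; they are not taken from the RS² code.

```python
class Node:
    """One node of the decision tree with an optional elementary classifier."""
    def __init__(self, label, classifier=None, children=None):
        self.label = label                # coarse or fine meaning reached at this node
        self.classifier = classifier      # callable: features -> child key, or None to reject
        self.children = children or {}

def classify(root, features):
    """Descend the tree and return the labels reached before a leaf or a rejection."""
    path = [root.label]
    node = root
    while node.classifier is not None:
        key = node.classifier(features)
        if key is None or key not in node.children:
            break                         # rejected here: the partial result is reported
        node = node.children[key]
        path.append(node.label)
    return path

def by_group(features):
    """Hypothetical first-level classifier: sign group from shape and color."""
    return features.get("group")

def by_ideogram(features):
    """Hypothetical second-level classifier: rejects regions that are too small."""
    if features.get("size_px", 0) < 20:
        return None
    return features.get("ideogram")

tree = Node("candidate region", by_group, {
    "warning": Node("warning sign", by_ideogram, {
        "children": Node("warning sign: Children"),
    }),
})
```

With this tree, a distant candidate (say 12 pixels wide) is reported only as a warning sign, while the same sign seen closer (40 pixels wide) reaches the "Children" leaf.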

Various features are used - spatial moment invariants, histogram features (entropy, energy), projection features (moments, entropy) etc. Features that reflect the distinctive properties of a particular group of road signs are often used as well.

A nonparametric kernel classifier with the Laplace (exponential) kernel is used. The kernel smoothing parameter (also called the bandwidth) is selected to maximize the cross-validated log-likelihood function by means of the EM (Expectation-Maximization) algorithm.
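
A minimal sketch of a kernel density estimate with a product Laplace kernel and leave-one-out cross-validated bandwidth selection follows. Note the substitution: a plain search over a list of candidate bandwidths replaces the EM algorithm mentioned above. The function names and the single bandwidth shared by all dimensions are assumptions as well.

```python
import math

def laplace_kernel(x, center, h):
    """Product Laplace kernel: prod over dimensions of exp(-|x_d - c_d| / h) / (2h)."""
    return math.prod(math.exp(-abs(a - b) / h) / (2.0 * h)
                     for a, b in zip(x, center))

def density(x, samples, h):
    """Kernel estimate of the class pdf at x from the training samples."""
    return sum(laplace_kernel(x, s, h) for s in samples) / len(samples)

def loo_log_likelihood(samples, h):
    """Leave-one-out log-likelihood: each sample is scored by the
    density estimated from all the other samples."""
    total = 0.0
    for i, x in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        total += math.log(max(density(x, others, h), 1e-300))  # guard against log(0)
    return total

def select_bandwidth(samples, candidates):
    """Pick the candidate bandwidth with the highest cross-validated
    log-likelihood (grid search, used here in place of EM)."""
    return max(candidates, key=lambda h: loo_log_likelihood(samples, h))
```

Samples are assumed to be a list of equal-length feature tuples for one class. The resulting per-class densities would then be combined with class priors through the Bayes rule to obtain the classifier decision.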

The classification of road signs was the topic of the master's thesis of Pavel Paclik. A library of C++ classes and applications for classifier design, testing and image database management was developed (Win32, PC). The development continues in ANSI C under Linux. A library of functions covering feature calculation, classifier design, feature selection and image processing of road sign images has emerged. It is intended for the real-time implementation of the road sign classifier on a DSP platform such as the TMS320C80 and TMS320C60.

A knowledge base (features, feature selection) and tools for the rapid prototyping of road sign classifiers have been developed.

Last modified: Wed Feb 7 16:35:54 CET 1996