The Road Sign Recognition System - RS2
On-board systems for smart vehicles have recently seen extensive development.
Such a system is usually called a Driver Support System and it is designed to help the human driver
guide the vehicle by monitoring the current traffic situation and providing him with
a valuable source of information. It is meant to help the driver avoid potentially risky situations
in today's heavy traffic.
Because visual information plays a vital role for the human driver, the traffic
situation should be interpreted by computer vision methods.
The recognition of traffic devices such as road signs or traffic lights is
necessary for such an interpretation.
We present here the Road Sign Recognition
System, which is designed to detect and classify ideogram-based road signs
in images of traffic scenes acquired from a moving car. The system uses
a promising edge-based detection method and a tree structure of statistical
classifiers employing a priori information. Localizing the area
of interest in the image considerably shortens the time needed for sign detection.
The System Description
The Road Sign Recognition System (RS2) is designed to be a general
framework for the recognition of ideogram-based road signs. Therefore, the sign detection
and classification stages are separated. The system may then be decomposed into several
basic blocks. The preprocessing block uses basic image processing to prepare
the input image for further processing. The detection block searches for geometrical shapes
which could correspond to road signs, and the final decision about the road sign type
(or rejection) is made in the classification block.
The RS2 is intended for use in the real-time environment of
a smart vehicle on-board system. Therefore, all algorithms must be fast and reliable enough
to reach system response times and error rates similar to those of human drivers.
Because a lot of a priori information is known about the position of traffic signs beside
the road, it may be used to select the area of interest - the part
of the image where signs occur with high probability. The time-consuming detection algorithm
then processes only a small part of the traffic scene.
In the following paragraphs, we briefly explain all RS2 subsystems.
The Area of Interest
Road signs typically occur in well-defined areas of the traffic scene. Determining
the area of interest, where the detection algorithm is applied, may considerably speed up
the whole system response. On the other hand, to be an effective addition to the road sign
recognition system, the area must cover all road signs along the road with high
probability. The area of interest is deduced from information about the road shape and curvature.
Determining the road surface from the white border lines may suffer
from low robustness when the paint has deteriorated or the scene is badly illuminated.
Therefore, we use a texture-based road surface search algorithm instead. The idea is
to compute texture homogeneity starting from the lower central part of the image
and proceeding upwards in levels. Each level is explored from the center sideways to obtain
the road surface boundary points. A large amount of a priori knowledge must still
be used to construct the area of interest from this information.
At present, we are able to construct a coarse road surface shape from the
border points. If the algorithm fails (e.g. because of a crossroad or several vehicles ahead),
the problem is recognized and reported back to RS2. In such a case,
an exhaustive search of the traffic scene is performed and hence no signs should be missed.
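The level-by-level search described above can be sketched roughly as follows. This is only an illustrative sketch, not the RS2 implementation: the gray-level variance in a small window stands in for the texture homogeneity measure, and the names local_variance, road_boundaries, level_step and max_variance are assumptions introduced here.

```python
from statistics import pvariance


def local_variance(image, row, col, half=1):
    """Gray-level variance in a small window around (row, col).

    Assumption: variance is used here as a simple stand-in for a
    texture homogeneity measure (low variance = homogeneous texture).
    """
    rows, cols = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, row - half), min(rows, row + half + 1))
        for c in range(max(0, col - half), min(cols, col + half + 1))
    ]
    return pvariance(window)


def road_boundaries(image, level_step=2, max_variance=50.0):
    """Scan levels from the bottom of the image upwards.

    On each level, walk from the image centre to the left and to the
    right while the texture stays homogeneous. Returns a list of
    (row, left_col, right_col) boundary points, or None when even the
    lowest level is not homogeneous at the centre (search failure,
    which the caller can answer with an exhaustive search).
    """
    rows, cols = len(image), len(image[0])
    centre = cols // 2
    boundaries = []
    for row in range(rows - 1, -1, -level_step):
        if local_variance(image, row, centre) > max_variance:
            break  # no homogeneous surface at the centre of this level
        left = centre
        while left > 0 and local_variance(image, row, left - 1) <= max_variance:
            left -= 1
        right = centre
        while right < cols - 1 and local_variance(image, row, right + 1) <= max_variance:
            right += 1
        boundaries.append((row, left, right))
    return boundaries or None
```

A None result corresponds to the failure case reported back to RS2. The a priori knowledge needed to turn the boundary points into an area of interest is not covered by this sketch.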
In 1997, Pavel Paclik developed a prototype of the texture-based segmentation
method in Matlab. The topic was further extended by Bohumil Kovar in his
master thesis (1998).
The Road Sign Detection
The goal of the detection subsystem is to search the traffic scene image for
geometrical shapes which may correspond to road signs. The input of the algorithm is the traffic scene
image and the output is a list of candidate regions. The HSFM (Hierarchical Spatial Feature
Matching) algorithm is used for this purpose. It processes the gray-level
input image with the Sobel operator and so extracts the edge information.
The main idea (published e.g. by Seitz) is that the local orientations (edges)
are of crucial importance for the shape description (and hence detection).
Fig 2: The input image, the source for the shape detection (left), and
the thresholded result of applying the two Sobel operators (right)
The Sobel operator is applied in the horizontal and vertical directions and the edge magnitude
is thresholded to reject small disturbances. The result then contains edge information in
all directions. To find a particular geometrical shape, the corresponding
binary channels have to be extracted. A binary channel covers a predefined range of
edge directions. When searching for a given shape (e.g. a diamond), lines of two
orientations have to be employed (see the following images).
Fig 3: Binary channels corresponding to the diamond edges.
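As a rough illustration of the two steps above (thresholded Sobel magnitude, then binary channels by direction), consider the following minimal sketch. It is not the RS2 code: the function names, the threshold and the channel parameters (centre_deg, half_width_deg) are assumptions. The direction stored here is the gradient direction, which is perpendicular to the edge line itself.

```python
import math

# Standard 3x3 Sobel masks.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # gray-level change along columns
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # gray-level change along rows


def sobel_edges(image, threshold):
    """Return {(row, col): gradient direction in degrees, 0..360}.

    Only pixels whose gradient magnitude exceeds `threshold` are kept,
    which rejects small disturbances. Border pixels are skipped.
    """
    rows, cols = len(image), len(image[0])
    edges = {}
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            if math.hypot(gx, gy) > threshold:
                edges[(r, c)] = math.degrees(math.atan2(gy, gx)) % 360.0
    return edges


def binary_channel(edges, shape, centre_deg, half_width_deg):
    """Binary image of edge pixels whose direction lies within
    half_width_deg of centre_deg (angular distance, wraps at 360)."""
    rows, cols = shape
    channel = [[0] * cols for _ in range(rows)]
    for (r, c), direction in edges.items():
        distance = abs((direction - centre_deg + 180.0) % 360.0 - 180.0)
        if distance <= half_width_deg:
            channel[r][c] = 1
    return channel
```

For a diamond, one might extract channels centred on the two diagonal orientations (for instance 45 and 135 degrees, together with their opposites at 225 and 315 degrees). These angles are an illustrative choice, not values taken from the system.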
The hierarchy of the method consists in binding the found line segments together
using a priori information about corners (their positions and orientations).
The binary channels are searched using templates which define the shape
and structure sought in the image.
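To give a feeling for what searching a binary channel with a template may look like, here is a minimal sketch of plain binary template matching. It is deliberately simpler than HSFM: it does not model the hierarchy or the corner information, and the function name and the min_score parameter are assumptions.

```python
def match_template(channel, template, min_score=0.8):
    """Slide a binary template over a binary channel.

    The score of a placement is the fraction of set template pixels
    that coincide with set channel pixels. Returns a list of
    (row, col, score) for placements scoring at least min_score,
    where (row, col) is the top-left corner of the placement.
    """
    t_rows, t_cols = len(template), len(template[0])
    wanted = [(i, j) for i in range(t_rows) for j in range(t_cols) if template[i][j]]
    if not wanted:
        return []  # an empty template cannot match anything
    hits = []
    for r in range(len(channel) - t_rows + 1):
        for c in range(len(channel[0]) - t_cols + 1):
            matched = sum(channel[r + i][c + j] for i, j in wanted)
            score = matched / len(wanted)
            if score >= min_score:
                hits.append((r, c, score))
    return hits
```

A threshold below 1.0 tolerates a few missing edge pixels, which matters when the edge magnitude threshold has removed parts of a line.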
The original HSFM ideas described in the articles by Seitz were revised and
implemented in a very general way (C++ object model on PC) by Tomas Zikmund
in his master thesis in 1996. The algorithm was tested on real images and found
very promising for real-time implementation. Thereafter, Vit Libal
rewrote the algorithm in C and in the assembler of the
TMS320C80 DSP processor from Texas Instruments.
The implementation design takes advantage of the parallel environment. The number
of adjustable parameters is optimized by employing Bayesian estimation methods.
His PhD thesis is going to be published soon.
The result is a robust tool for the detection of geometrical shapes
described by templates (which may be rather complex).
The Road Sign Classification
The classification subsystem is called by the RS2 process manager
in order to evaluate image regions found by the detection subsystem. Each region may contain
a learned road sign type or an unknown object. Therefore, the classifier
output is either a sign code or a rejection. Only ideogram-based road signs,
which contain simple symbols or texts, are taken into account during the classification
- not the complex information or directional traffic signs.
The classification is based on the feature concept. The unknown image
(pattern) is represented by several numerical characteristics (features) which are
similar enough for signs of the same class and different for others. Feature vectors
are then considered to be points in the so-called feature space. The occurrence
of a pattern from a particular class is governed by the underlying probability
density function (pdf). The task of classifying unknown images then consists in
estimating these pdfs from the training data and evaluating the a posteriori
probability using the Bayes rule.
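The Bayes rule step can be illustrated with a short sketch. It assumes that the class-conditional densities p(x | class) have already been evaluated for the unknown pattern. The function names, the two thresholds and the rejection policy are assumptions made for illustration only.

```python
def bayes_posteriors(densities, priors):
    """Bayes rule: P(class | x) = p(x | class) * P(class) / p(x).

    densities: {label: p(x | label)} evaluated at the unknown pattern x
    priors:    {label: P(label)}
    Returns (posteriors, evidence), where evidence is p(x).
    """
    joint = {label: densities[label] * priors[label] for label in densities}
    evidence = sum(joint.values())
    if evidence <= 0.0:
        return {label: 0.0 for label in joint}, 0.0
    return {label: value / evidence for label, value in joint.items()}, evidence


def classify_or_reject(densities, priors, min_posterior=0.7, min_evidence=1e-6):
    """Return the most probable label, or None for a rejection.

    Assumed policy: reject when the pattern is unlikely under every
    learned class (low evidence) or when no class clearly dominates.
    """
    posteriors, evidence = bayes_posteriors(densities, priors)
    best = max(posteriors, key=posteriors.get)
    if evidence < min_evidence or posteriors[best] < min_posterior:
        return None
    return best
```

For example, with hypothetical densities {"stop": 0.08, "yield": 0.02} and equal priors, the a posteriori probability of "stop" is 0.8, so the pattern is accepted at the default threshold and rejected if min_posterior is raised to 0.9.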
In the case of road signs, several points specific to this field of pattern
recognition should be stressed.
- The road signs have been developed to be recognized error-free and quickly by the
human driver. Therefore, the distinct color and shape combinations are used to minimize the
- There exist well-defined prototypes of the road signs in the corresponding
traffic device standards. However, real road signs often differ considerably from these
reference prototypes (see the following images of European warning signs "Children")
- The road signs are 2D images of 3D objects acquired in changeable
illumination conditions. The camera is mounted in a moving car and is therefore
exposed to vibrations. The image quality also suffers from the fast motion.
Fig 4: Differences between European warning signs "Children"
The classifier of the RS2 is designed using the following propositions:
- The classifier must be able to learn a large number of classes (several tens).
- Real road signs often form several clusters in the feature space, so some kind
of nonparametric method or mixture should be used because of the required complexity.
- The classification time must stay under some 50 ms (real-time requirements;
the longer response of the detection module requires a fast classifier).
- The learning algorithm should be fast enough (a vehicle may then learn
new traffic signs directly on the road).
- As much a priori knowledge about road signs as possible should be employed in the
classifier design.
Various features are used - spatial moment invariants, histogram features (entropy,
energy), projection features (moments, entropy) etc. Features reflecting the
distinctive properties of a particular group of road signs are often used as well.
The a priori information allows us to decompose the road sign recognition problem into
several smaller ones and to solve them separately. The classifier is then a decision
tree with elementary classifiers at its nodes rather than one monolithic block.
There are several advantages of such an approach:
- Faster system response
The system does not need to decide between all learned
road signs - only subsets of similar signs are taken into account in each node classifier.
- Better classification results
The nodes in the decision tree (elementary
classifiers) deal with a smaller number of classes than a single classifier would be forced to.
The result is a smaller number of features required for the fine classification. Moreover,
the impact of the curse of dimensionality (the fast growth of the computing power
required to deal with an increasing number of features = dimensions) is avoided.
- Partial results
Some classifier output exists even when the pattern is rejected
at lower levels of the decision tree. The result obtained up to the rejection
is then reported. The detection block often reports a road sign at a distance that
does not allow accurate classification because of the lack of image details. In such a case,
the sign is labeled at least with its coarse meaning. As the sign comes closer, the classifier
is able to assign a more accurate type estimate.
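The partial-result behaviour of the decision tree can be sketched as follows. This is an illustrative sketch only: the Node class, the form of the node classifiers (a callable returning a child key or None for rejection) and the labels used in the usage example are assumptions, not the RS2 design.

```python
class Node:
    """One node of the decision tree.

    classifier: callable taking the features and returning the key of
                a child node, or None to reject; None for a leaf node.
    children:   {key: Node}
    """

    def __init__(self, label, classifier=None, children=None):
        self.label = label
        self.classifier = classifier
        self.children = children or {}


def classify_in_tree(root, features):
    """Walk down the tree as long as the node classifiers accept.

    Returns (path, complete): the labels accepted so far, and whether
    a leaf was reached. On rejection the partial path is still
    reported, so a distant sign gets at least its coarse meaning.
    """
    path = []
    node = root
    while node.classifier is not None:
        choice = node.classifier(features)
        if choice is None or choice not in node.children:
            return path, False  # rejected here: report the partial result
        node = node.children[choice]
        path.append(node.label)
    return path, True
```

A usage example with hypothetical labels: a root node that separates sign groups and a "warning" node that needs more image detail.

```python
warning = Node("warning sign", lambda f: f.get("symbol"),
               {"children": Node("children"), "roadworks": Node("road works")})
root = Node("root", lambda f: f.get("group"), {"warning": warning})

# Far away: only the group can be decided, the symbol is rejected.
print(classify_in_tree(root, {"group": "warning"}))
# Closer: the symbol is resolved as well.
print(classify_in_tree(root, {"group": "warning", "symbol": "children"}))
```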
A nonparametric kernel classifier with the Laplace (exponential) kernel is used.
The kernel smoothing parameter (also called bandwidth) is selected so that it
maximizes the cross-validated log-likelihood function by means of the EM algorithm.
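A minimal sketch of a Laplace kernel density estimate and of bandwidth selection by cross-validated log-likelihood is given below. Note the substitution: the text above mentions EM, whereas this sketch simply evaluates the leave-one-out log-likelihood on a list of candidate bandwidths and picks the best one. The product form of the kernel over feature dimensions and all function names are assumptions.

```python
import math


def laplace_kernel(x, centre, h):
    """Product Laplace kernel: prod_j exp(-|x_j - c_j| / h) / (2h)."""
    value = 1.0
    for a, b in zip(x, centre):
        value *= math.exp(-abs(a - b) / h) / (2.0 * h)
    return value


def density(x, samples, h):
    """Kernel density estimate at x from the training samples."""
    return sum(laplace_kernel(x, s, h) for s in samples) / len(samples)


def loo_log_likelihood(samples, h):
    """Leave-one-out cross-validated log-likelihood for bandwidth h."""
    total = 0.0
    for i, x in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        # Guard against log(0) when the kernels underflow for tiny h.
        total += math.log(max(density(x, rest, h), 1e-300))
    return total


def select_bandwidth(samples, candidates):
    """Grid search (assumption: used here in place of EM)."""
    return max(candidates, key=lambda h: loo_log_likelihood(samples, h))
```

With one such density per class, the class-conditional values can be combined with the class priors through the Bayes rule. A too small bandwidth is penalized because each left-out sample gets almost no density from its neighbours, and a too large one because the density is spread too thinly.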
The classification of road signs was the topic of the master thesis of Pavel Paclik.
A library of C++ classes and applications for classifier design, testing
and image database management was developed (Win32, PC).
The development continues in ANSI C under Linux. A library
of functions covering feature calculation, classifier design, feature selection
and image processing of road sign images has emerged. It is intended for the
real-time implementation of the road sign classifier on DSP platforms
such as the TMS320C80 and TMS320C60.
A knowledge base (features, feature selection) and
tools for the rapid prototyping of road sign classifiers have been developed.
Last modified: Wed Feb 7 16:35:54 CET 1996