3.1. Presentation of the Algorithm
The goal of object recognition is to identify and localize all instances of objects from one or more known classes in images [6]. These classes are defined according to the application’s objectives. Typically, an image contains only a small number of objects, but their location and scale may vary.
The object recognition algorithm implemented in the program model recognizes several object classes in real time. At this stage, the program does not store information about the objects: the algorithm only determines the class of a given object and its characteristics. For experimental purposes, simple 2D shapes (classes) have been chosen, which can have various applications in robotics. The algorithm recognizes circles, triangles, rectangles, pentagons, and hexagons. The recognition is robust to rotation and scale changes of the objects.
When executed, the program goes through the following steps:
Step 1: Choosing a source for images: USB camera, disk, or other.
cout << "******************************" << "\n";
cout << "SELECT AN OPTION FROM THE MENU" << "\n";
cout << "******************************" << "\n";
cout << "1. Load from camera" << "\n";
cout << "2. Load from Shared Memory" << "\n";
cout << "3. Load from Picture" << "\n";
cout << "4. Load from Clipboard" << "\n";
cout << "5. Exit" << "\n";
cout << "Your choice: ";
cin >> choice;
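The menu above only reads a number into choice. As a minimal sketch of how that number could be mapped to an image source, assuming the hypothetical names ImageSource and parseChoice (neither appears in the original program), the mapping might look like this:

```cpp
// Hypothetical enumeration of the menu entries shown above.
// Invalid is an added value for input outside the menu range.
enum class ImageSource { Invalid = 0, Camera, SharedMemory, Picture, Clipboard, Exit };

// Maps the number typed by the user to an ImageSource.
// Anything outside 1..5 is reported as Invalid so the caller can ask again.
ImageSource parseChoice(int choice) {
    if (choice < 1 || choice > 5) {
        return ImageSource::Invalid;
    }
    return static_cast<ImageSource>(choice);
}
```

A caller could loop on the menu until parseChoice returns something other than ImageSource::Invalid.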
Step 2: The Update function is called in the main program loop. It passes the current frame to the inRange function, which applies the specified color boundaries and stores the result in thresholded. The result is a binary image in which the pixels that fall within the specified color range are white, while the rest are black.
void Update() {
    inRange(frame, Scalar(lowB, lowG, lowR), Scalar(highB, highG, highR), thresholded);
    //...
}
The inRange function performs color-based segmentation using the lower and upper color boundaries, which are set interactively via a trackbar (Figure 3).
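To make the per-pixel rule concrete, the following is a minimal sketch of the test that inRange applies, written without OpenCV. The names Pixel and inRangeMask are illustrative assumptions, and the image is flattened to a plain list of pixels; the bounds are inclusive and channels are in blue, green, red order, matching Scalar(lowB, lowG, lowR) in the listing.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One pixel in B, G, R channel order (illustrative type, not an OpenCV type).
using Pixel = std::array<std::uint8_t, 3>;

// Returns a mask holding 255 where every channel of the pixel lies within
// [low, high] (both bounds inclusive) and 0 everywhere else.
std::vector<std::uint8_t> inRangeMask(const std::vector<Pixel>& pixels,
                                      const Pixel& low, const Pixel& high) {
    std::vector<std::uint8_t> mask;
    mask.reserve(pixels.size());
    for (const Pixel& p : pixels) {
        bool inside = true;
        for (std::size_t c = 0; c < 3; ++c) {
            // A single channel outside its bounds rejects the whole pixel.
            if (p[c] < low[c] || p[c] > high[c]) {
                inside = false;
            }
        }
        mask.push_back(inside ? 255 : 0);
    }
    return mask;
}
```

With bounds chosen for strongly red pixels, for example, a pixel is kept only if its blue and green values are low and its red value is high at the same time.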
The inRange function can work with both RGB (Red, Green, Blue) and HSV (Hue, Saturation, Value) images [7]. HSV is more effective for color segmentation because it is closer to how humans perceive and distinguish colors [8] and is less sensitive to changes in lighting. In the current implementation, inRange uses the RGB model because it is intuitive and convenient for visualizing the results, which facilitates color interpretation in the experiments. Note that the boundaries are passed in blue, green, red order, as in Scalar(lowB, lowG, lowR).
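The lighting argument can be illustrated with a small conversion sketch. The code below uses the standard hexcone RGB-to-HSV formula on channels scaled to [0, 1]; the names Hsv and rgbToHsv are assumptions for illustration and this is not the conversion routine used by the program. Scaling all three channels by the same factor changes only V, while H and S stay the same.

```cpp
#include <algorithm>
#include <cmath>

// h in degrees [0, 360), s and v in [0, 1]. Illustrative type.
struct Hsv { double h; double s; double v; };

// Converts an RGB colour with channels in [0, 1] to HSV (hexcone model).
Hsv rgbToHsv(double r, double g, double b) {
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    Hsv out{0.0, 0.0, maxC};           // value is the largest channel
    if (maxC > 0.0) {
        out.s = delta / maxC;          // saturation relative to value
    }
    if (delta > 0.0) {                 // hue is undefined for greys; left at 0
        if (maxC == r) {
            out.h = 60.0 * std::fmod((g - b) / delta, 6.0);
        } else if (maxC == g) {
            out.h = 60.0 * ((b - r) / delta + 2.0);
        } else {
            out.h = 60.0 * ((r - g) / delta + 4.0);
        }
        if (out.h < 0.0) {
            out.h += 360.0;
        }
    }
    return out;
}
```

For instance, an orange with (r, g, b) = (1.0, 0.5, 0.0) and the same orange at half brightness, (0.5, 0.25, 0.0), both map to a hue of 30 degrees and full saturation; only the value drops from 1.0 to 0.5.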
Step 3: Contours of objects with well-defined boundaries in the image are detected. The contour of an object plays a crucial role in areas such as semantic segmentation and image classification. Extracting contours is a difficult task, especially when the contour is incomplete or open [9].
The OpenCV library provides the function findContours, which detects contours in the created binary image (thresholded).
void Update() {
    // Segments the image based on a specified color range
    inRange(frame, Scalar(lowB, lowG, lowR), Scalar(highB, highG, highR), thresholded);
    vector<vector<Point>> contours; // Vector to store contours
    // Finds the contours in the binarized image
    findContours(thresholded, contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
    //...
}
Step 4: All found contours are traversed, and each contour is simplified using the approxPolyDP function. This function reduces the number of points on the contour to describe the shape with a minimal number of vertices while preserving the essential geometry. The approximation tolerance is set to 2% of the contour perimeter. Contours with an area below 100 pixels or with a non-convex approximation are skipped. The class of the object is then determined from the number of vertices of the approximated contour (triangle, rectangle, pentagon, or hexagon); any other vertex count is treated as a circle. For each recognized object, the colorClassification function is called to determine the color of the object.
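The OpenCV documentation describes approxPolyDP as an implementation of the Douglas-Peucker algorithm. The idea can be sketched in plain C++ as follows. This is an illustrative sketch under stated assumptions, not OpenCV's code: the names Pt, distToLine, simplifyChain, farthestFrom, and simplifyClosed are hypothetical, and the way the closed contour is split into two chains (at two mutually distant points) is a simplification that may differ from OpenCV's internal handling of closed curves.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x; double y; };  // illustrative point type

// Perpendicular distance from p to the line through a and b
// (falls back to the point distance when a and b coincide).
double distToLine(const Pt& p, const Pt& a, const Pt& b) {
    const double dx = b.x - a.x, dy = b.y - a.y;
    const double len = std::hypot(dx, dy);
    if (len == 0.0) return std::hypot(p.x - a.x, p.y - a.y);
    return std::fabs(dy * (p.x - a.x) - dx * (p.y - a.y)) / len;
}

// Douglas-Peucker step on pts[first..last]: if the point farthest from the
// chord is more than epsilon away, keep it and recurse on both halves.
// Appends the kept points, excluding pts[last], to out.
void simplifyChain(const std::vector<Pt>& pts, std::size_t first, std::size_t last,
                   double epsilon, std::vector<Pt>& out) {
    double maxDist = 0.0;
    std::size_t index = first;
    for (std::size_t i = first + 1; i < last; ++i) {
        const double d = distToLine(pts[i], pts[first], pts[last]);
        if (d > maxDist) { maxDist = d; index = i; }
    }
    if (maxDist > epsilon) {
        simplifyChain(pts, first, index, epsilon, out);
        simplifyChain(pts, index, last, epsilon, out);
    } else {
        out.push_back(pts[first]);
    }
}

// Index of the point farthest from pts[from].
std::size_t farthestFrom(const std::vector<Pt>& pts, std::size_t from) {
    std::size_t best = from;
    double bestDist = 0.0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const double d = std::hypot(pts[i].x - pts[from].x, pts[i].y - pts[from].y);
        if (d > bestDist) { bestDist = d; best = i; }
    }
    return best;
}

// Simplifies a closed contour by splitting it at two mutually distant points
// and running Douglas-Peucker on each of the two resulting chains.
std::vector<Pt> simplifyClosed(const std::vector<Pt>& pts, double epsilon) {
    const std::size_t n = pts.size();
    if (n < 3) return pts;
    const std::size_t k = farthestFrom(pts, 0);
    const std::size_t j = farthestFrom(pts, k);
    if (j == k) return {pts[0]};  // all points coincide
    // Rotate so the contour starts at j, repeating j at the end to close it.
    std::vector<Pt> ring;
    for (std::size_t i = 0; i <= n; ++i) ring.push_back(pts[(j + i) % n]);
    const std::size_t split = (k + n - j) % n;
    std::vector<Pt> out;
    simplifyChain(ring, 0, split, epsilon, out);
    simplifyChain(ring, split, n, epsilon, out);
    return out;
}
```

For a square outline sampled with extra points along its edges, a tolerance of 2% of the perimeter removes the mid-edge points and leaves the four corners, which is what allows the vertex count to be used as a shape class.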
void Update() {
    // Segments the image based on a specified color range
    inRange(frame, Scalar(lowB, lowG, lowR), Scalar(highB, highG, highR), thresholded);
    vector<vector<Point>> contours; // Vector to store contours
    // Finds the contours in the binarized image
    findContours(thresholded, contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
    // Loop over the contours
    for (const auto& contour : contours) {
        // Tolerance: 2% of the closed contour's perimeter
        double epsilon = 0.02 * arcLength(contour, true);
        vector<Point> approx;
        approxPolyDP(contour, approx, epsilon, true); // Approximation of the contour
        // Skip small or non-convex objects
        if (fabs(contourArea(contour)) < 100 || !isContourConvex(approx))
            continue;
        if (approx.size() == 3) {
            // triangle
            colorClassification(/* Operating parameters are passed here */);
        }
        else if (approx.size() == 4) {
            // rectangle
            colorClassification(/* Operating parameters are passed here */);
        }
        else if (approx.size() == 5) {
            // pentagon
            colorClassification(/* Operating parameters are passed here */);
        }
        else if (approx.size() == 6) {
            // hexagon
            colorClassification(/* Operating parameters are passed here */);
        }
        else {
            // Detect and label circles
            colorClassification(/* Operating parameters are passed here */);
        }
    }
    //...
}
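The filtering and branching logic of the listing can be sketched without OpenCV as follows. The names Pt, polygonArea, isConvex, and classifyShape are hypothetical. Two simplifications are assumed: the area is computed on the approximated polygon rather than on the raw contour as in the listing, and the polygon is assumed to be simple (not self-intersecting).

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct Pt { double x; double y; };  // illustrative point type

// Shoelace formula: absolute area of a simple polygon.
double polygonArea(const std::vector<Pt>& poly) {
    double twice = 0.0;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Pt& a = poly[i];
        const Pt& b = poly[(i + 1) % poly.size()];
        twice += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twice) / 2.0;
}

// A simple polygon is convex when all non-zero cross products of
// consecutive edges have the same sign (all turns go the same way).
bool isConvex(const std::vector<Pt>& poly) {
    const std::size_t n = poly.size();
    if (n < 3) return false;
    int sign = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const Pt& a = poly[i];
        const Pt& b = poly[(i + 1) % n];
        const Pt& c = poly[(i + 2) % n];
        const double cross = (b.x - a.x) * (c.y - b.y) - (b.y - a.y) * (c.x - b.x);
        if (cross == 0.0) continue;  // collinear edges carry no turn direction
        const int s = cross > 0.0 ? 1 : -1;
        if (sign == 0) sign = s;
        else if (s != sign) return false;
    }
    return true;
}

// Mirrors the branch structure of the listing.
// An empty string means the shape was skipped (too small or non-convex).
std::string classifyShape(const std::vector<Pt>& approx, double minArea = 100.0) {
    if (polygonArea(approx) < minArea || !isConvex(approx)) return "";
    switch (approx.size()) {
        case 3: return "triangle";
        case 4: return "rectangle";
        case 5: return "pentagon";
        case 6: return "hexagon";
        default: return "circle";  // every other vertex count
    }
}
```

As in the listing, the final branch does not test for circularity: any convex shape whose approximation has a vertex count other than 3 to 6 is labeled a circle, and any convex quadrilateral is labeled a rectangle.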