What is Object Detection?
Object detection is an advanced form of image classification where a neural network predicts objects in an image and points them out in the form of bounding boxes.
Object detection thus refers to the detection and localization of objects in an image that belong to a predefined set of classes.
Tasks like detection, recognition, or localization find widespread applicability in real-world scenarios, making object detection (also referred to as object recognition) a very important subdomain of Computer Vision.
YOLO (You Only Look Once) is a detection technique that provides accurate results with few background errors. YOLO proposes the use of an end-to-end neural network that predicts bounding boxes and class probabilities all at once. Following a fundamentally different approach to object detection, YOLO achieved state-of-the-art results at its release, beating other real-time object detection algorithms by a large margin.
Techniques of YOLO Algorithm-
- Residual Blocks (grid cells)- First, the image is divided into an S x S grid of equally sized cells. Each grid cell detects the objects whose centers fall within it. For example, if the center of an object falls within a certain grid cell, then that cell is responsible for detecting it.
- Bounding Box Regression- A bounding box is an outline that highlights an object in an image.
Every bounding box in the image consists of the following attributes:
- Width
- Height
- Class (for example, person, car, traffic light, etc.)
- Bounding box center
YOLO uses a single bounding box regression to predict the height, width, center, and class of objects. Each prediction also carries a confidence value that represents the probability of an object appearing in the bounding box.
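The grid assignment and box attributes described above can be sketched roughly as follows. This is a minimal illustration, not YOLO's actual code: the function name `encode_box`, the corner-based input format, and the choice of S = 7 on a 448 x 448 image are all assumptions made for the example. It expresses the center as an offset inside the responsible cell and the width and height as fractions of the image, which is one common way to encode these values.

```python
def encode_box(box, image_w, image_h, S=7):
    """Find the grid cell responsible for a box and normalize its attributes.

    box is (x_min, y_min, x_max, y_max) in pixels (an assumed input format).
    Returns (row, col, (x_offset, y_offset, width, height)).
    """
    x_min, y_min, x_max, y_max = box

    # Box center and size as fractions of the full image.
    cx = (x_min + x_max) / 2 / image_w
    cy = (y_min + y_max) / 2 / image_h
    w = (x_max - x_min) / image_w
    h = (y_max - y_min) / image_h

    # The cell that contains the center is responsible for the object.
    # min() keeps a center on the far edge inside the last cell.
    col = min(int(cx * S), S - 1)
    row = min(int(cy * S), S - 1)

    # Center position measured from the top-left corner of that cell.
    x_offset = cx * S - col
    y_offset = cy * S - row
    return row, col, (x_offset, y_offset, w, h)


# Hypothetical object on a 448 x 448 image split into a 7 x 7 grid.
row, col, attrs = encode_box((200, 100, 300, 200), 448, 448, S=7)
print(row, col, attrs)
```

Here the center of the box is at pixel (250, 150), which lies in row 2, column 3 of the grid, so that cell would be the one responsible for the object.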
- Intersection over Union (IoU)- Intersection over union (IoU) is a metric in object detection that measures how much two boxes overlap: the area of their intersection divided by the area of their union. YOLO uses IoU to select an output box that fits each object closely.
Each grid cell is responsible for predicting the bounding boxes and their confidence scores. The IoU is equal to 1 if the predicted bounding box is the same as the real box, and it falls toward 0 as the overlap shrinks. This mechanism eliminates predicted boxes that overlap poorly with the real box.
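As a rough illustration of the IoU calculation, the sketch below computes it for two axis-aligned boxes. The function name `iou` and the (x_min, y_min, x_max, y_max) box format are assumptions for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Guard against degenerate boxes with zero area.
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(iou((0, 0, 2, 2), (5, 5, 6, 6)))  # no overlap
```

Identical boxes give an IoU of 1, disjoint boxes give 0, and the partially overlapping pair shares 1 unit of area out of a union of 7.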
YOLO or Other detectors?
Beyond more accurate predictions and a better IoU (Intersection over Union) in bounding boxes compared to other real-time object detectors, YOLO has the inherent advantage of speed.
YOLO is a much faster algorithm than its counterparts, running at as high as 45 FPS.
Algorithms like Faster R-CNN first detect possible regions of interest using a Region Proposal Network and then perform recognition on those regions separately. YOLO, by contrast, makes all of its predictions in a single forward pass through one network.
Methods that use Region Proposal Networks thus end up making multiple passes over the same image, while YOLO gets away with a single pass.
In YOLO, IoU (intersection over union) is actually used twice:
- During training, to compare the predicted box with the ground truth box.
- During inference with the trained network, to eliminate overlapping boxes that cover the same object multiple times (non-maximum suppression).
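The second use, removing duplicate boxes at inference time, can be sketched as a greedy non-maximum suppression loop. This is a simplified, single-class illustration under assumed conventions: the names `iou` and `non_max_suppression`, the (box, score) input format, and the 0.5 threshold are choices made for the example, not values taken from YOLO itself.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring box among heavily overlapping ones.

    detections is a list of (box, score) pairs for a single class.
    """
    # Visit boxes from most to least confident.
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in remaining:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: two boxes on the same object, one on another object.
detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),
    ((50, 50, 60, 60), 0.7),
]
print(non_max_suppression(detections))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is dropped as a duplicate, while the third box is far away and is kept.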
YOLO in a nutshell: Key Takeaways
YOLO provided a super fast and accurate object detection algorithm that revolutionized computer vision research related to object detection.
YOLO has large-scale applicability with thousands of use cases, particularly for autonomous driving, vehicle detection, and intelligent video analytics.