YOLO NAS: Object Detection Model


AI models such as YOLO-NAS and YOLO-NAS-SAT are well suited to object recognition because they combine high accuracy with fast inference. These models are based on modern architectures and are trained and run with Python libraries. By using Neural Architecture Search and other modern techniques, they can handle a variety of datasets while remaining both fast and accurate.

From images of everyday objects to scientific use cases and recognizing structures in documents, YOLO-NAS and YOLO-NAS-SAT cover a wide range of tasks.

What is an object detection model?

Object recognition, also called Object Detection, is a branch of image processing and artificial intelligence that deals with the identification and localization of objects in images. Identification usually relies on predefined classes such as "car", "bicycle" and "person". Localization uses so-called Bounding Boxes, rectangles that enclose each object and mark its position in the image. An example of this:

[Image: YOLO-NAS bounding boxes]
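To make the idea of classes and Bounding Boxes concrete, here is a minimal sketch of how a single detection could be represented in Python. The class names `BoundingBox` and `Detection`, the corner-based coordinate layout and the sample values are assumptions chosen for illustration, not the output format of YOLO-NAS. The `iou` method computes Intersection over Union, the overlap measure commonly used to compare a predicted box with a ground-truth box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixels: (x1, y1) is top-left, (x2, y2) is bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        # Clamp at zero so that an "inverted" box has no area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def iou(self, other: "BoundingBox") -> float:
        """Intersection over Union: overlap area divided by combined area."""
        intersection = BoundingBox(
            max(self.x1, other.x1), max(self.y1, other.y1),
            min(self.x2, other.x2), min(self.y2, other.y2),
        ).area
        union = self.area + other.area - intersection
        return intersection / union if union > 0 else 0.0


@dataclass(frozen=True)
class Detection:
    label: str          # predefined class, e.g. "car"
    confidence: float   # model score between 0 and 1
    box: BoundingBox


# Hypothetical prediction and ground-truth box for the same object
predicted = Detection("car", 0.91, BoundingBox(100, 120, 300, 260))
ground_truth = BoundingBox(110, 130, 310, 270)
overlap = predicted.box.iou(ground_truth)
print(f"{predicted.label}: IoU with ground truth = {overlap:.2f}")
```

An IoU of 1.0 means the two boxes coincide, while 0.0 means they do not overlap at all.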

Today, a model for automatic object recognition is usually based on an artificial neural network that is trained using examples. After successful training, the model recognizes similar objects in new images.

[Image: YOLO-NAS model]

Note: The remainder of this article uses the English terms Object Detection and Object Detection Model, as these are also predominantly used in German literature and in the specialist field.

What are YOLO and YOLO NAS?

YOLO stands for "You Only Look Once" and describes a series of object detection models developed for real-time object recognition. Because they combine fast inference with good accuracy and low resource usage, models from the YOLO series are popular in real-world applications. The YOLO models achieve their speed in part by detecting objects in a single pass instead of the two-stage detection process that many other models use.

The YOLO-NAS variant (Neural Architecture Search) was developed by Deci.AI and is a further development of the well-known YOLO models. The model structure has been designed using an automated architecture search in order to be efficient for different hardware configurations and at the same time achieve a high level of accuracy. This makes YOLO-NAS a flexible and powerful model.

After training, YOLO-NAS can be optimized for speed by quantizing individual blocks of the model. This changes the data type of the weights from floating point numbers to 8-bit integers, which reduces the memory required for the weights and allows more efficient computation. Both effects further increase the speed of inference. Because the model structure and the training (Quantization Aware Training) of YOLO-NAS were designed around these quantizable blocks, quantization has little influence on accuracy, which is often not the case with other models.
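The effect of switching from floating point weights to 8-bit integers can be illustrated with a small sketch. It uses a simple symmetric scheme with one scale factor for all weights, which is an assumption made for illustration and not the quantization procedure of YOLO-NAS itself. The function names and sample weights are hypothetical, and only the Python standard library is used.

```python
from array import array


def quantize_int8(weights):
    """Symmetric quantization: w is approximated by scale * q, with q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = array("b", (round(w / scale) for w in weights))  # "b" = signed 8-bit
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating point values."""
    return [scale * q for q in quantized]


# Hypothetical float32 weights ("f" = 32-bit float)
weights = array("f", [0.82, -1.27, 0.05, 0.40, -0.63])
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

float_bytes = weights.itemsize * len(weights)       # 4 bytes per value
int8_bytes = q_weights.itemsize * len(q_weights)    # 1 byte per value
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"memory: {float_bytes} bytes -> {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.5f} (scale = {scale:.5f})")
```

The weights take a quarter of the memory, and each value is off by at most half a quantization step. Quantization Aware Training lets the model adapt to this rounding during training.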

[Image: YOLO-NAS Neural Architecture Search]
Source: deci.ai

Main features of YOLO-NAS

[Image: main features of the YOLO-NAS model]

Efficiency - High speed on a wide range of hardware platforms. For example, YOLO-NAS can be optimized specifically for mobile devices with Snapdragon processors.

Accuracy - Enables precise object detection.
The YOLO-NAS models are more accurate than YOLOv7 and YOLOv8.

Flexibility - Applicable to a variety of tasks and datasets. YOLO-NAS can be trained on images and objects from science and technology in the same way as on natural images of everyday objects.

Pre-trained models

Deci.AI provides 3 variants with pre-trained weights:

  • YOLO-NAS S
  • YOLO-NAS M
  • YOLO-NAS L

These are a small version (S), a medium version (M) and a large version (L), so the variant that best suits the application can be selected. The small model is preferable for applications with very high speed requirements that can tolerate reduced accuracy, while the medium or large model is preferable for applications with higher accuracy requirements.

The table below shows accuracy as Mean Average Precision (mAP@0.5:0.95) on the COCO 2017 validation dataset, and latency as a measure of speed for a 640×640 pixel image on an Nvidia T4 GPU:

Model               mAP      Latency (ms)
YOLO-NAS S          47.5     3.21
YOLO-NAS M          51.55    5.85
YOLO-NAS L          52.22    7.87
YOLO-NAS S INT-8    47.03    2.36
YOLO-NAS M INT-8    51.0     3.78
YOLO-NAS L INT-8    52.1     4.78
Source: Deci.AI / github.com
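The trade-off between the variants can be worked through with a few lines of Python. The sketch below uses the benchmark values from this section. The helper names and the 4 ms latency budget are hypothetical, and the throughput figure simply inverts the single-image latency, so it ignores batching as well as pre- and post-processing.

```python
# variant -> (mAP, latency in ms), taken from the benchmark values in this section
BENCHMARKS = {
    "S": (47.5, 3.21),
    "M": (51.55, 5.85),
    "L": (52.22, 7.87),
    "S INT-8": (47.03, 2.36),
    "M INT-8": (51.0, 3.78),
    "L INT-8": (52.1, 4.78),
}


def images_per_second(latency_ms):
    """Rough throughput if images are processed one after another."""
    return 1000.0 / latency_ms


def most_accurate_within(budget_ms):
    """Return the variant with the highest mAP whose latency fits the budget, or None."""
    candidates = {name: v for name, v in BENCHMARKS.items() if v[1] <= budget_ms}
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name][0])


choice = most_accurate_within(4.0)  # hypothetical budget of 4 ms per image
speedup_l = BENCHMARKS["L"][1] / BENCHMARKS["L INT-8"][1]
map_drop_l = BENCHMARKS["L"][0] - BENCHMARKS["L INT-8"][0]

print(f"best variant within 4 ms: {choice}")
print(f"L -> L INT-8: {speedup_l:.2f}x faster, mAP drop {map_drop_l:.2f}")
print(f"L INT-8 throughput: about {images_per_second(4.78):.0f} images/s")
```

For the large model, quantization shortens latency by roughly a factor of 1.6 while mAP drops by about 0.1 points, which matches the statement that quantization has little influence on accuracy.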

Application examples of YOLO NAS

YOLO-NAS is ideal for a wide range of productive applications. Here are some examples of use:

Smart City

Surveillance systems and traffic flow analysis benefit from the rapid detection of multiple objects in real time.

[Image: YOLO-NAS in a smart city]

Production

Detection of products or machine parts in production plants with the option of detecting product faults.

[Image: YOLO-NAS in production]

Robotics

Environment and object detection for robots, e.g. for detecting objects that a robot should drive around or grip.

[Image: YOLO-NAS in robotics]

Science

Analysis of images from research and in the medical environment, e.g. for recognizing different cell types. 

[Image: YOLO-NAS in science]

Documents

Recognition of visual structures in documents, e.g. tables and figures.

[Image: YOLO-NAS document example 1]
[Image: YOLO-NAS document example 2]

Training and implementation

The Python library SuperGradients facilitates the training and implementation of YOLO-NAS. It offers predefined models in the variants S, M and L as well as training and optimization pipelines that enable fast and uncomplicated implementation, so developers can quickly get started with their own dataset. The SuperGradients library is published under the Apache 2.0 license and can therefore be used for commercial applications with very few restrictions.
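As a rough sketch of what getting started might look like, the following code loads a pre-trained variant and runs it on an image. It assumes that the `super-gradients` package is installed and that `models.get` accepts the model names `yolo_nas_s`, `yolo_nas_m` and `yolo_nas_l` together with `pretrained_weights="coco"`. Details can differ between library versions, so the SuperGradients documentation should be checked. The helper functions and the image path are hypothetical.

```python
# Assumed model identifiers for the three pre-trained variants
MODEL_VARIANTS = {"S": "yolo_nas_s", "M": "yolo_nas_m", "L": "yolo_nas_l"}


def load_yolo_nas(variant="S"):
    """Load a pre-trained YOLO-NAS variant (downloads the weights on first use)."""
    # Imported here so the module can be read without SuperGradients installed.
    from super_gradients.training import models

    return models.get(MODEL_VARIANTS[variant], pretrained_weights="coco")


def detect_and_show(image_path, variant="S"):
    """Run inference on one image and display the predicted bounding boxes."""
    model = load_yolo_nas(variant)
    model.predict(image_path).show()


# Example call with a hypothetical image file:
# detect_and_show("street_scene.jpg", variant="M")
```

Switching between the S, M and L variants only changes the model name, which makes it easy to compare speed and accuracy on your own images.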

Conclusion

The variants of the YOLO-NAS Object Detection Models are characterized by their outstanding accuracy and speed of inference. Thanks to modern architectures and the use of Python libraries, they can handle different types of datasets and deliver remarkable performance in object detection. The use of Neural Architecture Search enables flexible adaptation to different hardware configurations, making the models suitable for various use cases.

Key features include high efficiency, accuracy and flexibility. Depending on requirements, developers can choose between pre-trained model variants to achieve the best balance between speed and accuracy. YOLO-NAS offers advantages in applications such as smart cities, manufacturing, robotics and science.

The Python library SuperGradients simplifies training and implementation, allowing developers to get started quickly with their own dataset.

Nico Engelmann
