YOLO is an architecture and framework for computer vision. The framework enables the following tasks:
- Object detection
- Object tracking
- Instance segmentation
- Image classification
- Pose estimation
It is commonly used because of its speed and ease of deployment, and it can perform some, if not all, of these tasks in real time.
Its single-pass approach, in which the network processes the whole image once, lets it detect objects quickly, making it well suited to real-time applications such as live video.
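As a rough illustration of how one framework covers the tasks listed above, the sketch below maps each task to a pretrained checkpoint and runs it through the `ultralytics` Python package. It assumes `pip install ultralytics` and uses the YOLOv8 nano checkpoint names; the `TASK_WEIGHTS` table and the `weights_for` and `run_task` helpers are hypothetical names made up for this note.

```python
# Hypothetical lookup table: task name -> pretrained YOLOv8 nano checkpoint.
TASK_WEIGHTS = {
    "detect": "yolov8n.pt",
    "track": "yolov8n.pt",  # tracking reuses the detection weights
    "segment": "yolov8n-seg.pt",
    "classify": "yolov8n-cls.pt",
    "pose": "yolov8n-pose.pt",
}


def weights_for(task: str) -> str:
    """Return the checkpoint name for a task, or raise on an unknown task."""
    try:
        return TASK_WEIGHTS[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None


def run_task(task: str, source: str):
    """Load the matching model and run it on an image or video source."""
    # Imported lazily so the lookup helper works without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights_for(task))  # weights are downloaded on first use
    if task == "track":
        # Tracking runs detection frame by frame and links objects across frames.
        return model.track(source=source)
    return model.predict(source=source)
```

For example, `run_task("detect", "image.jpg")` would return a list of result objects for a placeholder image path, and `run_task("track", "video.mp4")` would do the same per frame for a placeholder video.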
COCO & Training
Ultralytics provides training tools on its GitHub page, and its website offers a user-friendly UI for training a model.
The models are pre-trained on the COCO dataset, which contains around 330K images, about 200K of which are annotated for object detection, segmentation, and captioning. It covers 80 object categories, such as bananas, birds, cars, and people.
Ultralytics also provides other datasets to make training your own model easier.
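A minimal sketch of fine-tuning from COCO-pretrained weights with the `ultralytics` Python package could look like this. It assumes the package is installed and uses `coco8.yaml`, a small sample dataset config shipped with Ultralytics; the `training_args` and `fine_tune` helpers, the epoch count, and the image size are illustrative choices, not recommended settings.

```python
def training_args(data: str = "coco8.yaml", epochs: int = 10, imgsz: int = 640) -> dict:
    """Collect keyword arguments for model.train(); values are illustrative."""
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    return {"data": data, "epochs": epochs, "imgsz": imgsz}


def fine_tune(weights: str = "yolov8n.pt", **overrides):
    """Start from pretrained weights and train on the given dataset config."""
    from ultralytics import YOLO  # lazy import: only needed when training

    model = YOLO(weights)
    return model.train(**training_args(**overrides))
```

Calling `fine_tune(data="my_dataset.yaml", epochs=50)` would train on a custom dataset config, where `my_dataset.yaml` is a placeholder for your own file.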
Featured models
Ultralytics develops and trains recent YOLO models such as YOLOv8 (and, earlier, YOLOv5), while other versions were created by other authors. The featured models are listed in the Ultralytics documentation.
Reference
YOLO (Ultralytics created YOLOV8 and later) is a model that does object detection, tracking, instance segmentation, image classification and pose estimation tasks.
— YOLO from Fleeting note YOLO Elixir