Intel’s latest AI research presents an alternative approach to training deep learning models for fast-paced, real-world use cases across a variety of industries
Object detection is the umbrella term for techniques that locate, classify, and identify objects in an image. Recent advances in artificial intelligence, driven by deep learning and image processing, have made it possible to classify images and find objects within them, and deep learning in particular has made object detection widely popular. Most methods in the literature, however, only perform well on images from the same domain they were trained on.
CNNs have shown strong results on domain-specific tasks, even though most architectures are optimized around well-known benchmarks. These domain-specific solutions are usually well-tuned to a single dataset, starting from carefully selected architectures and training techniques. The disadvantage of this approach is that it over-adapts the models to that dataset. Intel’s research team has developed a new strategy to address this problem, which also forms the basis of the Intel® Geti™ platform: a template of pre-trained, carefully selected models that can be applied to any dataset.
The authors tested three types of architectures: lightweight, medium, and highly accurate. This was done to determine the range of models that could be used across different object detection datasets, regardless of object complexity or size. Pre-trained weights are used so that the models converge quickly and start from high accuracy. Data augmentation enhances the images with random cropping, horizontal flipping, and brightness and color distortions. Multiscale training is applied to the medium and accurate models to increase their robustness. To strike a balance between complexity and accuracy, the authors also empirically selected input resolutions for each model after several trials. Finally, they combine early stopping with the ReduceOnPlateau adaptive learning-rate scheduler: if a few training epochs bring no improvement, training is halted.
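The early-stopping and plateau-scheduling logic described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the `ReduceOnPlateau` class, the `train` loop, the loss stream, and all thresholds (`patience`, `factor`, `min_delta`) are hypothetical stand-ins for whatever the authors actually used.

```python
class ReduceOnPlateau:
    """Simplified sketch of a ReduceOnPlateau scheduler: shrink the
    learning rate by `factor` when the monitored validation loss has
    not improved for `patience` consecutive epochs."""

    def __init__(self, lr=0.01, factor=0.5, patience=3, min_delta=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Metric improved: remember it and reset the plateau counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor  # plateau reached: decay the LR
                self.bad_epochs = 0
        return self.lr


def train(val_losses, stop_patience=6):
    """Early stopping: abort once `stop_patience` successive epochs bring
    no improvement. `val_losses` is a hypothetical per-epoch loss stream
    standing in for a real training loop."""
    sched = ReduceOnPlateau()
    best, since_best, lr = float("inf"), 0, sched.lr
    for epoch, loss in enumerate(val_losses):
        lr = sched.step(loss)
        if loss < best - 1e-4:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= stop_patience:
                return epoch + 1, lr, best  # stopped early
    return len(val_losses), lr, best


# Usage: the loss improves for three epochs, then plateaus; the scheduler
# decays the LR twice before early stopping fires at epoch 9.
epochs_run, final_lr, best_loss = train([1.0, 0.8, 0.7] + [0.7] * 7)
```

PyTorch users would typically reach for `torch.optim.lr_scheduler.ReduceLROnPlateau` instead of hand-rolling this, but the sketch shows why the two mechanisms pair well: the scheduler gives a stalled run a second chance at a lower learning rate, and early stopping ends the run only if that also fails to help.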