Automatic Labelling: A Big Leap in Data Preparation for AI/ML Models


Labeling (or tagging) data is an essential step in training computer vision models. As training demands more and more data, labeling has to become faster and less labor-intensive. This is where automatic labeling comes into the picture.


Challenge

When we started building a computer vision model to identify objects in images and video, we did not realize that labeling those objects could take us over a year. We needed to label thousands of objects in millions of images to train our model. Necessity is the mother of invention, and we were forced to come up with options to automate our instrument labeling task.

We could not use solutions from companies such as Hive, because those label objects with rectangular bounding boxes and we needed to label the exact boundaries of instruments.

Solution

We analyzed multiple tools that could help us reduce the time and cost of labeling. We evaluated the following tools:

Tool | Pros | Cons
Amazon SageMaker Ground Truth | Accurately labeled data, can manage big data, competitive pricing ($8/100 objects) | Needs machine learning experience to carry out labeling jobs
Lionbridge AI | Highly accurate labeled data, better project management features | Higher pricing
V7 Darwin | Speeds up labeling time dramatically | Bugs are not managed
Label | Open-source tool, user friendly | No project management features

 

Based on our selection criterion of 3-4 seconds for labeling, we selected Label.

We also needed to select a model that could help automate the labeling of objects. Among the various options we considered, we selected Detectron2.

This approach allowed us to finish the labeling task in a week.
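To make the idea of model-assisted labeling concrete, here is a minimal Python sketch of a pre-labeling step. It assumes a segmentation model (Detectron2 in our case) has already produced a label, a confidence score, and a boundary polygon for each detected object. The function name build_prelabels, the min_score threshold, the example labels, and the record layout are all hypothetical and chosen for illustration; they are not the file format of any particular labeling tool, and this is not our production pipeline.

```python
import json
from typing import Any, Dict, List


def build_prelabels(image_name: str,
                    predictions: List[Dict[str, Any]],
                    min_score: float = 0.8) -> Dict[str, Any]:
    """Turn model predictions into a draft annotation record for human review.

    Each prediction is assumed to be a dict with the keys
    "label", "score" and "polygon" (a list of (x, y) points).
    """
    shapes = []
    for pred in predictions:
        if pred["score"] < min_score:
            # Low-confidence objects are skipped and left for manual labeling.
            continue
        shapes.append({
            "label": pred["label"],
            "points": [list(point) for point in pred["polygon"]],
            "score": round(pred["score"], 3),
            "needs_review": True,  # a person still confirms every draft label
        })
    return {"image": image_name, "shapes": shapes}


# Hypothetical predictions for one video frame.
predictions = [
    {"label": "forceps", "score": 0.93,
     "polygon": [(10, 10), (50, 12), (48, 40)]},
    {"label": "scalpel", "score": 0.41,
     "polygon": [(5, 5), (6, 9), (9, 7)]},
]

record = build_prelabels("frame_0001.jpg", predictions)
print(json.dumps(record, indent=2))
```

With the default threshold, only the first object is written as a draft label; the second falls below the threshold and would be labeled by hand. Keeping a review flag on every draft reflects the point that automation speeds labeling up but does not remove the need for quality control.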

Results and Learning:

  • With the advent of Big Data, the future of data labeling is active learning.
  • Data labeling requires quality control, manual intervention, and collaboration to produce high-quality training data.
  • The cost of data annotation was reduced fivefold.
  • Automation created too many data points, so we created another algorithm to reduce their number.
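The last point, reducing the number of data points on an automatically generated boundary, can be illustrated with a standard polyline simplification technique. The sketch below uses the Ramer-Douglas-Peucker algorithm in plain Python. This is an assumption for illustration only: it is one common way to thin out boundary points, not necessarily the algorithm we built, and the names simplify_polyline and tolerance are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # a and b coincide, so measure the distance to that single point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_polyline(points: List[Point], tolerance: float) -> List[Point]:
    """Ramer-Douglas-Peucker: drop points closer than `tolerance`
    to the simplified outline, keeping the first and last point."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = _point_segment_distance(points[i], first, last)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        # Every interior point is close enough to the chord: keep the ends only.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves recursively.
    left = simplify_polyline(points[:index + 1], tolerance)
    right = simplify_polyline(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


# A nearly straight edge followed by a sharp corner.
outline = [(0, 0), (1, 0.01), (2, 0), (3, 0.02), (4, 0), (4, 4)]
simplified = simplify_polyline(outline, tolerance=0.1)
print(simplified)  # [(0, 0), (4, 0), (4, 4)]
```

The sketch treats the boundary as an open polyline from its first to its last point. For a closed polygon, one option is to split the outline at two far-apart vertices and simplify each half separately. The tolerance is measured in the same units as the coordinates (pixels for image labels), so a larger value removes more points at the cost of a coarser boundary.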

Our automatic labeling solution is now available to all our customers at no charge for the models we are developing. If you have any questions about how to auto-label images, please contact us at [email protected].

Visit our website at www.nextgeninvent.com
