Meta AI Team Open-Sources Mephisto: A New Platform For Open And Collaborative Collection Of Data To Train ML Models
Training new AI models requires a variety of data sets, yet many commonly used data sets are contaminated with labeling mistakes. These errors make it difficult to develop robust models, especially for new tasks. To overcome these limitations, many researchers apply a variety of data quality control procedures, but there is no central repository of these strategies.
Researchers at Meta AI have released Mephisto, a platform for collecting, sharing, and iterating on the best approaches to gathering data sets for training AI models. With Mephisto, researchers can share their own collection strategies in a reusable and iterable format. They can also swap out components and quickly find the annotations they need, which lowers the barrier to creating custom tasks.
In Mephisto, the team identifies common paths for taking complex annotation tasks from concept to data collection. Mephisto improves not only the quality of data sets but also the experience of the researchers and annotators who create them.