Researchers at Stanford have developed an artificial intelligence (AI) approach called MEND for fast model editing at scale.
The performance of large models has improved on a variety of computer vision problems and, in particular, in natural language processing. But such models are difficult to deploy and maintain if they must be patched after deployment: because knowledge is distributed across a model's parameters, it is hard to make a localized change that corrects a single unwanted output without side effects. For example, when asked "Who is the Prime Minister of the United Kingdom (UK)?", a large language model trained in 2019 may assign a greater probability to Theresa May than to Boris Johnson.
An ideal editing procedure would update the model's parameters quickly to increase the relative probability of Boris Johnson without altering its outputs on other inputs. Such edits should be reliable, changing the model's output on the problematic input (e.g., "Who is the Prime Minister of the United Kingdom?"). They should exhibit locality, not affecting the model's output on unrelated inputs (e.g., "What team does Messi play for?"). And they should exhibit generality, producing correct outputs for inputs related to the edited one (e.g., paraphrases such as "Who is the UK's Prime Minister?"). It is easy to attempt such an edit by fine-tuning on the single example with its new label, but fine-tuning on one example tends to overfit, even when the distance between the pre- and post-fine-tuning parameters is constrained to be small.
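The locality failure described above can be seen even in a toy setting. The sketch below is purely illustrative (a two-parameter linear model edited by plain gradient descent, with made-up inputs and targets), not the paper's method: fine-tuning on the single "edit" example makes that prediction reliable, but drags along the prediction on an unrelated input that shares a feature.

```python
# Toy sketch (NOT the paper's method): a 2-parameter linear "model" edited by
# plain gradient descent on a single example. All inputs, targets, and
# hyperparameters here are illustrative assumptions.

def predict(w, x):
    # Linear score for input vector x under parameters w.
    return sum(wi * xi for wi, xi in zip(w, x))

def finetune_single_example(w, x_edit, y_edit, lr=0.5, steps=20):
    # Gradient descent on squared error for ONE example only.
    w = list(w)
    for _ in range(steps):
        err = predict(w, x_edit) - y_edit
        w = [wi - lr * err * xi for wi, xi in zip(w, x_edit)]
    return w

w0 = [1.0, 1.0]
x_edit, y_edit = [1.0, 0.0], 3.0   # the "problematic" input we want changed
x_other = [1.0, 1.0]               # an unrelated input that should stay put

w1 = finetune_single_example(w0, x_edit, y_edit)

print(round(predict(w1, x_edit), 3))   # 3.0 -- the edit is reliable
print(round(predict(w0, x_other), 3))  # 2.0 -- unrelated output before the edit
print(round(predict(w1, x_other), 3))  # 4.0 -- it drifted: a locality failure
```

Because the edited input and the unrelated input share the first feature, updating the weight that fixes one necessarily moves the other; in a large network with heavily shared parameters this entanglement is far worse.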
This overfitting can lead to both locality and generality failures. The researchers' experiments showed that fine-tuning on the edit example while continuing to train on the original training set improves locality, but still falls short on generality. It also requires continuous access to the full training set at test time, which is computationally demanding. As an alternative, recent research has examined methods that learn how to edit models. One such line of work presents a bi-level meta-learning objective that finds a model initialization from which standard fine-tuning on a single edit example produces useful modifications.
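The structure of that bi-level objective can be sketched in miniature. The code below is an illustrative assumption, not the actual method from the literature: the inner loop takes one fine-tuning step on a single edit example; the outer loop adjusts the initialization (here by numerical gradient descent on a tiny linear model) so that, averaged over possible edits, the post-edit model both produces the edited answer (reliability) and stays unchanged on an unrelated input (locality). All inputs, targets, and hyperparameters are made up.

```python
# Toy bi-level meta-learning sketch (illustrative only, not the paper's method).

def predict(w, x):
    # Linear score for input vector x under parameters w.
    return sum(wi * xi for wi, xi in zip(w, x))

def inner_edit(w, x_e, y_e, inner_lr=0.25):
    # Inner loop: one gradient step on the single edit example (squared error).
    err = predict(w, x_e) - y_e
    return [wi - inner_lr * err * xi for wi, xi in zip(w, x_e)]

def outer_loss(w, edits, x_loc, y_loc):
    # Bi-level objective: after each hypothetical edit, the model should
    # (a) give the edited answer and (b) be unchanged on an unrelated input.
    total = 0.0
    for x_e, y_e in edits:
        w_post = inner_edit(w, x_e, y_e)
        total += (predict(w_post, x_e) - y_e) ** 2       # reliability term
        total += (predict(w_post, x_loc) - y_loc) ** 2   # locality term
    return total / len(edits)

def meta_train(w, edits, x_loc, y_loc, lr=0.05, steps=300, eps=1e-5):
    # Outer loop: numerical-gradient descent on the bi-level objective,
    # searching for an initialization from which edits work well.
    w = list(w)
    for _ in range(steps):
        g = []
        for i in range(len(w)):
            wp, wm = list(w), list(w)
            wp[i] += eps
            wm[i] -= eps
            g.append((outer_loss(wp, edits, x_loc, y_loc)
                      - outer_loss(wm, edits, x_loc, y_loc)) / (2 * eps))
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

edits = [([1.0, 0.3], 2.0), ([1.0, 0.3], -1.0)]  # two conflicting candidate edits
x_loc, y_loc = [0.3, 1.0], 0.5                    # behaviour that must not drift
w0 = [0.0, 0.0]
w_star = meta_train(w0, edits, x_loc, y_loc)
# After meta-training, the bi-level objective is lower than at the naive start.
print(outer_loss(w_star, edits, x_loc, y_loc) < outer_loss(w0, edits, x_loc, y_loc))
```

Meta-training over a distribution of edits (rather than a single fixed one) is what keeps the objective from collapsing into simply memorizing one edit; real methods apply the same inner/outer structure to full neural networks with automatic differentiation.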