ARTIFICIAL INTELLIGENCE
AI Models Gain New Skills Without Forgetting Through Selective Retraining
A University of Illinois Urbana-Champaign study reveals selective retraining can help AI models learn new skills while preserving old ones, reducing costs and improving stability.
- Read time: 5 min
- Word count: 1,147 words
- Date: Oct 15, 2025
Summary
Researchers at the University of Illinois Urbana-Champaign have discovered that selectively retraining specific layers of large AI models, such as self-attention and upper MLP components, allows them to acquire new abilities without suffering from catastrophic forgetting. This innovative approach preserves existing knowledge while integrating new tasks, leading to significant reductions in retraining costs and enhanced model stability. The study tested this method on multimodal models, demonstrating that perceived knowledge loss is often a temporary output bias rather than true forgetting. These findings offer a more efficient path for developers to update complex AI systems, addressing a critical challenge in enterprise AI development.

A groundbreaking study from the University of Illinois Urbana-Champaign suggests that when large artificial intelligence models are fine-tuned, the observed loss of previously acquired skills may not indicate true forgetting. Instead, it could be attributed to a temporary bias in the model’s output, a discovery with significant implications for AI development and maintenance. By strategically retraining only specific layers within these complex models, such as the self-attention and upper Multilayer Perceptron (MLP) components, researchers found a way to impart new capabilities while meticulously safeguarding existing knowledge.
This targeted approach to retraining promises not only to reduce the substantial cost of updating large AI models but also to improve their stability and reliability. The research team applied the method to multimodal models including LLaVA and Qwen2.5-VL, using selective layer fine-tuning to quantify learning gains, assess stability, and measure knowledge retention across a diverse set of tasks.
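To make the idea concrete, here is a minimal sketch of layer-selective fine-tuning in PyTorch, assuming a Hugging Face-style transformer: every weight is frozen except the self-attention projection matrices, and only those parameters are handed to the optimizer. The checkpoint path and the parameter-name patterns (q_proj, k_proj, v_proj, o_proj) are illustrative assumptions following common naming conventions, not details taken from the paper.

```python
# Minimal sketch: fine-tune only the self-attention projection layers and
# freeze everything else. The checkpoint path and the parameter-name patterns
# are placeholders, not the study's exact configuration.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # placeholder checkpoint

# Names commonly used for attention projection matrices in transformer implementations.
SA_PROJ_PATTERNS = ("q_proj", "k_proj", "v_proj", "o_proj")

for name, param in model.named_parameters():
    # A parameter stays trainable only if it belongs to a self-attention projection.
    param.requires_grad = any(pattern in name for pattern in SA_PROJ_PATTERNS)

trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)

# ... a standard fine-tuning loop on the new target task would go here ...
```

The same pattern would extend to multimodal checkpoints such as LLaVA or Qwen2.5-VL, whose language backbones typically expose similar projection-layer names.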
The insights from this study point toward a more efficient and sustainable paradigm for enterprises and developers, offering a practical path for updating large language and multimodal models without compromising their established performance baselines. This distinction is particularly important for enterprise AI teams, who frequently grapple with the challenge of teaching a model new tasks without degrading its existing capabilities.
Refining AI Learning: Addressing the Challenge of Catastrophic Forgetting
Training advanced large multimodal models represents a monumental undertaking, often demanding millions of dollars and several weeks to complete. As these models and their corresponding datasets continue to grow exponentially in scale and complexity, the prospect of retraining them from the ground up becomes an increasingly daunting and resource-intensive endeavor. This challenge is compounded by the phenomenon known as “catastrophic forgetting,” a problem where a model, after being fine-tuned for a new, narrow task, tends to lose its proficiency in previously mastered skills.
“One option is to simply fine-tune the model on the new task,” the research team explained. “However, at least for simpler models, fine-tuning is known to cause catastrophic forgetting, such that a model previously proficient on many tasks becomes a narrow expert on the new one.” To investigate whether this issue still affects today’s large multimodal models, the researchers designed a controlled evaluation. They trained the selected models on five target tasks spanning bird classification, object counting, medical visual question answering, optical character recognition (OCR) reading, and time reading. Following this fine-tuning, the team measured the performance decline across eight standard benchmarks that were deliberately excluded from the fine-tuning data.
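For readers who want to see the shape of this protocol, the sketch below expresses it as a simple before/after comparison. The task names, benchmark list, and the finetune_on_task / evaluate_accuracy helpers are hypothetical placeholders standing in for the study’s actual training and evaluation harness.

```python
# Sketch of the forgetting evaluation: fine-tune on one narrow target task, then
# compare held-out benchmark accuracy before and after. All names below are
# hypothetical placeholders, not the study's actual harness.

TARGET_TASKS = ["bird_classification", "object_counting", "medical_vqa",
                "ocr_reading", "time_reading"]
HELD_OUT_BENCHMARKS = [f"benchmark_{i}" for i in range(1, 9)]  # eight standard benchmarks

def measure_forgetting(model, task, finetune_on_task, evaluate_accuracy):
    """Return the per-benchmark accuracy drop after fine-tuning `model` on `task`."""
    before = {b: evaluate_accuracy(model, b) for b in HELD_OUT_BENCHMARKS}
    finetune_on_task(model, task)  # e.g. tuning only the SA Proj layers, as above
    after = {b: evaluate_accuracy(model, b) for b in HELD_OUT_BENCHMARKS}
    return {b: before[b] - after[b] for b in HELD_OUT_BENCHMARKS}
```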
These comprehensive experiments yielded two pivotal discoveries, as outlined in the research paper. First, the team found that by selectively tuning only the self-attention projection layers (SA Proj)—critical components responsible for enabling the model to determine which input elements warrant its focus—the models demonstrated an impressive ability to assimilate new tasks with minimal to no measurable forgetting of prior knowledge. The second significant finding revealed that what initially appeared as forgotten knowledge frequently reappeared and became accessible when the model was subsequently trained on another specialized task.
“We thus hypothesize that perhaps what looks like forgetting or interference after fine-tuning on a narrow target task is actually bias in the output distribution due to the task distribution shift,” the researchers elaborated. They confirmed this hypothesis through an in-depth analysis conducted while tuning the counting task. The analysis showed that tuning the MLP layers did boost target accuracy, but it also raised the likelihood of the model outputting numeric tokens, producing a highly correlated drop in accuracy on unrelated, held-out tasks. Conversely, tuning only the self-attention layers achieved the desired learning on the new task without introducing a significant bias toward numeric tokens and, crucially, without sacrificing accuracy on the held-out tasks.
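A bias probe of this kind can be approximated in a few lines of code: measure how much next-token probability mass the model puts on numeric tokens for a fixed set of unrelated prompts, before and after fine-tuning. The sketch below is an illustrative approximation, not the authors’ exact probe, and assumes Hugging Face-style model and tokenizer objects.

```python
# Illustrative counting-bias probe (an approximation, not the paper's exact method):
# average next-token probability mass assigned to digit tokens over a prompt set.
import torch

def numeric_token_mass(model, tokenizer, prompts):
    # Token ids whose decoded string is a bare digit (one simple definition of "numeric").
    numeric_ids = [i for i in range(tokenizer.vocab_size)
                   if tokenizer.decode([i]).strip().isdigit()]
    masses = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0, -1]      # next-token logits
        probs = torch.softmax(logits, dim=-1)
        masses.append(probs[numeric_ids].sum().item())  # mass on numeric tokens
    return sum(masses) / len(masses)
```

A rise in this value after fine-tuning on counting, correlated with the accuracy drop on held-out tasks, would point to an output-distribution bias rather than genuine loss of knowledge.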
The study’s findings underscore that the apparent performance degradation on previously mastered tasks following narrow fine-tuning is often transient: performance that declines early can often be recovered later. The researchers traced this behavior to a measurable shift in the next-token distribution rather than a fundamental loss of conceptual understanding. A straightforward counting-bias probe made this distributional drift observable, and a layer-wise residual-to-logit analysis showed that most of the shift was driven by later MLP blocks in the model architecture rather than by the self-attention mechanisms. These insights provide a more nuanced picture of how AI models adapt and retain information, pointing the way toward more robust and efficient training methodologies.
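One common way to approximate a layer-wise residual-to-logit analysis is the “logit lens”: project each layer’s hidden state through the model’s final normalization and unembedding to see how the next-token distribution evolves with depth. The sketch below assumes Hugging Face-style attribute names (model.model.norm, model.lm_head) and reuses the numeric-token ids from the probe above; the paper’s actual analysis may differ in detail.

```python
# Logit-lens-style sketch: project each layer's residual stream through the final
# norm and unembedding, then track numeric-token mass by depth. Attribute names
# (model.model.norm, model.lm_head) are assumptions based on common conventions.
import torch

def layerwise_numeric_bias(model, tokenizer, prompt, numeric_ids):
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    biases = []
    for hidden in out.hidden_states:                 # embeddings plus one entry per layer
        resid = model.model.norm(hidden[0, -1])      # apply the final normalization
        logits = model.lm_head(resid)                # unembed to vocabulary logits
        probs = torch.softmax(logits, dim=-1)
        biases.append(probs[numeric_ids].sum().item())
    return biases  # a jump in the later entries implicates late blocks in the bias
```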
Strategic Retraining: Influencing Enterprise AI Development
Industry experts and analysts are closely observing these findings, anticipating that they will significantly reshape how enterprises approach the critical tasks of AI model maintenance and ongoing optimization. This novel research introduces a paradigm shift from extensive, full-model retraining to more focused, efficient updates.
“The research claims an innovative approach that could redefine enterprise developer practices, which can save cost and time as it introduces layer-specific retraining,” commented Faisal Kawoosa, founder and lead analyst at Techarc. He emphasized that this method directly addresses the pervasive issue of “catastrophic forgetting,” a long-standing impediment to efficient AI development. “The tuning of self-attention projection layers (SA Proj) has resulted in learning outcomes without any drop in performance,” Kawoosa added, highlighting the practical benefits of this selective approach.
Despite the promising nature of these initial findings, Kawoosa stressed the importance of further validation. He advocated for more extensive testing across a broader range of scenarios and diverse operational environments. Such comprehensive validation will be essential to definitively confirm the approach’s effectiveness, scalability, and robust performance within demanding enterprise settings. Enterprises often operate under unique constraints and with specific data types, necessitating thorough evaluation before widespread adoption.
Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research, echoed similar sentiments, suggesting that the targeted retraining strategy proposed by the researchers has the potential to make AI model maintenance considerably less disruptive for technology teams. “Instead of giant retraining projects that eat up quarters and capital, updates can now happen quietly and often, more like servicing a car than rebuilding the engine,” Gogia articulated, drawing a vivid analogy that underscores the shift toward agile, incremental improvements.
However, Gogia also provided a cautionary note regarding the widespread adoption of partial retraining at scale. He emphasized that such an advanced methodology will necessitate more robust development processes and stringent governance frameworks within organizations. “Partial retraining only works when process catches up with promise,” Gogia asserted, highlighting the need for foundational support. He further advised that “enterprises will need proper scaffolding around this workflow, including version control, monitoring, and reproducibility, to make it sustainable at scale.” This means establishing clear protocols and tools to manage model versions, track performance, and ensure that retraining results can be consistently replicated, crucial elements for integrating this innovative approach into enterprise-level AI operations.