MIT research team announces 'SEAL', a framework that realizes 'self-learning AI', AI edits new information by itself, reinforces learning and becomes smarter



A research team at the Massachusetts Institute of Technology (MIT) has developed an AI self-learning framework called ' Self-Adapting Language Models (SEAL) '. When an AI model using SEAL encounters new information, it can edit the information itself, turn it into learning data, and apply reinforcement learning to itself.

Self-Adapting Language Models

https://jyopari.github.io/posts/seal

[2506.10943] Self-Adapting Language Models
https://arxiv.org/abs/2506.10943

When given new input, an AI model that uses SEAL constructs its own fine-tuning data and optimization instructions by reconstructing information in various ways, specifying optimization parameters, and performing generation involving data augmentation and gradient-based updates. This process is called 'Self-Edit,' and the AI model uses the self-edited results for its own reinforcement learning.



The reinforcement learning algorithm uses '

ReST^{EM} ' developed by Google Deepmind and others, and performs multiple patterns of reinforcement learning to leave the best performing one. In other words, when an AI model that applies SEAL encounters new knowledge, it self-learns and becomes even smarter.



The research team applied SEAL to the language model '

Qwen2.5-7B ' developed by Alibaba to investigate the degree of performance improvement through self-learning. The graph below shows the results of the investigation. The model with SEAL applied to Qwen2.5-7B (red line) is in the initial state. Although its performance was lower than GPT 4.1 (green line), after two rounds of self-learning it succeeded in exceeding GPT 4.1.



Although it seems that SEAL's self-learning can be repeated to improve performance infinitely, in an actual experimental environment, it was confirmed that repeating self-learning causes a phenomenon in which 'tasks that were previously executable become inexecutable.' The research team calls this phenomenon 'catastrophic forgetting' and points out the need for a mechanism to avoid forgetting and retain knowledge.

The code related to SEAL is available at the following link:

GitHub - Continual-Intelligence/SEAL: Self-Adapting Language Models
https://github.com/Continual-Intelligence/SEAL



in Software,   Science, Posted by log1o_hf