⏱️ Estimated reading time: 15 min

Introduction

Large Language Models (LLMs) are fundamentally static systems. Once training concludes, the model’s parameters are frozen, and it cannot adapt to new environments or learn from new experiences. This fundamental limitation became the central topic of a survey paper published on arXiv in July 2025 (arXiv:2507.21046), authored by 27 researchers across 51 pages.

This paper systematically defines the concept of Self-Evolving Agents and proposes a comprehensive framework to understand how AI systems can autonomously evolve. This document analyzes the core contents of this survey and examines implications for the path toward Artificial Super Intelligence (ASI).

Core Concepts: What Are Self-Evolving Agents?

Limitations of Static LLMs

Current LLMs face the following fundamental limitations:

Dimension Static LLM Self-Evolving Agent
Knowledge Fixed at training time Continuously updated
Adaptation Impossible without retraining Autonomous real-time adaptation
Environment Preset assumptions Dynamic environment response
Feedback Cannot incorporate Incorporated and learned
Goals Fixed Can be autonomously adjusted

Defining Self-Evolving Agents

Self-Evolving Agents are AI systems with the following three core capabilities:

  1. Adaptive Learning: Continuously improve performance through experience and feedback without human intervention
  2. Autonomous Evolution: Independently modify their own behavior, strategy, and even goal settings
  3. Multimodal Integration: Simultaneously process and learn from diverse information forms including text, images, audio, and video

Three-Dimensional Evolution Framework

The paper proposes a framework that decomposes the evolution of AI agents into three dimensions: What evolves, When it evolves, and How it evolves.

Dimension 1: What Evolves?

Evolution targets are divided into three levels:

Micro Level: Token-Level Adaptation

  • Real-time adjustment of next-token prediction probability distributions
  • Context-specific output optimization

Meso Level: Task-Level Learning

  • Learning efficient solution strategies for specific task types
  • Building structured knowledge for tool use and problem decomposition

Macro Level: Meta-Learning and Goal Adjustment

  • Learning “how to learn” itself
  • Autonomously adjusting or extending goal structures

Dimension 2: When Does It Evolve?

Intra-Test-Time Evolution

  • Optimization occurring while solving a single problem
  • Examples: Chain-of-Thought, Self-Reflection

Inter-Test-Time Evolution

  • Gradual improvement accumulated across multiple problem-solving sessions
  • Examples: experience replay, memory systems

Lifelong Evolution

  • Continuous learning and evolution over extended periods
  • Key challenge: balancing knowledge retention with new learning

Trigger Mechanisms

  • Performance-based: Triggered when performance falls below threshold
  • Novelty-based: Triggered when novel situations are detected
  • Schedule-based: Triggered at fixed intervals

Dimension 3: How Does It Evolve?

Learning Signal Types

Signal Type Methods Strengths
Scalar Rewards RL, PPO, GRPO Objective evaluation
Textual Feedback Critique, Self-reflection Rich feedback
Multi-agent Dynamics Debate, Competition Diverse perspectives

Update Mechanisms

  • Gradient-based methods: Direct parameter updates through backpropagation
  • Evolutionary algorithms: Population-based optimization
  • Reinforcement Learning: Reward-based behavioral optimization
  • Meta-learning: Learning to learn

Application Domains

Software Development

  • AgentCoder: Self-improving coding agent that enhances code quality through autonomous code review and rewriting cycles
  • SEW (Self-Evolving Workshop): Collaborative evolution system where multiple agents specialize in different programming tasks

Education

Self-evolving tutoring agents can model each student’s learning level and knowledge gaps, automatically adjusting difficulty and explanation style to create personalized learning experiences that were previously impossible.

Finance

  • QuantAgent: Quantitative trading agent that autonomously discovers and verifies new trading strategies
  • Continuously adapts to changing market conditions and improves investment strategies

Exploration

  • Voyager: Self-evolving exploration agent in Minecraft environments
  • Autonomously discovers new skills and expands the action space

Benchmarks and Evaluation

The paper surveys over 30 benchmarks covering various aspects of self-evolving agents:

Benchmark Evaluation Focus
DSBench Data science task solving
SWE-bench Software engineering tasks
LifelongAgentBench Long-term adaptation
MultiAgentBench Multi-agent collaboration

Evolution of Evaluation Metrics

Traditional AI evaluation is insufficient for Self-Evolving Agents. The following metrics are particularly important:

  • Adaptation Speed: How quickly does the agent adapt to new environments?
  • Generalization Capability: Can it apply learned knowledge to unseen tasks?
  • Stability: Does performance remain stable during the evolution process?
  • Efficiency: How much computational resource does evolution consume?

Future Directions

Personalized AI

AutoPal: A Self-Evolving Agent designed to deeply understand each individual user’s preferences, needs, and interaction style. It can:

  • Build a personalized knowledge graph for each user
  • Adaptively adjust communication style and depth of explanation
  • Proactively propose items of interest based on long-term memory

Generalization

To achieve true intelligence, the following capabilities are needed:

  • Cross-domain transfer: Applying knowledge from one domain to another
  • Compositional reasoning: Solving novel problems by combining known elements
  • Causal reasoning: Understanding cause-and-effect relationships rather than just correlations

Safety

As AI systems autonomously evolve, new safety challenges emerge:

  • Privacy Protection: What should be remembered and what should be forgotten?
  • Value Alignment: How to maintain human-compatible values during evolution
  • Controllability: How to ensure humans can intervene in the evolution process

Multi-Agent Ecosystems

Complex real-world problems are difficult for individual agents to solve. The paper introduces MDTeamGPT, a medical diagnosis support system where specialized agents with different expertise collaborate.

The Path to ASI

Defining ASI

The paper defines ASI (Artificial Super Intelligence) as a system that:

  1. Exceeds human-level performance in all cognitive domains
  2. Can learn autonomously and continuously
  3. Can set and modify its own goals
  4. Can solve novel problems in unseen situations

Four-Stage Roadmap

The paper proposes an evolutionary roadmap consisting of the following stages:

Stage 1: Domain-Specific Excellence

Self-Evolving Agents that outperform human experts in specific domains. Current advanced AI systems are already at this stage.

Stage 2: Cross-Domain Integration

Agents that integrate knowledge across multiple domains and solve complex interdisciplinary problems. This requires Meta-Learning capabilities and cross-domain transfer learning.

Stage 3: Autonomous Goal Setting

AI systems that can set their own goals and determine learning directions without human intervention. This stage poses the greatest safety challenges.

Stage 4: Collective Super Intelligence

Multiple AI agents collaborating and co-evolving to solve problems beyond the capability of any individual, potentially leading to emergent intelligence.

Social Impacts

Positive Impacts

  • Scientific Acceleration: Significant acceleration of research cycles through autonomous hypothesis generation and validation
  • Problem Solving: Finding solutions to complex global challenges such as climate change and disease
  • Personalization: Truly personalized education, healthcare, and services

Challenges

  • Control Problem: How to ensure AI systems evolve within boundaries humans desire
  • Inequality Issues: Risk of polarization between those with access to advanced AI and those without
  • Job Displacement: Automation risk for complex cognitive tasks

Research Landscape

The paper surveys major research groups worldwide:

  • KAIST: Adaptive learning and multi-agent systems
  • Seoul National University: Reinforcement learning and meta-learning
  • Yonsei University: Natural language processing and knowledge evolution
  • Naver: Practical AI service application research
  • Kakao: Personalized AI and recommendation system evolution
  • LG AI Research: Industrial AI autonomy research

Research Methodology

The survey adopts the following research methodology:

Comprehensive Benchmarks

Systematic evaluation through over 30 diverse benchmarks covering various aspects of self-evolving agents.

Longitudinal Studies

Long-term performance change tracking to distinguish true evolution from temporary performance improvements.

Open Source

Most major research is publicly shared to promote collaboration within the research community.

Ethics and Societal Responsibility

Transparency

Self-Evolving Agents must be able to explain their decision-making processes and evolution outcomes.

Fairness

The evolution process must not amplify biases based on race, gender, or other attributes.

Social Responsibility

There must be appropriate mechanisms to prevent harm to society from evolved AI systems.

Conclusion

Self-Evolving Agents represent a fundamental paradigm shift beyond mere technical advancement in AI. The three-dimensional framework (What, When, How) presented in the paper provides a comprehensive conceptual map for understanding how AI systems can autonomously evolve.

The four-stage roadmap from domain-specific excellence to collective super intelligence is ambitious, but each stage builds upon accumulated research and practical achievements. Particularly, current advanced systems are already approaching Stage 1 and partially Stage 2.

However, the core challenges remain: safety, controllability, and value alignment. The power of Self-Evolving Agents is enormous, but ensuring that power evolves in directions beneficial to humanity is the most important research question of our time.


References: