Research
Wentao Zhang’s research on LLM Agents, Self-Evolving Agents, and Financial AI.
Research Themes
- Self-Evolving Agents — Protocols and architectures enabling agents to evolve their own prompts, tools, memory, and sub-agents autonomously
- LLM Agents & Multi-Agent Orchestration — Hierarchical frameworks, tool use, long-horizon planning, and standardized agent communication protocols
- General Computer Control — Foundation agents that operate arbitrary software using only pixels and natural language
- AI4Finance — End-to-end financial platforms, algorithmic trading, portfolio management, and HFT via RL and LLMs
- Reinforcement Learning — Deep RL for sequential decision-making in complex, partially observable environments
Featured Projects
Autogenesis: A Self-Evolving Agent Protocol
Autogenesis addresses a fundamental limitation of current LLM agent systems: they are static. Prompts, tools, and behaviors are fixed at design time and cannot improve from experience.
Two tightly coupled layers power the system:
- Resource Substrate Protocol Layer — models prompts, agents, tools, and memory as versioned resources with explicit lifecycles, enabling safe mutation and rollback
- Self-Evolution Protocol Layer — a closed-loop system where the agent monitors its own performance, identifies failure modes, and autonomously rewrites its own resources to improve
The result is an agent that gets measurably better at complex planning and tool-use tasks through runtime self-modification — without human intervention.
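The two layers can be illustrated with a minimal sketch. It assumes a resource is just a list of text versions, and that a candidate rewrite is kept only if a scoring function improves. The names `VersionedResource`, `evolve`, `propose`, and `score` are hypothetical and are not taken from the Autogenesis codebase.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VersionedResource:
    """Hypothetical resource (prompt, tool spec, memory entry) with a version history."""
    name: str
    history: List[str] = field(default_factory=list)

    @property
    def current(self) -> str:
        return self.history[-1]

    @property
    def version(self) -> int:
        return len(self.history)

    def mutate(self, new_content: str) -> int:
        """Append a new version and return its version number."""
        self.history.append(new_content)
        return self.version

    def rollback(self) -> int:
        """Discard the latest version, always keeping the initial one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.version


def evolve(resource: VersionedResource,
           propose: Callable[[str], str],
           score: Callable[[str], float]) -> bool:
    """One closed-loop step: propose a rewrite and keep it only if the score improves."""
    baseline = score(resource.current)
    resource.mutate(propose(resource.current))
    if score(resource.current) <= baseline:
        resource.rollback()  # the rewrite did not help, so restore the previous version
        return False
    return True


# Toy stand-ins (assumptions): the proposer appends a hint, and the scorer
# rewards longer prompts up to a cap of 40 characters.
prompt = VersionedResource("planner_prompt", ["Plan the task."])
add_hint = lambda p: p + " Think step by step."
capped_length = lambda p: min(len(p), 40)
accepted = evolve(prompt, add_hint, capped_length)
```

In a real system the proposer would be an LLM call and the scorer a task benchmark; the point here is only that mutation and rollback stay cheap when every resource carries its own history.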
Cradle: Empowering Foundation Agents towards General Computer Control
How do you build an agent that can use any computer software — without task-specific APIs or hand-coded integrations?
Cradle's answer: treat the screen as the universal interface. Agents receive screenshots as input and produce keyboard and mouse actions as output — exactly how humans interact with computers. This unlocks operation across arbitrary software: games (Red Dead Redemption 2, Stardew Valley, Cities: Skylines), browsers, email clients, and creative tools — all with the same agent.
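The screen-as-interface idea reduces to a perceive, decide, act loop. The sketch below is illustrative only: `capture`, `policy`, and `execute` are hypothetical callables standing in for screenshot capture, the foundation model, and an input driver, and none of them reflect Cradle's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class KeyPress:
    key: str


@dataclass
class MouseClick:
    x: int
    y: int


Action = Union[KeyPress, MouseClick]
Screenshot = bytes  # placeholder for raw pixels in this sketch


def control_loop(capture: Callable[[], Screenshot],
                 policy: Callable[[Screenshot], List[Action]],
                 execute: Callable[[Action], None],
                 max_steps: int) -> List[Action]:
    """Run the perceive-decide-act loop and return the actions that were executed."""
    trace: List[Action] = []
    for _ in range(max_steps):
        frame = capture()          # observation: pixels only
        actions = policy(frame)    # decision: keyboard and mouse actions
        if not actions:            # an empty list signals that the policy is done
            break
        for action in actions:
            execute(action)
            trace.append(action)
    return trace


# Stubbed usage: a scripted policy replaces the model, and a list records the inputs.
script = iter([[MouseClick(10, 20)], [KeyPress("enter")], []])
executed: List[Action] = []
trace = control_loop(lambda: b"\x00", lambda frame: next(script), executed.append, max_steps=5)
```

Because the loop depends only on pixels in and input events out, swapping the target application requires no change to the loop itself.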
Cradle also incorporates self-improvement: agents curate and refine a skill library from past experience, enabling progressive capability growth on new tasks.
⭐ 2.5k GitHub stars
AgentOrchestra: Hierarchical Multi-Agent Orchestration with the TEA Protocol
The Tool-Environment-Agent (TEA) protocol treats environments, agents, and tools as first-class resources with explicit lifecycles and versioned interfaces — solving the fragile, ad-hoc wiring that plagues most multi-agent systems.
AgentOrchestra builds on TEA with a central planner that dynamically spawns and coordinates specialist sub-agents (web navigation, data analysis, file operations). It achieves 89.04% on the GAIA benchmark for general-purpose agents, a state-of-the-art result.
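A rough sketch of the planner-and-specialists pattern follows. The registry, the fixed three-step plan, and the function names are assumptions made for illustration; in the actual system an LLM planner would produce the decomposition and the sub-agents would do real work.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical specialists: each maps a subtask description to a result string.
def web_agent(subtask: str) -> str:
    return f"web:{subtask}"


def data_agent(subtask: str) -> str:
    return f"data:{subtask}"


def file_agent(subtask: str) -> str:
    return f"file:{subtask}"


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "web": web_agent,
    "data": data_agent,
    "file": file_agent,
}


def plan(goal: str) -> List[Tuple[str, str]]:
    """Toy planner returning (role, subtask) pairs; a stand-in for an LLM planner."""
    return [
        ("web", f"find sources for {goal}"),
        ("data", "summarize findings"),
        ("file", "write report"),
    ]


def orchestrate(goal: str) -> List[str]:
    """Dispatch each planned subtask to the matching specialist, in order."""
    results: List[str] = []
    for role, subtask in plan(goal):
        agent = SPECIALISTS.get(role)
        if agent is None:
            raise KeyError(f"no specialist registered for role {role!r}")
        results.append(agent(subtask))
    return results


results = orchestrate("quarterly revenue")
```

Keeping the specialists behind a registry means a new sub-agent can be added without touching the dispatch logic.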
FinAgent: A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist
A multimodal trading agent that integrates heterogeneous data sources (price, news, filings) with diverse trading tools, achieving state-of-the-art results across multiple financial benchmarks.
AlphaForgeBench: Benchmarking LLMs as Quantitative Researchers
Asking LLMs to emit trading actions directly suffers from extreme run-to-run variance and irrational reversals. AlphaForgeBench instead repositions LLMs as quantitative researchers that generate executable alpha factors and strategy code. This output is evaluated via standardized backtesting across 7 assets, with 6 frontier models compared.
A 3×3 level-grade taxonomy of 903 queries (633 real-world + 270 augmented) reveals three persistent model archetypes and systematic difficulty scaling — providing a stable, reproducible foundation for LLM financial capability evaluation.
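To make "executable alpha factor plus standardized backtest" concrete, here is a minimal sketch. The momentum rule, the price series, and the function names are invented for illustration and are not drawn from the benchmark; the backtest ignores costs and slippage.

```python
from typing import Callable, Sequence


def momentum_factor(prices: Sequence[float], t: int, lookback: int = 3) -> float:
    """Hypothetical factor: +1 if the price rose over the lookback window, -1 if it fell, else 0.

    Uses only prices up to and including index t, so there is no lookahead.
    """
    if t < lookback:
        return 0.0
    change = prices[t] - prices[t - lookback]
    return float((change > 0) - (change < 0))


def backtest(prices: Sequence[float],
             factor: Callable[[Sequence[float], int], float]) -> float:
    """Hold position factor(prices, t) from t to t+1 and return the cumulative simple return."""
    equity = 1.0
    for t in range(len(prices) - 1):
        position = factor(prices, t)
        step_return = prices[t + 1] / prices[t] - 1.0
        equity *= 1.0 + position * step_return
    return equity - 1.0


prices = [100.0, 101.0, 102.0, 103.0, 104.0, 106.0]
total_return = backtest(prices, momentum_factor)
```

Because the factor is deterministic code, rerunning the backtest gives the same number every time, which is the stability property that direct action sampling lacks.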
PolyMonitor: Prediction Market Intelligence Workspace
An open-source live intelligence workspace for Polymarket that consolidates market prices, on-chain flow, oracle activity, order-book depth, and macro context into a unified dashboard. It is paired with the Polymarket Agent, a 10-node multi-agent forecasting pipeline combining deterministic evidence construction, LLM specialist agents, adversarial critique, and calibration.
The platform serves as the operational infrastructure for our Unlocking the Forecasting Economy dataset suite, covering the full prediction market lifecycle from listing through oracle resolution and settlement.
FinWorld: End-to-End Financial AI Platform
An all-in-one open-source platform for financial AI research and deployment. It integrates data pipelines, model training, backtesting, and live deployment, and draws on 800M+ multimodal data samples spanning 1995–2025. It lowers the barrier for rigorous, reproducible AI4Finance research.
TradeMaster: A Holistic Quantitative Trading Platform Empowered by Reinforcement Learning
A holistic platform covering data processing, environment simulation, RL agent training, and performance evaluation across multiple financial markets and trading tasks.
⭐ 2.7k GitHub stars
EarnHFT: Efficient Hierarchical Reinforcement Learning for High-Frequency Trading
Decomposes the HFT problem into macro-level strategy selection and micro-level order execution. The hierarchical structure yields significantly improved sample efficiency and live-trading profitability.
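The macro/micro split can be sketched as a selector that periodically chooses which low-level policy runs. Everything below is an assumption made for illustration: the rule-based policies stand in for trained RL agents, and the trend-based selector, the position cap of 2, and the `horizon` parameter are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

MicroPolicy = Callable[[int], int]  # maps the current position to an order size


def accumulate(position: int) -> int:
    """Micro level: buy one unit per tick until the position cap of 2 is reached."""
    return 1 if position < 2 else 0


def liquidate(position: int) -> int:
    """Micro level: sell one unit per tick until flat."""
    return -1 if position > 0 else 0


MICRO: Dict[str, MicroPolicy] = {"accumulate": accumulate, "liquidate": liquidate}


def select_strategy(window: Sequence[float]) -> str:
    """Macro level: pick the micro policy for the next block from the last block's trend."""
    return "accumulate" if window[-1] >= window[0] else "liquidate"


def run(prices: Sequence[float], horizon: int = 3) -> List[int]:
    """Re-select the micro policy every `horizon` ticks and record the per-tick orders."""
    position, active, orders = 0, "liquidate", []
    for t in range(len(prices)):
        if t >= horizon and t % horizon == 0:
            active = select_strategy(prices[t - horizon:t])
        order = MICRO[active](position)
        position += order
        orders.append(order)
    return orders


orders = run([10.0, 11.0, 12.0, 12.0, 11.0, 10.0, 9.0, 9.0, 9.0])
```

The macro decision is made once per block while the micro policy acts on every tick, so the two levels operate on different time scales.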
EarnMore: Portfolio Management in Customizable Stock Pools
A maskable stock representation framework that enables RL agents to handle arbitrary stock universes with a single trained model — eliminating the need to retrain per pool.
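One simple way to picture a maskable pool is a softmax over per-stock scores in which excluded stocks receive zero weight. This is a sketch of that general masking idea, not the EarnMore architecture; `masked_portfolio_weights` and the example scores are hypothetical.

```python
import math
from typing import List, Sequence


def masked_portfolio_weights(scores: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over scores restricted to stocks where mask is True; others get weight 0."""
    if len(scores) != len(mask):
        raise ValueError("scores and mask must have the same length")
    active = [s for s, m in zip(scores, mask) if m]
    if not active:
        raise ValueError("at least one stock must be in the pool")
    peak = max(active)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# The same score vector serves two different pools without any retraining.
scores = [0.0, 0.0, 5.0, 0.0]
pool_a = masked_portfolio_weights(scores, [True, True, False, True])
pool_b = masked_portfolio_weights(scores, [True, True, True, True])
```

In `pool_a` the high-scoring third stock is excluded, so the remaining three stocks share the weight equally; in `pool_b` it is included and dominates the allocation.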
TWOSOME: True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
Aligns LLMs with interactive environments through reinforcement learning, enabling agents to acquire genuine knowledge through embodied practice rather than passive pretraining.