Research

Wentao Zhang’s research on LLM Agents, Self-Evolving Agents, and Financial AI.

My research sits at the intersection of LLM-powered autonomous agents and Financial AI (AI4Finance). A central theme is agent self-evolution — building systems that continuously improve themselves through closed-loop experience, resource versioning, and protocol-level self-modification.

Research Themes

  • Self-Evolving Agents — Protocols and architectures enabling agents to evolve their own prompts, tools, memory, and sub-agents autonomously
  • LLM Agents & Multi-Agent Orchestration — Hierarchical frameworks, tool use, long-horizon planning, and standardized agent communication protocols
  • General Computer Control — Foundation agents that operate arbitrary software using only pixels and natural language
  • AI4Finance — End-to-end financial platforms, algorithmic trading, portfolio management, and HFT via RL and LLMs
  • Reinforcement Learning — Deep RL for sequential decision-making in complex, partially observable environments

Multi-Agent · GAIA SOTA

AgentOrchestra: Hierarchical Multi-Agent Orchestration with the TEA Protocol

The Tool-Environment-Agent (TEA) protocol treats tools, environments, and agents as first-class resources with explicit lifecycles and versioned interfaces, replacing the fragile, ad hoc wiring that plagues most multi-agent systems.

AgentOrchestra builds on TEA with a central planner that dynamically spawns and coordinates specialist sub-agents (web navigation, data analysis, file operations). It achieves 89.04% on GAIA, a state-of-the-art result on this general-purpose agent benchmark.
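
To make the idea of first-class, versioned resources concrete, here is a minimal Python sketch. It is not the TEA protocol or the AgentOrchestra code: the names `Resource`, `Registry`, `register`, `resolve`, and `plan_and_dispatch` are hypothetical, and the routing rule is a toy stand-in for the planner. The only assumptions are that each tool, environment, or agent is registered under a kind and a name, that re-registering creates a new version rather than overwriting the old one, and that a planner looks up sub-agents through the registry instead of holding direct references.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Resource:
    """Hypothetical versioned resource: a tool, an environment, or an agent."""
    kind: str                      # "tool", "environment", or "agent"
    name: str
    version: int                   # starts at 1 and increases on every re-registration
    handler: Callable[[str], str]  # what the resource does with a task string


@dataclass
class Registry:
    """Keeps every version of every resource so older versions stay resolvable."""
    _store: Dict[Tuple[str, str], List[Resource]] = field(default_factory=dict)

    def register(self, kind: str, name: str, handler: Callable[[str], str]) -> Resource:
        versions = self._store.setdefault((kind, name), [])
        resource = Resource(kind, name, len(versions) + 1, handler)
        versions.append(resource)
        return resource

    def resolve(self, kind: str, name: str, version: Optional[int] = None) -> Resource:
        versions = self._store[(kind, name)]
        # No version requested: return the latest one.
        return versions[-1] if version is None else versions[version - 1]


def plan_and_dispatch(registry: Registry, task: str) -> str:
    """Toy planner: route a task to a specialist sub-agent by a keyword rule."""
    name = "web" if "http" in task else "analysis"
    return registry.resolve("agent", name).handler(task)
```

Because lookups go through the registry, a sub-agent can be upgraded by registering a new version while earlier versions remain available for rollback or comparison.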

KDD 2024

FinAgent: A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist

A multimodal trading agent that integrates heterogeneous data sources (price, news, filings) with diverse trading tools, achieving state-of-the-art results across multiple financial benchmarks.

KDD 2026

AlphaForgeBench: Benchmarking LLMs as Quantitative Researchers

Asking LLMs to emit trading actions directly suffers from extreme run-to-run variance and irrational reversals. AlphaForgeBench instead repositions LLMs as quantitative researchers that generate executable alpha factors and strategy code, which are then evaluated via standardized backtesting across 7 assets and 6 frontier models.

A 3×3 level-grade taxonomy of 903 queries (633 real-world + 270 augmented) reveals three persistent model archetypes and systematic difficulty scaling — providing a stable, reproducible foundation for LLM financial capability evaluation.
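
The evaluation idea described above, in which a model emits factor code once and a fixed backtest scores it, can be sketched in a few lines of Python. This is an illustrative assumption, not the AlphaForgeBench harness: `momentum_factor` is a hypothetical stand-in for model-generated factor code, and `backtest` is a deliberately simple scorer that ignores costs and slippage.

```python
from typing import Callable, List, Sequence


def momentum_factor(prices: Sequence[float], lookback: int = 2) -> List[float]:
    """Hypothetical generated factor: the sign of the trailing price change."""
    signals = []
    for t in range(len(prices)):
        if t < lookback:
            signals.append(0.0)  # not enough history yet, stay flat
        else:
            change = prices[t] - prices[t - lookback]
            signals.append(1.0 if change > 0 else -1.0 if change < 0 else 0.0)
    return signals


def backtest(prices: Sequence[float],
             factor: Callable[[Sequence[float]], List[float]]) -> float:
    """Cumulative return from holding signals[t] over the period t -> t+1.

    The signal at time t only uses prices up to t, so there is no lookahead.
    """
    signals = factor(prices)
    equity = 1.0
    for t in range(len(prices) - 1):
        period_return = prices[t + 1] / prices[t] - 1.0
        equity *= 1.0 + signals[t] * period_return
    return equity - 1.0
```

Since the factor is ordinary deterministic code, rerunning the backtest on the same prices gives the same score, which is the property that makes this framing more stable than sampling trading actions from the model at every step.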

AI4Finance · Prediction Markets

PolyMonitor: Prediction Market Intelligence Workspace

An open-source live intelligence workspace for Polymarket that consolidates market prices, on-chain flow, oracle activity, order-book depth, and macro context into a unified dashboard. It is paired with the Polymarket Agent, a 10-node multi-agent forecasting pipeline combining deterministic evidence construction, LLM specialist agents, adversarial critique, and calibration.

The platform serves as the operational infrastructure for our Unlocking the Forecasting Economy dataset suite, covering the full prediction market lifecycle from listing through oracle resolution and settlement.

KDD 2026

FinWorld: End-to-End Financial AI Platform

An all-in-one open-source platform for financial AI research and deployment, integrating data pipelines, model training, backtesting, and live deployment, built on 800M+ multimodal data samples spanning 1995–2025. It lowers the barrier for rigorous, reproducible AI4Finance research.

NeurIPS 2023

TradeMaster: A Holistic Quantitative Trading Platform Empowered by Reinforcement Learning

A holistic platform covering data processing, environment simulation, RL agent training, and performance evaluation across multiple financial markets and trading tasks.

⭐ 2.7k GitHub stars

AAAI 2024

EarnHFT: Efficient Hierarchical Reinforcement Learning for High-Frequency Trading

Decomposes the HFT problem into macro-level strategy selection and micro-level order execution. The hierarchical structure yields significantly improved sample efficiency and live-trading profitability.
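
The two-level decomposition can be illustrated with a small Python sketch. It shows only the control structure: a macro level that re-selects a strategy every few steps, and a micro level that acts at every step. It is not the EarnHFT method, and no reinforcement learning is involved here. The rule-based `select_strategy`, the two toy policies, and the `macro_period` parameter are all hypothetical stand-ins for learned components.

```python
from typing import Callable, Dict, List, Sequence

# A micro-level policy maps the latest price change to a target position (-1, 0, +1).
Policy = Callable[[float], int]


def trend_following(change: float) -> int:
    return 1 if change > 0 else -1 if change < 0 else 0


def mean_reverting(change: float) -> int:
    return -trend_following(change)


POLICIES: Dict[str, Policy] = {"trend": trend_following, "revert": mean_reverting}


def select_strategy(window: Sequence[float]) -> str:
    """Macro level (toy rule): pick "trend" when the window moves mostly one way."""
    net = window[-1] - window[0]
    path = sum(abs(b - a) for a, b in zip(window, window[1:]))
    return "trend" if path > 0 and abs(net) / path > 0.5 else "revert"


def run(prices: Sequence[float], macro_period: int = 3) -> List[int]:
    """Micro level acts every step; macro level re-selects every macro_period steps."""
    positions: List[int] = []
    strategy = "revert"  # arbitrary default until the first macro decision
    for t in range(1, len(prices)):
        if t >= macro_period and t % macro_period == 0:
            strategy = select_strategy(prices[t - macro_period:t + 1])
        positions.append(POLICIES[strategy](prices[t] - prices[t - 1]))
    return positions
```

The design point is that the macro decision is made far less often than the micro decision, so each level faces a shorter and simpler decision problem than a single flat policy would.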

WWW 2024

EarnMore: Portfolio Management in Customizable Stock Pools

A maskable stock representation framework that enables RL agents to handle arbitrary stock universes with a single trained model, eliminating the need to retrain for each pool.
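
One simple way to picture how a single model can serve different stock pools is a masked softmax over per-stock scores, sketched below in Python. This is a simplified illustration and not the EarnMore architecture: `masked_portfolio_weights`, `scores`, and `in_pool` are hypothetical names, and the scores are assumed to come from some already-trained model over the full stock universe.

```python
import math
from typing import List, Sequence


def masked_portfolio_weights(scores: Sequence[float],
                             in_pool: Sequence[bool]) -> List[float]:
    """Softmax over the scores of in-pool stocks; masked stocks get zero weight."""
    if len(scores) != len(in_pool):
        raise ValueError("scores and in_pool must have the same length")
    if not any(in_pool):
        raise ValueError("the stock pool must contain at least one stock")
    # Subtract the largest in-pool score for numerical stability.
    peak = max(s for s, keep in zip(scores, in_pool) if keep)
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, in_pool)]
    total = sum(exps)
    return [e / total for e in exps]
```

Changing the pool only changes the mask, so the same scores can be reused for any subset of the universe, and the resulting weights always sum to one over the selected stocks.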

TWOSOME: True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning

Aligns LLMs with interactive environments through reinforcement learning, enabling agents to acquire genuine knowledge through embodied practice rather than passive pretraining.


© 2026. Wentao Zhang. All rights reserved.
