FinWorld

An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment

A framework that integrates diverse AI paradigms, heterogeneous data sources, and modern technologies to support financial AI development and evaluation.

arXiv:2508.02292 MIT License
[Figure: FinWorld Architecture]

Research Contributions

Unified Framework

We propose a unified, end-to-end framework for training and evaluating ML, DL, RL, LLM, and LLM-agent methods. It covers four critical financial AI task types: time series forecasting, algorithmic trading, portfolio management, and LLM applications.

Modular Design

The framework features a modular architecture that enables flexible construction of custom models and tasks, including the development of personalized LLM agents. The system supports efficient distributed training and testing across multiple environments.

Comprehensive Benchmark

We provide support for multimodal heterogeneous data with over 800 million samples, establishing a comprehensive benchmark for the financial AI community. Extensive experiments across four task types demonstrate the framework's flexibility and effectiveness.

Platform Comparison

Compared with existing platforms, FinWorld supports every key feature considered in the comparison.

Key Features

Multi-task Support

Time series forecasting, algorithmic trading, portfolio management, and LLM applications

Multimodal Data Integration

Structured market data, unstructured news, and multimodal information

Comprehensive AI Paradigms

ML, DL, RL, LLMs, and LLM agents with seamless integration

Advanced Automation

Distributed training, auto presentation, and experiment tracking

Architecture Overview

FinWorld employs a layered, object-oriented architecture with seven core layers:

  1. Configuration Layer: Built on mmengine for unified experiment management with registry mechanism and configuration inheritance
  2. Dataset Layer: Multi-source data acquisition, feature engineering, task-specific organization, and RL environment encapsulation
  3. Model Layer: ML models, DL architectures, RL networks, and unified LLM interface with financial constraints
  4. Training Layer: Optimizers, loss functions, schedulers, metrics, and task-specific trainers with distributed support
  5. Evaluation Layer: Financial-specific metrics, visualization tools, and standardized evaluation protocols
  6. Task Layer: Time series forecasting, algorithmic trading, portfolio management, and LLM applications
  7. Presentation Layer: Auto-reporting, multi-channel publishing, and version control for systematic archiving
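
The registry mechanism and configuration inheritance mentioned for the Configuration Layer can be illustrated with a small, library-free sketch. This is not mmengine's API and not FinWorld's code: `Registry`, `merge_config`, `MODELS`, and `MovingAverageModel` are hypothetical names chosen for illustration. The sketch only shows the general idea of building components by name from a config that overrides a base config.

```python
from copy import deepcopy


class Registry:
    """Hypothetical registry: maps class names to classes."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        # Used as a decorator; stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # Instantiate the class named by cfg["type"] with the remaining keys.
        kwargs = dict(cfg)
        cls = self._modules[kwargs.pop("type")]
        return cls(**kwargs)


def merge_config(base, override):
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


MODELS = Registry("models")


@MODELS.register
class MovingAverageModel:
    """Toy model used only to demonstrate the registry."""

    def __init__(self, window=5):
        self.window = window

    def predict(self, prices):
        recent = prices[-self.window:]
        return sum(recent) / len(recent)


# A child config inherits everything from the base and overrides one field.
base_cfg = {"model": {"type": "MovingAverageModel", "window": 5}, "seed": 0}
child_cfg = merge_config(base_cfg, {"model": {"window": 3}})

model = MODELS.build(child_cfg["model"])
prediction = model.predict([10.0, 11.0, 12.0, 13.0])
```

In this sketch the child config changes only `window`, while `type` and `seed` come from the base. That is the practical benefit of inheritance: an experiment variant states only what differs.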

Comprehensive Datasets

FinWorld provides extensive multimodal datasets with over 800 million samples across multiple markets and data types

Stock Market Data

[Figure: Stock Market Dataset Overview]

Market Coverage

  • Representative Markets: US (developed) and Chinese (emerging) markets
  • Stock Pools: DJ30, SP500, SSE50, HS300 indices
  • Data Sources: FMP, Alpaca, AKShare, TuShare providers
  • Data Types: Price data, news data, technical indicators
  • Coverage: 800M+ data points from 1995-2025

LLM Financial Reasoning Data

[Figure: LLM Financial Reasoning Dataset]

Reasoning Benchmarks

  • Financial QA: FinQA, FinEval, ConvFinQA datasets
  • Professional Exams: CFA, CPA, FRM, ACCA materials
  • Domain Knowledge: Business analysis and market reports
  • Multi-turn Dialogue: Conversational financial interactions
  • Coverage: 80k+ samples across diverse scenarios

Empirical Results

Time Series Forecasting

  • TimeXer achieves MAE of 0.0529 and MSE of 0.0062 on DJ30, significantly outperforming LightGBM (MAE: 0.1392, MSE: 0.0235)
  • TimeMixer and TimeXer show superior performance on HS300, with MAEs of 0.3804 and 0.3727, respectively
  • Deep learning models consistently achieve higher RankICIR scores compared to traditional ML methods
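
For readers unfamiliar with the forecasting metrics quoted above, the following is a minimal sketch of how MAE, MSE, and RankICIR are commonly computed. The source does not give its exact definitions, so two points are assumptions: RankIC is taken as the Spearman rank correlation between predicted and realized values within one period, and RankICIR as the mean of the per-period RankIC values divided by their sample standard deviation. All function names are illustrative.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_ic(pred, actual):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(actual))


def rank_icir(preds_by_period, actuals_by_period):
    """Assumed definition: mean of per-period RankIC over its sample std."""
    ics = [rank_ic(p, a) for p, a in zip(preds_by_period, actuals_by_period)]
    return mean(ics) / stdev(ics)


# Toy data: three periods, three assets each.
preds = [[0.1, 0.2, 0.3], [0.1, 0.3, 0.2], [0.2, 0.4, 0.6]]
actuals = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.5, 1.0, 1.5]]
icir = rank_icir(preds, actuals)
```

Lower MAE and MSE are better, whereas a higher RankICIR indicates a ranking signal that is both strong on average and stable across periods.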

Algorithmic Trading

  • SAC achieves a 101.55% annualized rate of return (ARR) on TSLA with superior risk-adjusted returns
  • PPO attains a Sharpe ratio (SR) of 2.10 on META, outperforming all baseline methods
  • RL methods consistently deliver higher returns and better risk metrics across all evaluated stocks
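
The trading results are reported as ARR and SR. As a rough guide to what these numbers measure, here is a minimal sketch that computes an annualized return and a Sharpe ratio from a series of daily returns. The conventions are assumptions, since the source does not state them: 252 trading days per year, geometric compounding for the annualized return, a zero risk-free rate by default, and the sample standard deviation.

```python
from math import prod, sqrt
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_return(daily_returns, periods_per_year=TRADING_DAYS):
    """Compound the daily returns, then scale the growth to one year."""
    growth = prod(1.0 + r for r in daily_returns)
    return growth ** (periods_per_year / len(daily_returns)) - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0,
                 periods_per_year=TRADING_DAYS):
    """Mean excess return over its sample std, annualized by sqrt(periods)."""
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


# Toy daily returns, not taken from the reported experiments.
returns = [0.01, -0.01, 0.02, 0.0]
arr = annualized_return(returns)
sr = sharpe_ratio(returns)
```

An ARR of 101.55% means the strategy's value roughly doubled on an annualized basis. The Sharpe ratio divides return by volatility, so it rewards strategies that earn their return with less day-to-day fluctuation.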

Portfolio Management

  • SAC achieves up to 31.2% annualized returns on SP500 with Sharpe ratios above 1.5
  • RL methods consistently outperform rule-based and ML-based approaches
  • Superior risk-adjusted performance across all major indices (DJ30, SP500, SSE50, HS300)
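
To make the portfolio setting concrete, the sketch below computes daily portfolio returns from a weight vector and per-asset returns, using an equal-weight allocation as an example of a simple rule-based strategy. This is illustrative only: the function names are hypothetical, the weights are held fixed (rebalanced back to target each day), and transaction costs are ignored, none of which is specified by the source.

```python
def equal_weights(n_assets):
    """Rule-based allocation that splits capital evenly across assets."""
    return [1.0 / n_assets] * n_assets


def portfolio_returns(weights, asset_returns):
    """Daily portfolio return as the weighted sum of asset returns.

    asset_returns is a list of days; each day is a list with one
    return per asset, in the same order as weights.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * r for w, r in zip(weights, day)) for day in asset_returns]


def cumulative_value(returns, initial=1.0):
    """Portfolio value after each day, starting from `initial`."""
    value = initial
    path = []
    for r in returns:
        value *= 1.0 + r
        path.append(value)
    return path


# Toy data: two assets over two days.
daily = portfolio_returns(equal_weights(2), [[0.02, 0.0], [-0.01, 0.03]])
path = cumulative_value(daily)
```

A learned policy such as SAC would replace `equal_weights` with weights chosen from market observations at each step; the return and value calculations stay the same.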

LLM Financial Reasoning

[Figure: LLM Financial Reasoning Results]
  • FinReasoner leads all four financial reasoning benchmarks (FinQA, FinEval, ConvFinQA, CFLUE)
  • Domain-specific training outperforms generic instruction-tuned models
  • Comprehensive evaluation across multiple reasoning tasks

LLM Trading Performance

[Figure: LLM Trading Performance Results]
  • FinReasoner demonstrates strong trading capabilities across all evaluated stocks
  • Comprehensive risk management and performance metrics
  • Superior performance compared to other LLM models in trading tasks

Get Started

Quick Installation

# Create environment
conda create -n finworld python=3.11
conda activate finworld

# Install dependencies
make install-base
make install-browser
make install-verl

Quick Start

# Download data
python scripts/download/download.py --config configs/download/dj30/dj30_fmp_price_1day.py

# Train model
CUDA_VISIBLE_DEVICES=0 python scripts/rl_trading/train.py --config=configs/rl_trading/ppo/AAPL_ppo_trading.py