Alpha Arena’s live crypto-trading benchmark showed Deepseek Chat V3.1 in first place on Saturday, Oct. 18, with the day’s leaderboard highlighting modest gains at the top and drawdowns across most rivals.
Deepseek Tops Leaderboard in Alpha Arena’s Real-Money Crypto Battle
Deepseek Chat V3.1 led the pack with a Hyperliquid account value of $10,400, a +4.0% return, after three completed trades. The bot paid $58.51 in fees and posted a 0% win rate on closed trades: its biggest loss was $348.33, and its best closed result, listed as its biggest “win,” was a $4.19 loss. Those trade statistics cover closed positions only; active positions are not counted until they close.
Grok-4 sat in second with $10,010 (+0.1%) and $0 in fees, logging no completed trades by the snapshot. Claude Sonnet 4.5 ranked third at $9,985 (-0.15%) with $42.63 in fees and three closed trades, showing the biggest loss of $88.38. The experiment highlights just how dramatically artificial intelligence (AI) has leveled up in recent years.

GPT-5 placed fourth at $9,901 (-0.99%) after two closed trades and $10.10 in fees, with its largest loss at $59.04. Gemini 2.5 Pro ranked fifth at $9,725 (-2.75%), paying the day’s highest fees ($106.46) across five trades; it showed the day’s largest single win ($329.35) but also a sizable $731.43 loss, yielding a 60% win rate on closed positions.
Qwen3 Max rounded out the field at $9,474 (-5.26%) with $44.62 in fees and one closed trade. Because that single trade was a loss, the dashboard listed -$517.77 as both the model’s biggest win and its biggest loss. Across the board, Sharpe readings were low or negative, consistent with limited trade counts and early-round noise rather than settled risk-adjusted performance.
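The closed-trade figures above can be sketched in a few lines of Python. This is an illustration only, not Nof1's implementation: the function name `closed_trade_stats` is hypothetical, and it assumes that "biggest win" is simply the highest closed-trade P&L and "biggest loss" the lowest, which would explain how one losing trade can fill both columns.

```python
def closed_trade_stats(closed_pnls):
    """Summarize closed trades only; open positions are deliberately left out.

    Assumption (not confirmed by the source): "biggest win" is the maximum
    closed-trade P&L and "biggest loss" is the minimum, even when both are negative.
    """
    if not closed_pnls:
        # No completed trades yet, so there is nothing to summarize.
        return {"trades": 0, "win_rate": None, "biggest_win": None, "biggest_loss": None}

    wins = sum(1 for pnl in closed_pnls if pnl > 0)
    return {
        "trades": len(closed_pnls),
        "win_rate": wins / len(closed_pnls) * 100,  # percent of closed trades in profit
        "biggest_win": max(closed_pnls),
        "biggest_loss": min(closed_pnls),
    }


if __name__ == "__main__":
    # Qwen3 Max's single closed trade, as reported in the article.
    print(closed_trade_stats([-517.77]))
```

With a single losing trade, the maximum and minimum are the same number, so both columns show -$517.77 and the win rate works out to 0%.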
Alpha Arena, launched Oct. 17 by research lab Nof1, allocates each model $10,000 to trade crypto perpetuals autonomously on the Hyperliquid decentralized exchange (DEX). The Alpha Arena public dashboard tracks account value, return, total P&L, fees, win rate, biggest win/loss, Sharpe, and trades. It excludes unrealized P&L until positions close, an important caveat when interpreting single-day standings.
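The return column follows directly from the $10,000 starting allocation. The short Python sketch below recomputes the percentages from the account values reported for the Oct. 18 snapshot. The names `return_pct` and `leaderboard` are illustrative, and treating return as a simple percentage change from the starting balance is an assumption that happens to match the published figures.

```python
STARTING_BALANCE = 10_000  # per-model allocation stated in the article

# Account values from the Oct. 18 snapshot, as reported in the article.
ACCOUNT_VALUES = {
    "Deepseek Chat V3.1": 10_400,
    "Grok-4": 10_010,
    "Claude Sonnet 4.5": 9_985,
    "GPT-5": 9_901,
    "Gemini 2.5 Pro": 9_725,
    "Qwen3 Max": 9_474,
}


def return_pct(account_value, starting_balance=STARTING_BALANCE):
    """Simple return in percent relative to the starting balance (assumed formula)."""
    return (account_value - starting_balance) / starting_balance * 100


def leaderboard(values):
    """Return (name, account_value, return_pct) rows sorted by account value, highest first."""
    rows = [(name, value, round(return_pct(value), 2)) for name, value in values.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for rank, (name, value, pct) in enumerate(leaderboard(ACCOUNT_VALUES), start=1):
        print(f"{rank}. {name}: ${value:,} ({pct:+.2f}%)")
```

Sorting by account value reproduces the order reported in the article, from Deepseek Chat V3.1 in first to Qwen3 Max in sixth.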
Saturday’s snapshot on the Nof1.ai leaderboard points to the experiment’s premise: identical budgets, different LLM reasoning, and transparent execution. With several bots showing zero or few completed trades, early ranks may shift as open positions resolve and fee footprints compound. For now, Deepseek holds the edge, while Grok-4’s blank slate keeps it close, and Gemini’s mix of outsized wins and losses highlights higher variance.
FAQ
- What is Alpha Arena? A live benchmark where six LLMs trade crypto perpetuals autonomously with $10,000 each.
- Which model led on Oct. 18? Deepseek Chat V3.1 led with $10,400 (+4.0%) based on completed trades.
- Where do the trades occur? On the Hyperliquid decentralized exchange with transparent, on-chain tracking.
- Do standings include open P&L? No, only closed trades count; active positions update ranks once closed.