Baidu Qianfan DeepResearch Agent tops the authoritative DeepResearch Bench leaderboard

On February 4th, DeepResearch Bench, an authoritative ranking of intelligent agents, announced its latest results. Baidu Qianfan DeepResearch Agent (Qianfan-DeepResearch Pro) ranked first on the strength of its end-to-end research capabilities and high-quality report output. It achieved industry-leading performance in all four core dimensions used to measure the value of a research report: comprehensiveness, insightfulness, instruction adherence, and readability.

DeepResearch is becoming a key dividing line in the evolution of artificial intelligence. Unlike traditional text generation, DeepResearch tasks require a system to autonomously carry out multi-step, iterative cognitive work in the manner of a human expert, covering the entire process from understanding complex requirements and gathering extensive information to producing in-depth insights. DeepResearch agents are now widely used in academic reviews, financial research, business analysis, and other fields, cutting manual research work that traditionally takes days down to minutes and significantly improving the efficiency of research and decision-making.

As the “gold standard” for evaluating capabilities in this cutting-edge field, DeepResearch Bench fills a gap in general AI evaluation: end-to-end DeepResearch tasks. Existing benchmarks mainly test single abilities and struggle to capture the complexity of long-range reasoning and retrieval synthesis. The benchmark comprises 100 doctoral-level research tasks across 22 disciplines, designed by domain experts, and incorporates the RACE report quality evaluation framework together with a citation accuracy assessment. It is currently the most rigorous and realistic global evaluation system for measuring the productivity of DeepResearch agents.

Qianfan DeepResearch Agent stood out in this evaluation thanks to its technical design. It adopts an agentic architecture that implements an end-to-end research delivery cycle through a “task understanding, planning, execution” loop, and it relies on Baidu Search and RAG technology to ensure the breadth, credibility, and relevance of information retrieval. Two key design features keep task execution accurate. First, it takes a “coarse to fine” research approach to handle task uncertainty. Second, through deep execution path planning and real-time reflection mechanisms, the system dynamically assesses progress and adjusts its strategy at each research node, which effectively avoids hallucinations and path deviations and ensures that complex research tasks are completed to a high standard.
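The loop described above can be illustrated with a minimal sketch. It is not Baidu's implementation: the `Node` class, the `research` function, and the `search` and `refine` callables are hypothetical names introduced here, and the reflection rule (refine a node when it returns no findings) is an assumption chosen only to make the coarse-to-fine idea concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One research node: a question plus whatever was found for it."""
    question: str
    depth: int = 0
    findings: List[str] = field(default_factory=list)


def research(
    task: str,
    search: Callable[[str], List[str]],   # hypothetical retrieval step
    refine: Callable[[str], List[str]],   # hypothetical sub-question planner
    max_depth: int = 2,
) -> List[Node]:
    """Plan, execute, and reflect until every node is settled."""
    queue = [Node(task)]          # planning: start from one coarse question
    done: List[Node] = []
    while queue:
        node = queue.pop(0)
        node.findings = search(node.question)      # execution
        # Reflection (assumed rule): an empty result means the question
        # was too coarse, so split it into finer sub-questions.
        if not node.findings and node.depth < max_depth:
            queue.extend(Node(q, node.depth + 1) for q in refine(node.question))
        else:
            done.append(node)
    return done
```

In a real system the `search` callable would wrap a retrieval service and the reflection rule would judge quality rather than emptiness; the sketch only shows where progress is assessed and where the plan is adjusted.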

In the report generation phase, Qianfan DeepResearch Agent uses a two-stage, independent rendering mechanism. It first produces a pivot report, optimized for reasoning ability to ensure logical consistency and comprehensiveness. It then applies different rendering tools to the pivot report to generate final reports in multiple formats such as Markdown, HTML, and PPT, achieving “one-time research, multi-format report” delivery.
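A minimal sketch of this two-stage idea follows, assuming the pivot report is a simple structured object and each output format is an independent function over it. The names `PivotReport`, `Section`, `RENDERERS`, and `deliver` are hypothetical and are not taken from the Qianfan platform; a PPT renderer is omitted because it would need a third-party library.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict, List


@dataclass
class Section:
    title: str
    body: str


@dataclass
class PivotReport:
    """Stage 1 output: format-neutral content produced once per research run."""
    title: str
    sections: List[Section]


def render_markdown(report: PivotReport) -> str:
    parts = [f"# {report.title}"]
    for s in report.sections:
        parts.append(f"## {s.title}\n\n{s.body}")
    return "\n\n".join(parts) + "\n"


def render_html(report: PivotReport) -> str:
    parts = [f"<h1>{escape(report.title)}</h1>"]
    for s in report.sections:
        parts.append(f"<h2>{escape(s.title)}</h2>\n<p>{escape(s.body)}</p>")
    return "\n".join(parts) + "\n"


# Stage 2: each renderer reads the same pivot report and knows nothing
# about how the research was done.
RENDERERS: Dict[str, Callable[[PivotReport], str]] = {
    "markdown": render_markdown,
    "html": render_html,
}


def deliver(report: PivotReport, formats: List[str]) -> Dict[str, str]:
    """Render one pivot report into every requested format."""
    return {fmt: RENDERERS[fmt](report) for fmt in formats}
```

Keeping the renderers separate from the pivot report means a new output format can be added without rerunning the research.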

The DeepResearch Agent is now live on the Baidu Qianfan Platform. Users only need to enter a complex research requirement, and the system generates a professional-level research report with citations within about ten minutes, delivering “minute-level” deep insights.

The ranking also demonstrates the supporting capabilities of Baidu Qianfan Agent Infra, which provides one-stop development services covering models, tools, agent development, data, and agent runtime environments. More than 1.3 million agents have been developed on the platform, and tools such as Baidu's exclusive “Baidu AI Search” average tens of millions of calls per day.
