Category: Level 1
Task Name: Fact Retrieval
Brief Description: Require retrieving isolated knowledge points with minimal reasoning; mainly test precise keyword matching.
Example: In which region of France is Mont St. Michel located?
Zhishang Xiang†1, Chuanjie Wu†1, Qinggang Zhang*1,
Shengyuan Chen2, Zijin Hong2, Xiao Huang2, Jinsong Su*1
1Xiamen University, 2The Hong Kong Polytechnic University
† Equal contribution, * Corresponding author
[2025-06-06] We introduce GraphRAG-Bench, a comprehensive benchmark for evaluating Graph Retrieval-Augmented Generation models.
[2025-05-25] The leaderboard is live!
[2025-05-14] We release the GraphRAG-Bench dataset.
[2025-01-21] We release the GraphRAG survey.
Graph retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing large
language models (LLMs) with external knowledge. It leverages graphs to model the hierarchical structure
between specific concepts, enabling more coherent and effective knowledge retrieval for accurate reasoning.
Despite its conceptual promise, recent studies report that GraphRAG frequently underperforms vanilla RAG on
many real-world tasks. This raises a critical question: Is GraphRAG really effective, and in which scenarios
do graph structures provide measurable benefits for RAG systems? To address this, we propose GraphRAG-Bench, a
comprehensive benchmark designed to evaluate GraphRAG models on both hierarchical knowledge retrieval and deep
contextual reasoning. GraphRAG-Bench features a comprehensive dataset with tasks of increasing difficulty,
covering fact retrieval, complex reasoning, contextual summarization, and creative generation, and a
systematic evaluation across the entire pipeline, from graph construction and knowledge retrieval to final
generation. Leveraging this novel benchmark, we systematically investigate the conditions under which GraphRAG
surpasses traditional RAG and the underlying reasons for its success, offering guidelines for its practical
application.
Model | Rank | Average | Date | Fact Retrieval ACC | Fact Retrieval ROUGE-L | Complex Reasoning ACC | Complex Reasoning ROUGE-L | Contextual Summarize ACC | Contextual Summarize Cov | Creative Generation ACC | Creative Generation FS | Creative Generation Cov | Link |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
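The leaderboard reports ROUGE-L for the fact retrieval and complex reasoning tasks. As a rough illustration of what that metric measures, here is a minimal sketch of ROUGE-L computed from the longest common subsequence (LCS) of two token lists. The whitespace tokenization, lowercasing, and balanced F1 weighting are assumptions made for this example; the page does not state which ROUGE-L variant the benchmark uses, and the function names `lcs_length` and `rouge_l_f1` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Row-by-row dynamic programming; only the previous row is kept.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced F1 form of ROUGE-L (assumed variant, not confirmed by the source)."""
    # Assumption: lowercase and split on whitespace.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_l_f1(
        "Mont St. Michel is in Normandy",
        "It is located in Normandy",
    )
    print(round(score, 4))
```

In this example the shared subsequence is "is in normandy" (3 tokens), so precision is 3/6 and recall is 3/5. A production evaluation would more likely rely on an established ROUGE implementation than on a hand-rolled one.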
@misc{xiang2025usegraphsragcomprehensive,
  title={When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation},
  author={Zhishang Xiang and Chuanjie Wu and Qinggang Zhang and Shengyuan Chen and Zijin Hong and Xiao Huang and Jinsong Su},
  year={2025},
  eprint={2506.05690},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.05690},
}