UltraRonin committed on
Commit b6609fd · 1 Parent(s): a7449d7
Files changed (1)
  1. index.html +1 -1
index.html CHANGED
@@ -147,7 +147,7 @@ tr:hover a {
  <body>
  <h1>LR\({}^{2}\)Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems</h1>
  <p>
- descirption
+ <strong>LR\({}^{2}\)Bench</strong> is a novel benchmark designed to evaluate the <strong>L</strong>ong-chain <strong>R</strong>eflective <strong>R</strong>easoning capabilities of LLMs. LR\({}^{2}\)Bench comprises 850 samples across six Constraint Satisfaction Problems (CSPs) where reflective reasoning is crucial for deriving solutions that meet all given constraints. Each type of task focuses on distinct constraint patterns, such as knowledge-based, logical, and spatial constraints, providing a comprehensive evaluation of diverse problem-solving scenarios.
  </p>
  <hr />