# SoccerRAG: Multimodal Soccer Information Retrieval via Natural Queries
## Abstract
The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This paper introduces SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related information through natural language queries. By leveraging a multimodal dataset, SoccerRAG supports dynamic querying and automatic data validation, enhancing user interaction and accessibility to sports archives. Our evaluations indicate that SoccerRAG effectively handles complex queries, offering significant improvements over traditional retrieval systems in terms of accuracy and user engagement. The results underscore the potential of using RAG and LLMs in sports analytics, paving the way for future advancements in the accessibility and real-time processing of sports data.
## Setup
````bash
pip install -r requirements.txt
````
Rename `.env_demo` to `.env` and fill in the required fields.
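As a quick sanity check after editing `.env`, the sketch below lists any keys whose value is still blank. It is not part of the repository: the helper names `parse_env` and `empty_fields` are made up for illustration, and it assumes the file uses plain `KEY=VALUE` lines.

````python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def empty_fields(text: str) -> list[str]:
    """Return the keys whose value was left blank."""
    return [key for key, value in parse_env(text).items() if not value]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = empty_fields(env_path.read_text(encoding="utf-8"))
        print("Unfilled fields:", missing or "none")
    else:
        print("No .env found; rename .env_demo to .env first.")
````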
## Setting up the database
### Required data
The data required to run the code is not included in this repository.
The data can be downloaded from [SoccerNet](https://www.soccer-net.org/data).
Files needed are:
* Labels-v2.json [link](https://www.soccer-net.org/data#h.5klq86rmgt96)
* Labels-captions.json
Place the data in the `./data/Dataset/SoccerNet/` directory, organized as follows:
* For each league, create a folder named after the league.
* Inside each league folder, create one folder per season, named `YYYY-YYYY`.
* Inside each season folder, create one folder per game, named `YYYY-MM-DD - HomeTeam Score - Score AwayTeam`.
* In each game folder, place the `Labels-v2.json` and `Labels-captions.json` files.
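Before populating the database, it can help to confirm that the folders follow the layout described above. The following is a minimal sketch and not part of the repository: `parse_game_folder` and `check_layout` are hypothetical helpers, and the regular expressions encode only the naming patterns stated in this README.

````python
import re
from pathlib import Path

REQUIRED_FILES = ("Labels-v2.json", "Labels-captions.json")

# Season folders: YYYY-YYYY
SEASON_RE = re.compile(r"^\d{4}-\d{4}$")

# Game folders: YYYY-MM-DD - HomeTeam Score - Score AwayTeam
GAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) - (?P<home>.+) (?P<home_score>\d+) - "
    r"(?P<away_score>\d+) (?P<away>.+)$"
)


def parse_game_folder(name: str):
    """Split a game folder name into its parts, or return None if it does not match."""
    match = GAME_RE.match(name)
    if match is None:
        return None
    info = match.groupdict()
    info["home_score"] = int(info["home_score"])
    info["away_score"] = int(info["away_score"])
    return info


def check_layout(root: Path) -> list[str]:
    """Walk league/season/game folders and collect human-readable problems."""
    problems = []
    for game_dir in sorted(root.glob("*/*/*")):
        if not game_dir.is_dir():
            continue
        if SEASON_RE.match(game_dir.parent.name) is None:
            problems.append(f"unexpected season folder name: {game_dir.parent}")
        if parse_game_folder(game_dir.name) is None:
            problems.append(f"unexpected game folder name: {game_dir}")
        for filename in REQUIRED_FILES:
            if not (game_dir / filename).is_file():
                problems.append(f"missing {filename} in {game_dir}")
    return problems


if __name__ == "__main__":
    # Path taken from this README; adjust it if your data lives elsewhere.
    for problem in check_layout(Path("./data/Dataset/SoccerNet")):
        print(problem)
````

An empty result only means the folder names and file names look right; it does not validate the JSON contents.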
### Setting up and populating the database
To set up the database, execute the following command:
````bash
python src/database.py
````
If your data is stored somewhere else, adjust the data path in `src/database.py` before running the command.
## Running the code
To run the code, execute the following command:
````bash
python main.py
````
The code will prompt you to enter a natural language query.
### Example query
````text
Enter a query: How many goals has Messi scored each season?
Lionel Messi has scored the following number of goals each season:
- 2014-2015: 13 goals
- 2015-2016: 3 goals
- 2016-2017: 31 goals
````
## Results
![result-table.png](media%2Fresult-table.png)
## Acknowledgements
..
## Citation
..