# Document RAG User API

A FastAPI application for uploading, processing, and querying documents such as PDF and text files. Users can upload files, query collections, and manage their document data.

## Features

- Upload files in various formats (PDF, TXT, etc.)
- Process documents into chunks and store them with metadata
- Query collections with a user-supplied search string
- Retrieve and list collections specific to each user
- Remove collections as needed

## Requirements

- Python 3.7+
- FastAPI
- LanceDB
- Pydantic
- Pandas
- Other dependencies as specified in `requirements.txt`

## Installation

1. Clone the repository:
   ```bash
   git clone <repository-url>
   cd <repository-directory>
   ```

2. Install the required packages:
   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:
   ```bash
   uvicorn app.document_rag_user:app --reload
   ```

## API Endpoints

### Upload Files

- **POST** `/upload_files`
  - Upload multiple files.
  - Parameters:
    - `files`: List of files to upload.
    - `collection_name`: Optional name for the collection.
    - `user_id`: User identifier.
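
As a rough illustration of calling this endpoint, the sketch below builds a multipart upload request using only the Python standard library. Several details are assumptions rather than facts from this README: the server address (`http://localhost:8000`, uvicorn's default), that `user_id` and `collection_name` are passed as query parameters rather than form fields, and the helper name `build_upload_request`, which is hypothetical.

```python
import uuid
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port


def build_upload_request(files, user_id, collection_name=None):
    """Build a POST /upload_files request (hypothetical helper).

    files: list of (filename, content_bytes, content_type) tuples.
    Assumes user_id and collection_name travel as query parameters.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content, content_type in files:
        # Each file is sent under the form field name "files".
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    params = {"user_id": user_id}
    if collection_name is not None:
        params["collection_name"] = collection_name

    return urllib.request.Request(
        f"{BASE_URL}/upload_files?{urlencode(params)}",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

With the server running, `urllib.request.urlopen(build_upload_request([("notes.txt", b"hello", "text/plain")], "u1"))` would send the request. If the application expects `user_id` as a form field instead, it would need to be added as another multipart part.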

### Get Document

- **GET** `/get_document/{collection_id}/{document_id}`
  - Retrieve a specific document by its ID from a collection.
  - Parameters:
    - `collection_id`: ID of the collection.
    - `document_id`: ID of the document.
    - `user_id`: User identifier.
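
A minimal sketch of building the URL for this route is shown below. It assumes the server runs at `http://localhost:8000` and that `user_id` is a query parameter; the function name `build_get_document_request` is hypothetical. Path segments are percent-encoded so IDs containing spaces or slashes do not break the route.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port


def build_get_document_request(collection_id, document_id, user_id):
    """Build a GET /get_document/{collection_id}/{document_id} request (hypothetical helper)."""
    # safe="" also encodes "/" so an ID cannot be mistaken for extra path segments.
    path = (
        f"/get_document/{quote(str(collection_id), safe='')}"
        f"/{quote(str(document_id), safe='')}"
    )
    # Assumption: user_id is read from the query string.
    query = urlencode({"user_id": user_id})
    return urllib.request.Request(f"{BASE_URL}{path}?{query}", method="GET")
```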

### Query Collection

- **POST** `/query_collection`
  - Query a collection based on user input.
  - Request Body:
    - `collection_id`: ID of the collection.
    - `query`: Search query.
    - `top_k`: Optional number of top results to return (default is 3).
    - `user_id`: User identifier.
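
The request body above maps naturally onto a JSON payload. The sketch below builds such a request with the standard library. The field names come from the list above; the server address and the helper name `build_query_request` are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port


def build_query_request(collection_id, query, user_id, top_k=3):
    """Build a POST /query_collection request with a JSON body (hypothetical helper)."""
    payload = {
        "collection_id": collection_id,
        "query": query,
        "top_k": top_k,  # mirrors the documented default of 3
        "user_id": user_id,
    }
    return urllib.request.Request(
        f"{BASE_URL}/query_collection",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it to a running server. The shape of the response is not documented here, so the sketch stops at the request.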

### List Collections

- **GET** `/list_collections`
  - List all collections for a specific user.
  - Parameters:
    - `user_id`: User identifier.
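
For completeness, a small sketch of the corresponding request follows. It assumes `user_id` is a query parameter and that the server runs at `http://localhost:8000`; `build_list_collections_request` is a hypothetical name.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port


def build_list_collections_request(user_id):
    """Build a GET /list_collections request (hypothetical helper)."""
    # urlencode percent-encodes characters such as "@" in the user ID.
    query = urlencode({"user_id": user_id})
    return urllib.request.Request(f"{BASE_URL}/list_collections?{query}", method="GET")
```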

### Delete Collection

- **DELETE** `/delete_collection/{collection_id}`
  - Delete a specific collection.
  - Parameters:
    - `collection_id`: ID of the collection to delete.
    - `user_id`: User identifier.
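
The sketch below shows one way a client might issue the delete call and report the HTTP status, including error statuses. As before, the server address, the use of a query parameter for `user_id`, and both function names are assumptions, not part of the documented API.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port


def build_delete_collection_request(collection_id, user_id):
    """Build a DELETE /delete_collection/{collection_id} request (hypothetical helper)."""
    path = f"/delete_collection/{quote(str(collection_id), safe='')}"
    query = urlencode({"user_id": user_id})  # assumption: query parameter
    return urllib.request.Request(f"{BASE_URL}{path}?{query}", method="DELETE")


def delete_collection(collection_id, user_id):
    """Send the request and return the HTTP status code, including 4xx/5xx codes."""
    req = build_delete_collection_request(collection_id, user_id)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses; surface the code instead.
        return err.code
```

Returning the status code rather than raising keeps the caller's logic simple; which codes the application actually returns for a missing collection is not specified in this README.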

## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.