Thomas committed
Commit d8d07fe · 1 Parent(s): 998e8ac

update api for audio task

README.md CHANGED
@@ -1,71 +1,9 @@
- ---
- title: Submission Template
- emoji: 🔥
- colorFrom: yellow
- colorTo: green
- sdk: docker
- pinned: false
- ---


- # Random Baseline Model for Climate Disinformation Classification

- ## Model Description
-
- This is a random baseline model for the Frugal AI Challenge 2024, specifically for the text classification task of identifying climate disinformation. The model serves as a performance floor, randomly assigning labels to text inputs without any learning.
-
- ### Intended Use
-
- - **Primary intended uses**: Baseline comparison for climate disinformation classification models
- - **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
- - **Out-of-scope use cases**: Not intended for production use or real-world classification tasks
-
- ## Training Data
-
- The model uses the QuotaClimat/frugalaichallenge-text-train dataset:
- - Size: ~6000 examples
- - Split: 80% train, 20% test
- - 8 categories of climate disinformation claims
-
- ### Labels
- 0. No relevant claim detected
- 1. Global warming is not happening
- 2. Not caused by humans
- 3. Not bad or beneficial
- 4. Solutions harmful/unnecessary
- 5. Science is unreliable
- 6. Proponents are biased
- 7. Fossil fuels are needed
-
- ## Performance
-
- ### Metrics
- - **Accuracy**: ~12.5% (random chance with 8 classes)
- - **Environmental Impact**:
-   - Emissions tracked in gCO2eq
-   - Energy consumption tracked in Wh
-
- ### Model Architecture
- The model implements a random choice between the 8 possible labels, serving as the simplest possible baseline.
-
- ## Environmental Impact
-
- Environmental impact is tracked using CodeCarbon, measuring:
- - Carbon emissions during inference
- - Energy consumption during inference
-
- This tracking helps establish a baseline for the environmental impact of model deployment and inference.
-
- ## Limitations
- - Makes completely random predictions
- - No learning or pattern recognition
- - No consideration of input text
- - Serves only as a baseline reference
- - Not suitable for any real-world applications
-
- ## Ethical Considerations
-
- - Dataset contains sensitive topics related to climate disinformation
- - Model makes random predictions and should not be used for actual classification
- - Environmental impact is tracked to promote awareness of AI's carbon footprint
- ```
 
+ # Submission API

+ ## Dev locally
+ To develop locally:
+ - `docker build -t myname .`
+ - `docker run -d --name myname -p 7860:7860 myname`
+ - then access the api locally through: http://0.0.0.0:7860/
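Once the container from the steps above is running, the root route can be checked from Python. The following is a minimal sketch using only the standard library. `fetch_root` and `list_endpoints` are hypothetical helper names, not part of this repository, and the URL uses `localhost` on the assumption that port 7860 is published as in the `docker run` command (`0.0.0.0` is a bind address and may not be reachable as a destination on every platform).

```python
import json
from urllib.request import urlopen

# Assumption: the container publishes port 7860 on the local machine.
BASE_URL = "http://localhost:7860"


def list_endpoints(payload: dict) -> list[str]:
    """Return the sorted endpoint names advertised by the root route."""
    return sorted(payload.get("endpoints", {}))


def fetch_root(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET the root route and decode its JSON body."""
    with urlopen(f"{base_url}/", timeout=timeout) as response:
        return json.load(response)
```

With the container up, `list_endpoints(fetch_root())` should list the routes that `app.py` advertises in its root response.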
 
 
app.py CHANGED
@@ -1,6 +1,6 @@
  from fastapi import FastAPI
  from dotenv import load_dotenv
- from tasks import text, image, audio
+ from tasks import audio

  # Load environment variables
  load_dotenv()
@@ -11,8 +11,6 @@ app = FastAPI(
  )

  # Include all routers
- app.include_router(text.router)
- app.include_router(image.router)
  app.include_router(audio.router)

  @app.get("/")
@@ -20,8 +18,6 @@ async def root():
      return {
          "message": "Welcome to the Frugal AI Challenge API",
          "endpoints": {
-             "text": "/text - Text classification task",
-             "image": "/image - Image classification task (coming soon)",
              "audio": "/audio - Audio classification task (coming soon)"
          }
      }
notebooks/template-image.ipynb DELETED
@@ -1,416 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Image task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 13,
14
- "metadata": {},
15
- "outputs": [],
16
- "source": [
17
- "from fastapi import APIRouter\n",
18
- "from datetime import datetime\n",
19
- "from datasets import load_dataset\n",
20
- "from sklearn.metrics import accuracy_score, precision_score, recall_score\n",
21
- "\n",
22
- "import random\n",
23
- "\n",
24
- "import sys\n",
25
- "sys.path.append('../')\n",
26
- "\n",
27
- "from tasks.utils.evaluation import ImageEvaluationRequest\n",
28
- "from tasks.utils.emissions import tracker, clean_emissions_data, get_space_info\n",
29
- "from tasks.image import parse_boxes,compute_iou,compute_max_iou"
30
- ]
31
- },
32
- {
33
- "cell_type": "markdown",
34
- "metadata": {},
35
- "source": [
36
- "## Loading the datasets and splitting them"
37
- ]
38
- },
39
- {
40
- "cell_type": "code",
41
- "execution_count": 4,
42
- "metadata": {},
43
- "outputs": [
44
- {
45
- "data": {
46
- "application/vnd.jupyter.widget-view+json": {
47
- "model_id": "4f62b23ca587477d9f37430e687bf951",
48
- "version_major": 2,
49
- "version_minor": 0
50
- },
51
- "text/plain": [
52
- "README.md: 0%| | 0.00/7.72k [00:00<?, ?B/s]"
53
- ]
54
- },
55
- "metadata": {},
56
- "output_type": "display_data"
57
- },
58
- {
59
- "name": "stderr",
60
- "output_type": "stream",
61
- "text": [
62
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--pyronear--pyro-sdis. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
63
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
64
- " warnings.warn(message)\n"
65
- ]
66
- },
67
- {
68
- "data": {
69
- "application/vnd.jupyter.widget-view+json": {
70
- "model_id": "70735dd748e343119b5a7cd966dcd0f0",
71
- "version_major": 2,
72
- "version_minor": 0
73
- },
74
- "text/plain": [
75
- "train-00000-of-00007.parquet: 0%| | 0.00/433M [00:00<?, ?B/s]"
76
- ]
77
- },
78
- "metadata": {},
79
- "output_type": "display_data"
80
- },
81
- {
82
- "data": {
83
- "application/vnd.jupyter.widget-view+json": {
84
- "model_id": "903c3227c24649f1a0424e039d74d303",
85
- "version_major": 2,
86
- "version_minor": 0
87
- },
88
- "text/plain": [
89
- "train-00001-of-00007.parquet: 0%| | 0.00/434M [00:00<?, ?B/s]"
90
- ]
91
- },
92
- "metadata": {},
93
- "output_type": "display_data"
94
- },
95
- {
96
- "data": {
97
- "application/vnd.jupyter.widget-view+json": {
98
- "model_id": "8795b7696f124715b9d52287d5cd4ee0",
99
- "version_major": 2,
100
- "version_minor": 0
101
- },
102
- "text/plain": [
103
- "train-00002-of-00007.parquet: 0%| | 0.00/432M [00:00<?, ?B/s]"
104
- ]
105
- },
106
- "metadata": {},
107
- "output_type": "display_data"
108
- },
109
- {
110
- "data": {
111
- "application/vnd.jupyter.widget-view+json": {
112
- "model_id": "4b6c1240bf024d61bf913584d13834f5",
113
- "version_major": 2,
114
- "version_minor": 0
115
- },
116
- "text/plain": [
117
- "train-00003-of-00007.parquet: 0%| | 0.00/428M [00:00<?, ?B/s]"
118
- ]
119
- },
120
- "metadata": {},
121
- "output_type": "display_data"
122
- },
123
- {
124
- "data": {
125
- "application/vnd.jupyter.widget-view+json": {
126
- "model_id": "cd5f8172a31f4fd79d489db96ede9c21",
127
- "version_major": 2,
128
- "version_minor": 0
129
- },
130
- "text/plain": [
131
- "train-00004-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
132
- ]
133
- },
134
- "metadata": {},
135
- "output_type": "display_data"
136
- },
137
- {
138
- "data": {
139
- "application/vnd.jupyter.widget-view+json": {
140
- "model_id": "416af82dba3a4ab7ad13190703c90757",
141
- "version_major": 2,
142
- "version_minor": 0
143
- },
144
- "text/plain": [
145
- "train-00005-of-00007.parquet: 0%| | 0.00/429M [00:00<?, ?B/s]"
146
- ]
147
- },
148
- "metadata": {},
149
- "output_type": "display_data"
150
- },
151
- {
152
- "data": {
153
- "application/vnd.jupyter.widget-view+json": {
154
- "model_id": "6819ad85508641a1a64bea34303446ac",
155
- "version_major": 2,
156
- "version_minor": 0
157
- },
158
- "text/plain": [
159
- "train-00006-of-00007.parquet: 0%| | 0.00/431M [00:00<?, ?B/s]"
160
- ]
161
- },
162
- "metadata": {},
163
- "output_type": "display_data"
164
- },
165
- {
166
- "data": {
167
- "application/vnd.jupyter.widget-view+json": {
168
- "model_id": "90a7f85c802b4330b502c8bbd3cca7f9",
169
- "version_major": 2,
170
- "version_minor": 0
171
- },
172
- "text/plain": [
173
- "val-00000-of-00001.parquet: 0%| | 0.00/407M [00:00<?, ?B/s]"
174
- ]
175
- },
176
- "metadata": {},
177
- "output_type": "display_data"
178
- },
179
- {
180
- "data": {
181
- "application/vnd.jupyter.widget-view+json": {
182
- "model_id": "b93f2f19aafb43e2b8db0fd7bb3ebd34",
183
- "version_major": 2,
184
- "version_minor": 0
185
- },
186
- "text/plain": [
187
- "Generating train split: 0%| | 0/29537 [00:00<?, ? examples/s]"
188
- ]
189
- },
190
- "metadata": {},
191
- "output_type": "display_data"
192
- },
193
- {
194
- "data": {
195
- "application/vnd.jupyter.widget-view+json": {
196
- "model_id": "c14c0f2cde184c959970dfccaa26b2d2",
197
- "version_major": 2,
198
- "version_minor": 0
199
- },
200
- "text/plain": [
201
- "Generating val split: 0%| | 0/4099 [00:00<?, ? examples/s]"
202
- ]
203
- },
204
- "metadata": {},
205
- "output_type": "display_data"
206
- }
207
- ],
208
- "source": [
209
- "request = ImageEvaluationRequest()\n",
210
- "\n",
211
- "# Load and prepare the dataset\n",
212
- "dataset = load_dataset(request.dataset_name)\n",
213
- "\n",
214
- "# Split dataset\n",
215
- "train_test = dataset[\"train\"].train_test_split(test_size=request.test_size, seed=request.test_seed)\n",
216
- "test_dataset = train_test[\"test\"]"
217
- ]
218
- },
219
- {
220
- "cell_type": "markdown",
221
- "metadata": {},
222
- "source": [
223
- "## Random Baseline"
224
- ]
225
- },
226
- {
227
- "cell_type": "code",
228
- "execution_count": 10,
229
- "metadata": {},
230
- "outputs": [],
231
- "source": [
232
- "# Start tracking emissions\n",
233
- "tracker.start()\n",
234
- "tracker.start_task(\"inference\")"
235
- ]
236
- },
237
- {
238
- "cell_type": "code",
239
- "execution_count": 11,
240
- "metadata": {},
241
- "outputs": [],
242
- "source": [
243
- "\n",
244
- "#--------------------------------------------------------------------------------------------\n",
245
- "# YOUR MODEL INFERENCE CODE HERE\n",
246
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
247
- "#-------------------------------------------------------------------------------------------- \n",
248
- "\n",
249
- "# Make random predictions (placeholder for actual model inference)\n",
250
- "\n",
251
- "predictions = []\n",
252
- "true_labels = []\n",
253
- "pred_boxes = []\n",
254
- "true_boxes_list = [] # List of lists, each inner list contains boxes for one image\n",
255
- "\n",
256
- "for example in test_dataset:\n",
257
- " # Parse true annotation (YOLO format: class_id x_center y_center width height)\n",
258
- " annotation = example.get(\"annotations\", \"\").strip()\n",
259
- " has_smoke = len(annotation) > 0\n",
260
- " true_labels.append(int(has_smoke))\n",
261
- " \n",
262
- " # Make random classification prediction\n",
263
- " pred_has_smoke = random.random() > 0.5\n",
264
- " predictions.append(int(pred_has_smoke))\n",
265
- " \n",
266
- " # If there's a true box, parse it and make random box prediction\n",
267
- " if has_smoke:\n",
268
- " # Parse all true boxes from the annotation\n",
269
- " image_true_boxes = parse_boxes(annotation)\n",
270
- " true_boxes_list.append(image_true_boxes)\n",
271
- " \n",
272
- " # For baseline, make one random box prediction per image\n",
273
- " # In a real model, you might want to predict multiple boxes\n",
274
- " random_box = [\n",
275
- " random.random(), # x_center\n",
276
- " random.random(), # y_center\n",
277
- " random.random() * 0.5, # width (max 0.5)\n",
278
- " random.random() * 0.5 # height (max 0.5)\n",
279
- " ]\n",
280
- " pred_boxes.append(random_box)\n",
281
- "\n",
282
- "\n",
283
- "#--------------------------------------------------------------------------------------------\n",
284
- "# YOUR MODEL INFERENCE STOPS HERE\n",
285
- "#-------------------------------------------------------------------------------------------- "
286
- ]
287
- },
288
- {
289
- "cell_type": "code",
290
- "execution_count": null,
291
- "metadata": {},
292
- "outputs": [],
293
- "source": [
294
- "# Stop tracking emissions\n",
295
- "emissions_data = tracker.stop_task()"
296
- ]
297
- },
298
- {
299
- "cell_type": "code",
300
- "execution_count": 15,
301
- "metadata": {},
302
- "outputs": [],
303
- "source": [
304
- "import numpy as np\n",
305
- "\n",
306
- "# Calculate classification metrics\n",
307
- "classification_accuracy = accuracy_score(true_labels, predictions)\n",
308
- "classification_precision = precision_score(true_labels, predictions)\n",
309
- "classification_recall = recall_score(true_labels, predictions)\n",
310
- "\n",
311
- "# Calculate mean IoU for object detection (only for images with smoke)\n",
312
- "# For each image, we compute the max IoU between the predicted box and all true boxes\n",
313
- "ious = []\n",
314
- "for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):\n",
315
- " max_iou = compute_max_iou(true_boxes, pred_box)\n",
316
- " ious.append(max_iou)\n",
317
- "\n",
318
- "mean_iou = float(np.mean(ious)) if ious else 0.0"
319
- ]
320
- },
321
- {
322
- "cell_type": "code",
323
- "execution_count": 18,
324
- "metadata": {},
325
- "outputs": [
326
- {
327
- "data": {
328
- "text/plain": [
329
- "{'submission_timestamp': '2025-01-22T15:57:37.288173',\n",
330
- " 'classification_accuracy': 0.5001692620176033,\n",
331
- " 'classification_precision': 0.8397129186602871,\n",
332
- " 'classification_recall': 0.4972677595628415,\n",
333
- " 'mean_iou': 0.002819781629108398,\n",
334
- " 'energy_consumed_wh': 0.779355299496116,\n",
335
- " 'emissions_gco2eq': 0.043674291628462855,\n",
336
- " 'emissions_data': {'run_id': '4e750cd5-60f0-444c-baee-b5f7b31f784b',\n",
337
- " 'duration': 51.72819679998793,\n",
338
- " 'emissions': 4.3674291628462856e-05,\n",
339
- " 'emissions_rate': 8.445163379568943e-07,\n",
340
- " 'cpu_power': 42.5,\n",
341
- " 'gpu_power': 0.0,\n",
342
- " 'ram_power': 11.755242347717285,\n",
343
- " 'cpu_energy': 0.0006104993474311617,\n",
344
- " 'gpu_energy': 0,\n",
345
- " 'ram_energy': 0.00016885595206495442,\n",
346
- " 'energy_consumed': 0.0007793552994961161,\n",
347
- " 'country_name': 'France',\n",
348
- " 'country_iso_code': 'FRA',\n",
349
- " 'region': 'île-de-france',\n",
350
- " 'cloud_provider': '',\n",
351
- " 'cloud_region': '',\n",
352
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
353
- " 'python_version': '3.12.7',\n",
354
- " 'codecarbon_version': '3.0.0_rc0',\n",
355
- " 'cpu_count': 12,\n",
356
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
357
- " 'gpu_count': None,\n",
358
- " 'gpu_model': None,\n",
359
- " 'ram_total_size': 31.347312927246094,\n",
360
- " 'tracking_mode': 'machine',\n",
361
- " 'on_cloud': 'N',\n",
362
- " 'pue': 1.0},\n",
363
- " 'dataset_config': {'dataset_name': 'pyronear/pyro-sdis',\n",
364
- " 'test_size': 0.2,\n",
365
- " 'test_seed': 42}}"
366
- ]
367
- },
368
- "execution_count": 18,
369
- "metadata": {},
370
- "output_type": "execute_result"
371
- }
372
- ],
373
- "source": [
374
- "\n",
375
- "# Prepare results dictionary\n",
376
- "results = {\n",
377
- " \"submission_timestamp\": datetime.now().isoformat(),\n",
378
- " \"classification_accuracy\": float(classification_accuracy),\n",
379
- " \"classification_precision\": float(classification_precision),\n",
380
- " \"classification_recall\": float(classification_recall),\n",
381
- " \"mean_iou\": mean_iou,\n",
382
- " \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
383
- " \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
384
- " \"emissions_data\": clean_emissions_data(emissions_data),\n",
385
- " \"dataset_config\": {\n",
386
- " \"dataset_name\": request.dataset_name,\n",
387
- " \"test_size\": request.test_size,\n",
388
- " \"test_seed\": request.test_seed\n",
389
- " }\n",
390
- "}\n",
391
- "results"
392
- ]
393
- }
394
- ],
395
- "metadata": {
396
- "kernelspec": {
397
- "display_name": "base",
398
- "language": "python",
399
- "name": "python3"
400
- },
401
- "language_info": {
402
- "codemirror_mode": {
403
- "name": "ipython",
404
- "version": 3
405
- },
406
- "file_extension": ".py",
407
- "mimetype": "text/x-python",
408
- "name": "python",
409
- "nbconvert_exporter": "python",
410
- "pygments_lexer": "ipython3",
411
- "version": "3.12.7"
412
- }
413
- },
414
- "nbformat": 4,
415
- "nbformat_minor": 2
416
- }
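The deleted notebook scores box predictions with `compute_max_iou` from `tasks.image`, whose source is not shown in this diff. As a rough illustration of what such a metric computes, here is a sketch assuming boxes are `[x_center, y_center, width, height]` in normalized YOLO coordinates, as the notebook comments describe. `yolo_to_corners`, `iou`, and `max_iou` are hypothetical names, not the project's functions.

```python
def yolo_to_corners(box):
    """Convert [x_center, y_center, width, height] to (x1, y1, x2, y2)."""
    x_c, y_c, w, h = box
    return (x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2)


def iou(box_a, box_b):
    """Intersection over union of two YOLO-format boxes."""
    ax1, ay1, ax2, ay2 = yolo_to_corners(box_a)
    bx1, by1, bx2, by2 = yolo_to_corners(box_b)
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - intersection
    return intersection / union if union > 0 else 0.0


def max_iou(true_boxes, pred_box):
    """Best IoU between one predicted box and any ground-truth box."""
    return max((iou(t, pred_box) for t in true_boxes), default=0.0)
```

Taking the maximum over the ground-truth boxes matches the notebook's description of comparing one predicted box per image against all true boxes. A random box rarely overlaps a true one, which fits the very low baseline `mean_iou` recorded in the notebook output.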
notebooks/template-text.ipynb DELETED
@@ -1,1642 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Text task notebook template\n",
8
- "## Loading the necessary libraries"
9
- ]
10
- },
11
- {
12
- "cell_type": "code",
13
- "execution_count": 3,
14
- "metadata": {},
15
- "outputs": [
16
- {
17
- "name": "stderr",
18
- "output_type": "stream",
19
- "text": [
20
- "[codecarbon WARNING @ 19:48:07] Multiple instances of codecarbon are allowed to run at the same time.\n",
21
- "[codecarbon INFO @ 19:48:07] [setup] RAM Tracking...\n",
22
- "[codecarbon INFO @ 19:48:07] [setup] CPU Tracking...\n",
23
- "[codecarbon WARNING @ 19:48:09] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
24
- "[codecarbon WARNING @ 19:48:09] No CPU tracking mode found. Falling back on CPU constant mode. \n",
25
- " Windows OS detected: Please install Intel Power Gadget to measure CPU\n",
26
- "\n",
27
- "[codecarbon WARNING @ 19:48:11] We saw that you have a 13th Gen Intel(R) Core(TM) i7-1365U but we don't know it. Please contact us.\n",
28
- "[codecarbon INFO @ 19:48:11] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1365U\n",
29
- "[codecarbon WARNING @ 19:48:11] No CPU tracking mode found. Falling back on CPU constant mode.\n",
30
- "[codecarbon INFO @ 19:48:11] [setup] GPU Tracking...\n",
31
- "[codecarbon INFO @ 19:48:11] No GPU found.\n",
32
- "[codecarbon INFO @ 19:48:11] >>> Tracker's metadata:\n",
33
- "[codecarbon INFO @ 19:48:11] Platform system: Windows-11-10.0.22631-SP0\n",
34
- "[codecarbon INFO @ 19:48:11] Python version: 3.12.7\n",
35
- "[codecarbon INFO @ 19:48:11] CodeCarbon version: 3.0.0_rc0\n",
36
- "[codecarbon INFO @ 19:48:11] Available RAM : 31.347 GB\n",
37
- "[codecarbon INFO @ 19:48:11] CPU count: 12\n",
38
- "[codecarbon INFO @ 19:48:11] CPU model: 13th Gen Intel(R) Core(TM) i7-1365U\n",
39
- "[codecarbon INFO @ 19:48:11] GPU count: None\n",
40
- "[codecarbon INFO @ 19:48:11] GPU model: None\n",
41
- "[codecarbon INFO @ 19:48:11] Saving emissions data to file c:\\git\\submission-template\\notebooks\\emissions.csv\n"
42
- ]
43
- }
44
- ],
45
- "source": [
46
- "from fastapi import APIRouter\n",
47
- "from datetime import datetime\n",
48
- "from datasets import load_dataset\n",
49
- "from sklearn.metrics import accuracy_score\n",
50
- "import random\n",
51
- "\n",
52
- "import sys\n",
53
- "sys.path.append('../tasks')\n",
54
- "\n",
55
- "from utils.evaluation import TextEvaluationRequest\n",
56
- "from utils.emissions import tracker, clean_emissions_data, get_space_info\n",
57
- "\n",
58
- "\n",
59
- "# Define the label mapping\n",
60
- "LABEL_MAPPING = {\n",
61
- " \"0_not_relevant\": 0,\n",
62
- " \"1_not_happening\": 1,\n",
63
- " \"2_not_human\": 2,\n",
64
- " \"3_not_bad\": 3,\n",
65
- " \"4_solutions_harmful_unnecessary\": 4,\n",
66
- " \"5_science_unreliable\": 5,\n",
67
- " \"6_proponents_biased\": 6,\n",
68
- " \"7_fossil_fuels_needed\": 7\n",
69
- "}"
70
- ]
71
- },
72
- {
73
- "cell_type": "markdown",
74
- "metadata": {},
75
- "source": [
76
- "## Loading the datasets and splitting them"
77
- ]
78
- },
79
- {
80
- "cell_type": "code",
81
- "execution_count": 4,
82
- "metadata": {},
83
- "outputs": [
84
- {
85
- "data": {
86
- "application/vnd.jupyter.widget-view+json": {
87
- "model_id": "668da7bf85434e098b95c3ec447d78fe",
88
- "version_major": 2,
89
- "version_minor": 0
90
- },
91
- "text/plain": [
92
- "README.md: 0%| | 0.00/5.18k [00:00<?, ?B/s]"
93
- ]
94
- },
95
- "metadata": {},
96
- "output_type": "display_data"
97
- },
98
- {
99
- "name": "stderr",
100
- "output_type": "stream",
101
- "text": [
102
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\datasets--QuotaClimat--frugalaichallenge-text-train. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
103
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
104
- " warnings.warn(message)\n"
105
- ]
106
- },
107
- {
108
- "data": {
109
- "application/vnd.jupyter.widget-view+json": {
110
- "model_id": "5b68d43359eb429395da8be7d4b15556",
111
- "version_major": 2,
112
- "version_minor": 0
113
- },
114
- "text/plain": [
115
- "train.parquet: 0%| | 0.00/1.21M [00:00<?, ?B/s]"
116
- ]
117
- },
118
- "metadata": {},
119
- "output_type": "display_data"
120
- },
121
- {
122
- "data": {
123
- "application/vnd.jupyter.widget-view+json": {
124
- "model_id": "140a304773914e9db8f698eabeb40298",
125
- "version_major": 2,
126
- "version_minor": 0
127
- },
128
- "text/plain": [
129
- "Generating train split: 0%| | 0/6091 [00:00<?, ? examples/s]"
130
- ]
131
- },
132
- "metadata": {},
133
- "output_type": "display_data"
134
- },
135
- {
136
- "data": {
137
- "application/vnd.jupyter.widget-view+json": {
138
- "model_id": "6d04e8ab1906400e8e0029949dc523a5",
139
- "version_major": 2,
140
- "version_minor": 0
141
- },
142
- "text/plain": [
143
- "Map: 0%| | 0/6091 [00:00<?, ? examples/s]"
144
- ]
145
- },
146
- "metadata": {},
147
- "output_type": "display_data"
148
- }
149
- ],
150
- "source": [
151
- "request = TextEvaluationRequest()\n",
152
- "\n",
153
- "# Load and prepare the dataset\n",
154
- "dataset = load_dataset(request.dataset_name)\n",
155
- "\n",
156
- "# Convert string labels to integers\n",
157
- "dataset = dataset.map(lambda x: {\"label\": LABEL_MAPPING[x[\"label\"]]})\n",
158
- "\n",
159
- "# Split dataset\n",
160
- "train_test = dataset[\"train\"].train_test_split(test_size=request.test_size, seed=request.test_seed)\n",
161
- "test_dataset = train_test[\"test\"]"
162
- ]
163
- },
164
- {
165
- "cell_type": "markdown",
166
- "metadata": {},
167
- "source": [
168
- "## Random Baseline"
169
- ]
170
- },
171
- {
172
- "cell_type": "code",
173
- "execution_count": 5,
174
- "metadata": {},
175
- "outputs": [],
176
- "source": [
177
- "# Start tracking emissions\n",
178
- "tracker.start()\n",
179
- "tracker.start_task(\"inference\")"
180
- ]
181
- },
182
- {
183
- "cell_type": "code",
184
- "execution_count": 6,
185
- "metadata": {},
186
- "outputs": [
187
- {
188
- "data": {
189
- "text/plain": [
190
- "[1,\n",
191
- " 7,\n",
192
- " 6,\n",
193
- " 6,\n",
194
- " 2,\n",
195
- " 0,\n",
196
- " 1,\n",
197
- " 7,\n",
198
- " 3,\n",
199
- " 6,\n",
200
- " 6,\n",
201
- " 3,\n",
202
- " 6,\n",
203
- " 6,\n",
204
- " 5,\n",
205
- " 0,\n",
206
- " 2,\n",
207
- " 6,\n",
208
- " 2,\n",
209
- " 6,\n",
210
- " 5,\n",
211
- " 4,\n",
212
- " 1,\n",
213
- " 3,\n",
214
- " 6,\n",
215
- " 4,\n",
216
- " 2,\n",
217
- " 1,\n",
218
- " 4,\n",
219
- " 0,\n",
220
- " 3,\n",
221
- " 4,\n",
222
- " 1,\n",
223
- " 5,\n",
224
- " 5,\n",
225
- " 1,\n",
226
- " 2,\n",
227
- " 7,\n",
228
- " 6,\n",
229
- " 1,\n",
230
- " 3,\n",
231
- " 1,\n",
232
- " 7,\n",
233
- " 7,\n",
234
- " 0,\n",
235
- " 0,\n",
236
- " 3,\n",
237
- " 3,\n",
238
- " 3,\n",
239
- " 4,\n",
240
- " 1,\n",
241
- " 4,\n",
242
- " 4,\n",
243
- " 1,\n",
244
- " 4,\n",
245
- " 5,\n",
246
- " 6,\n",
247
- " 1,\n",
248
- " 2,\n",
249
- " 2,\n",
250
- " 2,\n",
251
- " 5,\n",
252
- " 2,\n",
253
- " 7,\n",
254
- " 2,\n",
255
- " 7,\n",
256
- " 7,\n",
257
- " 6,\n",
258
- " 4,\n",
259
- " 2,\n",
260
- " 0,\n",
261
- " 1,\n",
262
- " 6,\n",
263
- " 3,\n",
264
- " 2,\n",
265
- " 5,\n",
266
- " 5,\n",
267
- " 2,\n",
268
- " 0,\n",
269
- " 7,\n",
270
- " 0,\n",
271
- " 1,\n",
272
- " 5,\n",
273
- " 5,\n",
274
- " 7,\n",
275
- " 4,\n",
276
- " 6,\n",
277
- " 7,\n",
278
- " 1,\n",
279
- " 7,\n",
280
- " 1,\n",
281
- " 0,\n",
282
- " 3,\n",
283
- " 4,\n",
284
- " 2,\n",
285
- " 5,\n",
286
- " 3,\n",
287
- " 3,\n",
288
- " 3,\n",
289
- " 2,\n",
290
- " 2,\n",
291
- " 1,\n",
292
- " 0,\n",
293
- " 4,\n",
294
- " 5,\n",
295
- " 7,\n",
296
- " 0,\n",
297
- " 3,\n",
298
- " 1,\n",
299
- " 4,\n",
300
- " 6,\n",
301
- " 0,\n",
302
- " 7,\n",
303
- " 1,\n",
304
- " 1,\n",
305
- " 2,\n",
306
- " 2,\n",
307
- " 4,\n",
308
- " 0,\n",
309
- " 4,\n",
310
- " 3,\n",
311
- " 4,\n",
312
- " 4,\n",
313
- " 2,\n",
314
- " 2,\n",
315
- " 3,\n",
316
- " 3,\n",
317
- " 7,\n",
318
- " 4,\n",
319
- " 7,\n",
320
- " 6,\n",
321
- " 4,\n",
322
- " 5,\n",
323
- " 4,\n",
324
- " 3,\n",
325
- " 6,\n",
326
- " 0,\n",
327
- " 4,\n",
328
- " 0,\n",
329
- " 1,\n",
330
- " 3,\n",
331
- " 6,\n",
332
- " 7,\n",
333
- " 3,\n",
334
- " 3,\n",
335
- " 0,\n",
336
- " 1,\n",
337
- " 2,\n",
338
- " 4,\n",
339
- " 4,\n",
340
- " 3,\n",
341
- " 1,\n",
342
- " 2,\n",
343
- " 4,\n",
344
- " 3,\n",
345
- " 0,\n",
346
- " 5,\n",
347
- " 3,\n",
348
- " 6,\n",
349
- " 3,\n",
350
- " 6,\n",
351
- " 1,\n",
352
- " 3,\n",
353
- " 4,\n",
354
- " 5,\n",
355
- " 4,\n",
356
- " 0,\n",
357
- " 7,\n",
358
- " 3,\n",
359
- " 6,\n",
360
- " 7,\n",
361
- " 4,\n",
362
- " 4,\n",
363
- " 5,\n",
364
- " 3,\n",
365
- " 1,\n",
366
- " 7,\n",
367
- " 4,\n",
368
- " 1,\n",
369
- " 0,\n",
370
- " 3,\n",
371
- " 0,\n",
372
- " 5,\n",
373
- " 3,\n",
374
- " 6,\n",
375
- " 3,\n",
376
- " 0,\n",
377
- " 7,\n",
378
- " 2,\n",
379
- " 0,\n",
380
- " 4,\n",
381
- " 1,\n",
382
- " 2,\n",
383
- " 6,\n",
384
- " 3,\n",
385
- " 4,\n",
386
- " 4,\n",
387
- " 5,\n",
388
- " 1,\n",
389
- " 5,\n",
390
- " 4,\n",
391
- " 0,\n",
392
- " 1,\n",
393
- " 7,\n",
394
- " 3,\n",
395
- " 6,\n",
396
- " 0,\n",
397
- " 7,\n",
398
- " 4,\n",
399
- " 6,\n",
400
- " 3,\n",
401
- " 0,\n",
402
- " 0,\n",
403
- " 4,\n",
404
- " 6,\n",
405
- " 6,\n",
406
- " 4,\n",
407
- " 0,\n",
408
- " 5,\n",
409
- " 7,\n",
410
- " 5,\n",
411
- " 1,\n",
412
- " 3,\n",
413
- " 6,\n",
414
- " 2,\n",
415
- " 3,\n",
416
- " 2,\n",
417
- " 4,\n",
418
- " 5,\n",
419
- " 1,\n",
420
- " 5,\n",
421
- " 0,\n",
422
- " 3,\n",
423
- " 3,\n",
424
- " 0,\n",
425
- " 0,\n",
426
- " 6,\n",
427
- " 6,\n",
428
- " 2,\n",
429
- " 0,\n",
430
- " 7,\n",
431
- " 4,\n",
432
- " 5,\n",
433
- " 7,\n",
434
- " 1,\n",
435
- " 0,\n",
436
- " 4,\n",
437
- " 5,\n",
438
- " 1,\n",
439
- " 7,\n",
440
- " 0,\n",
441
- " 7,\n",
442
- " 2,\n",
443
- " 6,\n",
444
- " 1,\n",
445
- " 3,\n",
446
- " 5,\n",
447
- " 5,\n",
448
- " 6,\n",
449
- " 5,\n",
450
- " 4,\n",
451
- " 3,\n",
452
- " 7,\n",
453
- " 4,\n",
454
- " 3,\n",
455
- " 5,\n",
456
- " 5,\n",
457
- " 7,\n",
458
- " 2,\n",
459
- " 6,\n",
460
- " 1,\n",
461
- " 5,\n",
462
- " 0,\n",
463
- " 3,\n",
464
- " 4,\n",
465
- " 2,\n",
466
- " 3,\n",
467
- " 7,\n",
468
- " 0,\n",
469
- " 1,\n",
470
- " 7,\n",
471
- " 6,\n",
472
- " 7,\n",
473
- " 7,\n",
474
- " 5,\n",
475
- " 6,\n",
476
- " 3,\n",
477
- " 2,\n",
478
- " 3,\n",
479
- " 0,\n",
480
- " 4,\n",
481
- " 3,\n",
482
- " 5,\n",
483
- " 6,\n",
484
- " 0,\n",
485
- " 0,\n",
486
- " 6,\n",
487
- " 6,\n",
488
- " 1,\n",
489
- " 4,\n",
490
- " 0,\n",
491
- " 4,\n",
492
- " 2,\n",
493
- " 7,\n",
494
- " 5,\n",
495
- " 7,\n",
496
- " 6,\n",
497
- " 3,\n",
498
- " 5,\n",
499
- " 6,\n",
500
- " 0,\n",
501
- " 4,\n",
502
- " 5,\n",
503
- " 6,\n",
504
- " 1,\n",
505
- " 2,\n",
506
- " 1,\n",
507
- " 5,\n",
508
- " 3,\n",
509
- " 0,\n",
510
- " 3,\n",
511
- " 7,\n",
512
- " 1,\n",
513
- " 0,\n",
514
- " 7,\n",
515
- " 0,\n",
516
- " 1,\n",
517
- " 0,\n",
518
- " 4,\n",
519
- " 1,\n",
520
- " 1,\n",
521
- " 0,\n",
522
- " 7,\n",
523
- " 1,\n",
524
- " 0,\n",
525
- " 7,\n",
526
- " 6,\n",
527
- " 2,\n",
528
- " 3,\n",
529
- " 7,\n",
530
- " 4,\n",
531
- " 3,\n",
532
- " 4,\n",
533
- " 3,\n",
534
- " 3,\n",
535
- " 2,\n",
536
- " 5,\n",
537
- " 1,\n",
538
- " 5,\n",
539
- " 1,\n",
540
- " 7,\n",
541
- " 3,\n",
542
- " 2,\n",
543
- " 6,\n",
544
- " 4,\n",
545
- " 4,\n",
546
- " 1,\n",
547
- " 2,\n",
548
- " 6,\n",
549
- " 7,\n",
550
- " 2,\n",
551
- " 7,\n",
552
- " 1,\n",
553
- " 3,\n",
554
- " 5,\n",
555
- " 2,\n",
556
- " 6,\n",
557
- " 4,\n",
558
- " 6,\n",
559
- " 7,\n",
560
- " 0,\n",
561
- " 5,\n",
562
- " 1,\n",
563
- " 6,\n",
564
- " 5,\n",
565
- " 3,\n",
566
- " 6,\n",
567
- " 5,\n",
568
- " 4,\n",
569
- " 7,\n",
570
- " 6,\n",
571
- " 5,\n",
572
- " 4,\n",
573
- " 3,\n",
574
- " 0,\n",
575
- " 0,\n",
576
- " 1,\n",
577
- " 7,\n",
578
- " 7,\n",
579
- " 6,\n",
580
- " 1,\n",
581
- " 4,\n",
582
- " 5,\n",
583
- " 6,\n",
584
- " 1,\n",
585
- " 5,\n",
586
- " 1,\n",
587
- " 2,\n",
588
- " 6,\n",
589
- " 2,\n",
590
- " 6,\n",
591
- " 0,\n",
592
- " 2,\n",
593
- " 1,\n",
594
- " 5,\n",
595
- " 5,\n",
596
- " 1,\n",
597
- " 7,\n",
598
- " 0,\n",
599
- " 5,\n",
600
- " 5,\n",
601
- " 1,\n",
602
- " 7,\n",
603
- " 7,\n",
604
- " 2,\n",
605
- " 1,\n",
606
- " 0,\n",
607
- " 1,\n",
608
- " 0,\n",
609
- " 5,\n",
610
- " 4,\n",
611
- " 2,\n",
612
- " 7,\n",
613
- " 4,\n",
614
- " 3,\n",
615
- " 6,\n",
616
- " 7,\n",
617
- " 5,\n",
618
- " 1,\n",
619
- " 0,\n",
620
- " 7,\n",
621
- " 2,\n",
622
- " 1,\n",
623
- " 2,\n",
624
- " 3,\n",
625
- " 1,\n",
626
- " 0,\n",
627
- " 3,\n",
628
- " 2,\n",
629
- " 6,\n",
630
- " 0,\n",
631
- " 5,\n",
632
- " 4,\n",
633
- " 7,\n",
634
- " 1,\n",
635
- " 1,\n",
636
- " 0,\n",
637
- " 7,\n",
638
- " 0,\n",
639
- " 6,\n",
640
- " 7,\n",
641
- " 6,\n",
642
- " 1,\n",
643
- " 5,\n",
644
- " 5,\n",
645
- " 7,\n",
646
- " 6,\n",
647
- " 1,\n",
648
- " 7,\n",
649
- " 6,\n",
650
- " 5,\n",
651
- " 4,\n",
652
- " 1,\n",
653
- " 4,\n",
654
- " 7,\n",
655
- " 5,\n",
656
- " 4,\n",
657
- " 0,\n",
658
- " 0,\n",
659
- " 7,\n",
660
- " 0,\n",
661
- " 0,\n",
662
- " 3,\n",
663
- " 6,\n",
664
- " 2,\n",
665
- " 5,\n",
666
- " 3,\n",
667
- " 0,\n",
668
- " 3,\n",
669
- " 6,\n",
670
- " 5,\n",
671
- " 7,\n",
672
- " 2,\n",
673
- " 6,\n",
674
- " 7,\n",
675
- " 5,\n",
676
- " 2,\n",
677
- " 3,\n",
678
- " 6,\n",
679
- " 7,\n",
680
- " 7,\n",
681
- " 7,\n",
682
- " 6,\n",
683
- " 1,\n",
684
- " 7,\n",
685
- " 4,\n",
686
- " 2,\n",
687
- " 7,\n",
688
- " 5,\n",
689
- " 4,\n",
690
- " 1,\n",
691
- " 2,\n",
692
- " 3,\n",
693
- " 7,\n",
694
- " 0,\n",
695
- " 2,\n",
696
- " 7,\n",
697
- " 6,\n",
698
- " 1,\n",
699
- " 4,\n",
700
- " 0,\n",
701
- " 6,\n",
702
- " 3,\n",
703
- " 1,\n",
704
- " 0,\n",
705
- " 3,\n",
706
- " 4,\n",
707
- " 7,\n",
708
- " 7,\n",
709
- " 4,\n",
710
- " 2,\n",
711
- " 1,\n",
712
- " 0,\n",
713
- " 5,\n",
714
- " 1,\n",
715
- " 7,\n",
716
- " 4,\n",
717
- " 6,\n",
718
- " 7,\n",
719
- " 7,\n",
720
- " 3,\n",
721
- " 4,\n",
722
- " 3,\n",
723
- " 5,\n",
724
- " 4,\n",
725
- " 4,\n",
726
- " 5,\n",
727
- " 0,\n",
728
- " 1,\n",
729
- " 3,\n",
730
- " 7,\n",
731
- " 5,\n",
732
- " 4,\n",
733
- " 7,\n",
734
- " 3,\n",
735
- " 3,\n",
736
- " 3,\n",
737
- " 5,\n",
738
- " 3,\n",
739
- " 3,\n",
740
- " 4,\n",
741
- " 0,\n",
742
- " 1,\n",
743
- " 7,\n",
744
- " 4,\n",
745
- " 7,\n",
746
- " 7,\n",
747
- " 5,\n",
748
- " 0,\n",
749
- " 0,\n",
750
- " 5,\n",
751
- " 2,\n",
752
- " 6,\n",
753
- " 2,\n",
754
- " 6,\n",
755
- " 7,\n",
756
- " 6,\n",
757
- " 5,\n",
758
- " 7,\n",
759
- " 5,\n",
760
- " 7,\n",
761
- " 1,\n",
762
- " 6,\n",
763
- " 6,\n",
764
- " 0,\n",
765
- " 4,\n",
766
- " 7,\n",
767
- " 3,\n",
768
- " 0,\n",
769
- " 0,\n",
770
- " 2,\n",
771
- " 5,\n",
772
- " 2,\n",
773
- " 3,\n",
774
- " 7,\n",
775
- " 1,\n",
776
- " 0,\n",
777
- " 3,\n",
778
- " 0,\n",
779
- " 0,\n",
780
- " 3,\n",
781
- " 3,\n",
782
- " 7,\n",
783
- " 3,\n",
784
- " 0,\n",
785
- " 1,\n",
786
- " 1,\n",
787
- " 6,\n",
788
- " 0,\n",
789
- " 0,\n",
790
- " 5,\n",
791
- " 0,\n",
792
- " 3,\n",
793
- " 4,\n",
794
- " 6,\n",
795
- " 7,\n",
796
- " 4,\n",
797
- " 0,\n",
798
- " 4,\n",
799
- " 4,\n",
800
- " 5,\n",
801
- " 4,\n",
802
- " 4,\n",
803
- " 3,\n",
804
- " 6,\n",
805
- " 5,\n",
806
- " 2,\n",
807
- " 0,\n",
808
- " 6,\n",
809
- " 0,\n",
810
- " 6,\n",
811
- " 4,\n",
812
- " 3,\n",
813
- " 5,\n",
814
- " 7,\n",
815
- " 7,\n",
816
- " 5,\n",
817
- " 5,\n",
818
- " 1,\n",
819
- " 5,\n",
820
- " 2,\n",
821
- " 7,\n",
822
- " 7,\n",
823
- " 6,\n",
824
- " 6,\n",
825
- " 7,\n",
826
- " 6,\n",
827
- " 5,\n",
828
- " 2,\n",
829
- " 4,\n",
830
- " 0,\n",
831
- " 4,\n",
832
- " 4,\n",
833
- " 7,\n",
834
- " 5,\n",
835
- " 2,\n",
836
- " 7,\n",
837
- " 0,\n",
838
- " 6,\n",
839
- " 0,\n",
840
- " 2,\n",
841
- " 6,\n",
842
- " 6,\n",
843
- " 2,\n",
844
- " 3,\n",
845
- " 0,\n",
846
- " 5,\n",
847
- " 0,\n",
848
- " 5,\n",
849
- " 7,\n",
850
- " 2,\n",
851
- " 7,\n",
852
- " 4,\n",
853
- " 7,\n",
854
- " 4,\n",
855
- " 0,\n",
856
- " 7,\n",
857
- " 1,\n",
858
- " 4,\n",
859
- " 5,\n",
860
- " 0,\n",
861
- " 5,\n",
862
- " 5,\n",
863
- " 2,\n",
864
- " 0,\n",
865
- " 2,\n",
866
- " 5,\n",
867
- " 5,\n",
868
- " 6,\n",
869
- " 3,\n",
870
- " 4,\n",
871
- " 1,\n",
872
- " 7,\n",
873
- " 7,\n",
874
- " 2,\n",
875
- " 3,\n",
876
- " 2,\n",
877
- " 5,\n",
878
- " 0,\n",
879
- " 7,\n",
880
- " 2,\n",
881
- " 3,\n",
882
- " 7,\n",
883
- " 2,\n",
884
- " 4,\n",
885
- " 0,\n",
886
- " 5,\n",
887
- " 7,\n",
888
- " 3,\n",
889
- " 6,\n",
890
- " 7,\n",
891
- " 6,\n",
892
- " 4,\n",
893
- " 3,\n",
894
- " 6,\n",
895
- " 5,\n",
896
- " 4,\n",
897
- " 0,\n",
898
- " 3,\n",
899
- " 4,\n",
900
- " 3,\n",
901
- " 5,\n",
902
- " 2,\n",
903
- " 4,\n",
904
- " 0,\n",
905
- " 3,\n",
906
- " 6,\n",
907
- " 1,\n",
908
- " 3,\n",
909
- " 1,\n",
910
- " 4,\n",
911
- " 3,\n",
912
- " 3,\n",
913
- " 3,\n",
914
- " 0,\n",
915
- " 7,\n",
916
- " 6,\n",
917
- " 2,\n",
918
- " 4,\n",
919
- " 6,\n",
920
- " 5,\n",
921
- " 4,\n",
922
- " 1,\n",
923
- " 7,\n",
924
- " 6,\n",
925
- " 1,\n",
926
- " 4,\n",
927
- " 3,\n",
928
- " 0,\n",
929
- " 7,\n",
930
- " 3,\n",
931
- " 1,\n",
932
- " 2,\n",
933
- " 1,\n",
934
- " 6,\n",
935
- " 4,\n",
936
- " 7,\n",
937
- " 1,\n",
938
- " 7,\n",
939
- " 1,\n",
940
- " 5,\n",
941
- " 1,\n",
942
- " 6,\n",
943
- " 3,\n",
944
- " 0,\n",
945
- " 2,\n",
946
- " 6,\n",
947
- " 7,\n",
948
- " 7,\n",
949
- " 0,\n",
950
- " 1,\n",
951
- " 4,\n",
952
- " 0,\n",
953
- " 4,\n",
954
- " 5,\n",
955
- " 3,\n",
956
- " 6,\n",
957
- " 2,\n",
958
- " 3,\n",
959
- " 4,\n",
960
- " 1,\n",
961
- " 6,\n",
962
- " 2,\n",
963
- " 4,\n",
964
- " 4,\n",
965
- " 6,\n",
966
- " 4,\n",
967
- " 5,\n",
968
- " 7,\n",
969
- " 1,\n",
970
- " 7,\n",
971
- " 7,\n",
972
- " 4,\n",
973
- " 7,\n",
974
- " 4,\n",
975
- " 3,\n",
976
- " 3,\n",
977
- " 6,\n",
978
- " 1,\n",
979
- " 2,\n",
980
- " 0,\n",
981
- " 0,\n",
982
- " 0,\n",
983
- " 2,\n",
984
- " 5,\n",
985
- " 6,\n",
986
- " 5,\n",
987
- " 7,\n",
988
- " 5,\n",
989
- " 7,\n",
990
- " 1,\n",
991
- " 1,\n",
992
- " 2,\n",
993
- " 1,\n",
994
- " 6,\n",
995
- " 5,\n",
996
- " 7,\n",
997
- " 0,\n",
998
- " 0,\n",
999
- " 5,\n",
1000
- " 5,\n",
1001
- " 0,\n",
1002
- " 3,\n",
1003
- " 7,\n",
1004
- " 5,\n",
1005
- " 2,\n",
1006
- " 5,\n",
1007
- " 4,\n",
1008
- " 2,\n",
1009
- " 3,\n",
1010
- " 6,\n",
1011
- " 2,\n",
1012
- " 3,\n",
1013
- " 6,\n",
1014
- " 0,\n",
1015
- " 0,\n",
1016
- " 2,\n",
1017
- " 6,\n",
1018
- " 0,\n",
1019
- " 1,\n",
1020
- " 3,\n",
1021
- " 3,\n",
1022
- " 6,\n",
1023
- " 4,\n",
1024
- " 6,\n",
1025
- " 4,\n",
1026
- " 6,\n",
1027
- " 0,\n",
1028
- " 0,\n",
1029
- " 2,\n",
1030
- " 3,\n",
1031
- " 6,\n",
1032
- " 2,\n",
1033
- " 2,\n",
1034
- " 6,\n",
1035
- " 6,\n",
1036
- " 2,\n",
1037
- " 4,\n",
1038
- " 3,\n",
1039
- " 3,\n",
1040
- " 6,\n",
1041
- " 7,\n",
1042
- " 7,\n",
1043
- " 1,\n",
1044
- " 1,\n",
1045
- " 7,\n",
1046
- " 7,\n",
1047
- " 6,\n",
1048
- " 1,\n",
1049
- " 7,\n",
1050
- " 0,\n",
1051
- " 0,\n",
1052
- " 2,\n",
1053
- " 4,\n",
1054
- " 2,\n",
1055
- " 2,\n",
1056
- " 3,\n",
1057
- " 0,\n",
1058
- " 1,\n",
1059
- " 4,\n",
1060
- " 0,\n",
1061
- " 4,\n",
1062
- " 6,\n",
1063
- " 5,\n",
1064
- " 3,\n",
1065
- " 2,\n",
1066
- " 3,\n",
1067
- " 2,\n",
1068
- " 3,\n",
1069
- " 6,\n",
1070
- " 2,\n",
1071
- " 1,\n",
1072
- " 4,\n",
1073
- " 7,\n",
1074
- " 6,\n",
1075
- " 4,\n",
1076
- " 5,\n",
1077
- " 6,\n",
1078
- " 7,\n",
1079
- " 7,\n",
1080
- " 2,\n",
1081
- " 0,\n",
1082
- " 5,\n",
1083
- " 5,\n",
1084
- " 0,\n",
1085
- " 3,\n",
1086
- " 6,\n",
1087
- " 6,\n",
1088
- " 5,\n",
1089
- " 4,\n",
1090
- " 4,\n",
1091
- " 7,\n",
1092
- " 0,\n",
1093
- " 5,\n",
1094
- " 1,\n",
1095
- " 7,\n",
1096
- " 0,\n",
1097
- " 3,\n",
1098
- " 1,\n",
1099
- " 7,\n",
1100
- " 0,\n",
1101
- " 1,\n",
1102
- " 4,\n",
1103
- " 7,\n",
1104
- " 5,\n",
1105
- " 0,\n",
1106
- " 4,\n",
1107
- " 0,\n",
1108
- " 0,\n",
1109
- " 1,\n",
1110
- " 0,\n",
1111
- " 6,\n",
1112
- " 4,\n",
1113
- " 0,\n",
1114
- " 5,\n",
1115
- " 4,\n",
1116
- " 6,\n",
1117
- " 6,\n",
1118
- " 7,\n",
1119
- " 2,\n",
1120
- " 6,\n",
1121
- " 2,\n",
1122
- " 6,\n",
1123
- " 0,\n",
1124
- " 3,\n",
1125
- " 2,\n",
1126
- " 2,\n",
1127
- " 1,\n",
1128
- " 5,\n",
1129
- " 4,\n",
1130
- " 7,\n",
1131
- " 6,\n",
1132
- " 6,\n",
1133
- " 2,\n",
1134
- " 5,\n",
1135
- " 5,\n",
1136
- " 5,\n",
1137
- " 0,\n",
1138
- " 3,\n",
1139
- " 5,\n",
1140
- " 4,\n",
1141
- " 5,\n",
1142
- " 7,\n",
1143
- " 5,\n",
1144
- " 0,\n",
1145
- " 5,\n",
1146
- " 0,\n",
1147
- " 0,\n",
1148
- " 2,\n",
1149
- " 0,\n",
1150
- " 2,\n",
1151
- " 1,\n",
1152
- " 0,\n",
1153
- " 2,\n",
1154
- " 4,\n",
1155
- " 3,\n",
1156
- " 4,\n",
1157
- " 1,\n",
1158
- " 7,\n",
1159
- " 2,\n",
1160
- " 1,\n",
1161
- " 0,\n",
1162
- " 3,\n",
1163
- " 0,\n",
1164
- " 3,\n",
1165
- " 1,\n",
1166
- " 1,\n",
1167
- " 0,\n",
1168
- " 5,\n",
1169
- " 3,\n",
1170
- " 1,\n",
1171
- " 2,\n",
1172
- " 5,\n",
1173
- " 6,\n",
1174
- " 7,\n",
1175
- " 6,\n",
1176
- " 7,\n",
1177
- " 0,\n",
1178
- " 2,\n",
1179
- " 6,\n",
1180
- " 3,\n",
1181
- " 1,\n",
1182
- " 5,\n",
1183
- " 4,\n",
1184
- " 2,\n",
1185
- " 4,\n",
1186
- " 6,\n",
1187
- " 5,\n",
1188
- " 2,\n",
1189
- " 7,\n",
1190
- " ...]"
1191
- ]
1192
- },
1193
- "execution_count": 6,
1194
- "metadata": {},
1195
- "output_type": "execute_result"
1196
- }
1197
- ],
1198
- "source": [
1199
- "\n",
1200
- "#--------------------------------------------------------------------------------------------\n",
1201
- "# YOUR MODEL INFERENCE CODE HERE\n",
1202
- "# Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.\n",
1203
- "#-------------------------------------------------------------------------------------------- \n",
1204
- "\n",
1205
- "# Make random predictions (placeholder for actual model inference)\n",
1206
- "true_labels = test_dataset[\"label\"]\n",
1207
- "predictions = [random.randint(0, 7) for _ in range(len(true_labels))]\n",
1208
- "\n",
1209
- "predictions\n",
1210
- "\n",
1211
- "#--------------------------------------------------------------------------------------------\n",
1212
- "# YOUR MODEL INFERENCE STOPS HERE\n",
1213
- "#-------------------------------------------------------------------------------------------- "
1214
- ]
1215
- },
1216
- {
1217
- "cell_type": "code",
1218
- "execution_count": 8,
1219
- "metadata": {},
1220
- "outputs": [
1221
- {
1222
- "name": "stderr",
1223
- "output_type": "stream",
1224
- "text": [
1225
- "[codecarbon WARNING @ 19:53:32] Background scheduler didn't run for a long period (47s), results might be inaccurate\n",
1226
- "[codecarbon INFO @ 19:53:32] Energy consumed for RAM : 0.000156 kWh. RAM Power : 11.755242347717285 W\n",
1227
- "[codecarbon INFO @ 19:53:32] Delta energy consumed for CPU with constant : 0.000564 kWh, power : 42.5 W\n",
1228
- "[codecarbon INFO @ 19:53:32] Energy consumed for All CPU : 0.000564 kWh\n",
1229
- "[codecarbon INFO @ 19:53:32] 0.000720 kWh of electricity used since the beginning.\n"
1230
- ]
1231
- },
1232
- {
1233
- "data": {
1234
- "text/plain": [
1235
- "EmissionsData(timestamp='2025-01-21T19:53:32', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=47.736408500000834, emissions=4.032368007471064e-05, emissions_rate=8.444466886328872e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.0005636615353475565, gpu_energy=0, ram_energy=0.00015590305493261682, energy_consumed=0.0007195645902801733, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
1236
- ]
1237
- },
1238
- "execution_count": 8,
1239
- "metadata": {},
1240
- "output_type": "execute_result"
1241
- }
1242
- ],
1243
- "source": [
1244
- "# Stop tracking emissions\n",
1245
- "emissions_data = tracker.stop_task()\n",
1246
- "emissions_data"
1247
- ]
1248
- },
1249
- {
1250
- "cell_type": "code",
1251
- "execution_count": 9,
1252
- "metadata": {},
1253
- "outputs": [
1254
- {
1255
- "data": {
1256
- "text/plain": [
1257
- "0.10090237899917966"
1258
- ]
1259
- },
1260
- "execution_count": 9,
1261
- "metadata": {},
1262
- "output_type": "execute_result"
1263
- }
1264
- ],
1265
- "source": [
1266
- "# Calculate accuracy\n",
1267
- "accuracy = accuracy_score(true_labels, predictions)\n",
1268
- "accuracy"
1269
- ]
1270
- },
1271
- {
1272
- "cell_type": "code",
1273
- "execution_count": 10,
1274
- "metadata": {},
1275
- "outputs": [
1276
- {
1277
- "data": {
1278
- "text/plain": [
1279
- "{'submission_timestamp': '2025-01-21T19:53:46.639165',\n",
1280
- " 'accuracy': 0.10090237899917966,\n",
1281
- " 'energy_consumed_wh': 0.7195645902801733,\n",
1282
- " 'emissions_gco2eq': 0.040323680074710634,\n",
1283
- " 'emissions_data': {'run_id': '908f2e7e-4bb2-4991-a0f6-56bf8d7eda21',\n",
1284
- " 'duration': 47.736408500000834,\n",
1285
- " 'emissions': 4.032368007471064e-05,\n",
1286
- " 'emissions_rate': 8.444466886328872e-07,\n",
1287
- " 'cpu_power': 42.5,\n",
1288
- " 'gpu_power': 0.0,\n",
1289
- " 'ram_power': 11.755242347717285,\n",
1290
- " 'cpu_energy': 0.0005636615353475565,\n",
1291
- " 'gpu_energy': 0,\n",
1292
- " 'ram_energy': 0.00015590305493261682,\n",
1293
- " 'energy_consumed': 0.0007195645902801733,\n",
1294
- " 'country_name': 'France',\n",
1295
- " 'country_iso_code': 'FRA',\n",
1296
- " 'region': 'île-de-france',\n",
1297
- " 'cloud_provider': '',\n",
1298
- " 'cloud_region': '',\n",
1299
- " 'os': 'Windows-11-10.0.22631-SP0',\n",
1300
- " 'python_version': '3.12.7',\n",
1301
- " 'codecarbon_version': '3.0.0_rc0',\n",
1302
- " 'cpu_count': 12,\n",
1303
- " 'cpu_model': '13th Gen Intel(R) Core(TM) i7-1365U',\n",
1304
- " 'gpu_count': None,\n",
1305
- " 'gpu_model': None,\n",
1306
- " 'ram_total_size': 31.347312927246094,\n",
1307
- " 'tracking_mode': 'machine',\n",
1308
- " 'on_cloud': 'N',\n",
1309
- " 'pue': 1.0},\n",
1310
- " 'dataset_config': {'dataset_name': 'QuotaClimat/frugalaichallenge-text-train',\n",
1311
- " 'test_size': 0.2,\n",
1312
- " 'test_seed': 42}}"
1313
- ]
1314
- },
1315
- "execution_count": 10,
1316
- "metadata": {},
1317
- "output_type": "execute_result"
1318
- }
1319
- ],
1320
- "source": [
1321
- "# Prepare results dictionary\n",
1322
- "results = {\n",
1323
- " \"submission_timestamp\": datetime.now().isoformat(),\n",
1324
- " \"accuracy\": float(accuracy),\n",
1325
- " \"energy_consumed_wh\": emissions_data.energy_consumed * 1000,\n",
1326
- " \"emissions_gco2eq\": emissions_data.emissions * 1000,\n",
1327
- " \"emissions_data\": clean_emissions_data(emissions_data),\n",
1328
- " \"dataset_config\": {\n",
1329
- " \"dataset_name\": request.dataset_name,\n",
1330
- " \"test_size\": request.test_size,\n",
1331
- " \"test_seed\": request.test_seed\n",
1332
- " }\n",
1333
- "}\n",
1334
- "\n",
1335
- "results"
1336
- ]
1337
- },
1338
- {
1339
- "cell_type": "markdown",
1340
- "metadata": {},
1341
- "source": [
1342
- "## Development of the model"
1343
- ]
1344
- },
1345
- {
1346
- "cell_type": "code",
1347
- "execution_count": 11,
1348
- "metadata": {},
1349
- "outputs": [
1350
- {
1351
- "data": {
1352
- "application/vnd.jupyter.widget-view+json": {
1353
- "model_id": "90f50ab19698484489f36976745efad3",
1354
- "version_major": 2,
1355
- "version_minor": 0
1356
- },
1357
- "text/plain": [
1358
- "config.json: 0%| | 0.00/1.15k [00:00<?, ?B/s]"
1359
- ]
1360
- },
1361
- "metadata": {},
1362
- "output_type": "display_data"
1363
- },
1364
- {
1365
- "name": "stderr",
1366
- "output_type": "stream",
1367
- "text": [
1368
- "c:\\Users\\theo.alvesdacosta\\AppData\\Local\\anaconda3\\Lib\\site-packages\\huggingface_hub\\file_download.py:139: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\\Users\\theo.alvesdacosta\\.cache\\huggingface\\hub\\models--facebook--bart-large-mnli. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.\n",
1369
- "To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development\n",
1370
- " warnings.warn(message)\n"
1371
- ]
1372
- },
1373
- {
1374
- "data": {
1375
- "application/vnd.jupyter.widget-view+json": {
1376
- "model_id": "6e3974d8ff284603821f7beca9bd353d",
1377
- "version_major": 2,
1378
- "version_minor": 0
1379
- },
1380
- "text/plain": [
1381
- "model.safetensors: 0%| | 0.00/1.63G [00:00<?, ?B/s]"
1382
- ]
1383
- },
1384
- "metadata": {},
1385
- "output_type": "display_data"
1386
- },
1387
- {
1388
- "data": {
1389
- "application/vnd.jupyter.widget-view+json": {
1390
- "model_id": "bc29cb379c644b00b1bdf61d5426d99d",
1391
- "version_major": 2,
1392
- "version_minor": 0
1393
- },
1394
- "text/plain": [
1395
- "tokenizer_config.json: 0%| | 0.00/26.0 [00:00<?, ?B/s]"
1396
- ]
1397
- },
1398
- "metadata": {},
1399
- "output_type": "display_data"
1400
- },
1401
- {
1402
- "data": {
1403
- "application/vnd.jupyter.widget-view+json": {
1404
- "model_id": "635503cf819747c9a83f22aa4f2f11db",
1405
- "version_major": 2,
1406
- "version_minor": 0
1407
- },
1408
- "text/plain": [
1409
- "vocab.json: 0%| | 0.00/899k [00:00<?, ?B/s]"
1410
- ]
1411
- },
1412
- "metadata": {},
1413
- "output_type": "display_data"
1414
- },
1415
- {
1416
- "data": {
1417
- "application/vnd.jupyter.widget-view+json": {
1418
- "model_id": "3a5f53e451e8483ca7c33f42245abd13",
1419
- "version_major": 2,
1420
- "version_minor": 0
1421
- },
1422
- "text/plain": [
1423
- "merges.txt: 0%| | 0.00/456k [00:00<?, ?B/s]"
1424
- ]
1425
- },
1426
- "metadata": {},
1427
- "output_type": "display_data"
1428
- },
1429
- {
1430
- "data": {
1431
- "application/vnd.jupyter.widget-view+json": {
1432
- "model_id": "84f922d1b68a4a0faa5e920d004efca0",
1433
- "version_major": 2,
1434
- "version_minor": 0
1435
- },
1436
- "text/plain": [
1437
- "tokenizer.json: 0%| | 0.00/1.36M [00:00<?, ?B/s]"
1438
- ]
1439
- },
1440
- "metadata": {},
1441
- "output_type": "display_data"
1442
- },
1443
- {
1444
- "name": "stderr",
1445
- "output_type": "stream",
1446
- "text": [
1447
- "Device set to use cpu\n"
1448
- ]
1449
- }
1450
- ],
1451
- "source": [
1452
- "from transformers import pipeline\n",
1453
- "classifier = pipeline(\"zero-shot-classification\",\n",
1454
- " model=\"facebook/bart-large-mnli\")\n"
1455
- ]
1456
- },
1457
- {
1458
- "cell_type": "code",
1459
- "execution_count": 14,
1460
- "metadata": {},
1461
- "outputs": [],
1462
- "source": [
1463
- "sequence_to_classify = \"one day I will see the world\"\n",
1464
- "\n",
1465
- "candidate_labels = [\n",
1466
- " \"Not related to climate change disinformation\",\n",
1467
- " \"Climate change is not real and not happening\",\n",
1468
- " \"Climate change is not human-induced\",\n",
1469
- " \"Climate change impacts are not that bad\",\n",
1470
- " \"Climate change solutions are harmful and unnecessary\",\n",
1471
- " \"Climate change science is unreliable\",\n",
1472
- " \"Climate change proponents are biased\",\n",
1473
- " \"Fossil fuels are needed to address climate change\"\n",
1474
- "]"
1475
- ]
1476
- },
1477
- {
1478
- "cell_type": "code",
1479
- "execution_count": 15,
1480
- "metadata": {},
1481
- "outputs": [
1482
- {
1483
- "data": {
1484
- "text/plain": [
1485
- "{'sequence': 'one day I will see the world',\n",
1486
- " 'labels': ['Fossil fuels are needed to address climate change',\n",
1487
- " 'Climate change science is unreliable',\n",
1488
- " 'Not related to climate change disinformation',\n",
1489
- " 'Climate change proponents are biased',\n",
1490
- " 'Climate change impacts are not that bad',\n",
1491
- " 'Climate change solutions are harmful and unnecessary',\n",
1492
- " 'Climate change is not human-induced',\n",
1493
- " 'Climate change is not real and not happening'],\n",
1494
- " 'scores': [0.16242119669914246,\n",
1495
- " 0.15683825314044952,\n",
1496
- " 0.1564282774925232,\n",
1497
- " 0.14603719115257263,\n",
1498
- " 0.12794046103954315,\n",
1499
- " 0.10180754214525223,\n",
1500
- " 0.0936085507273674,\n",
1501
- " 0.0549185685813427]}"
1502
- ]
1503
- },
1504
- "execution_count": 15,
1505
- "metadata": {},
1506
- "output_type": "execute_result"
1507
- }
1508
- ],
1509
- "source": [
1510
- "classifier(sequence_to_classify, candidate_labels)"
1511
- ]
1512
- },
1513
- {
1514
- "cell_type": "code",
1515
- "execution_count": 26,
1516
- "metadata": {},
1517
- "outputs": [
1518
- {
1519
- "name": "stderr",
1520
- "output_type": "stream",
1521
- "text": [
1522
- "[codecarbon WARNING @ 11:00:07] Already started tracking\n"
1523
- ]
1524
- },
1525
- {
1526
- "data": {
1527
- "application/vnd.jupyter.widget-view+json": {
1528
- "model_id": "5d66a13f76a4411d95b62d4a73012495",
1529
- "version_major": 2,
1530
- "version_minor": 0
1531
- },
1532
- "text/plain": [
1533
- "0it [00:00, ?it/s]"
1534
- ]
1535
- },
1536
- "metadata": {},
1537
- "output_type": "display_data"
1538
- },
1539
- {
1540
- "name": "stderr",
1541
- "output_type": "stream",
1542
- "text": [
1543
- "[codecarbon WARNING @ 11:05:57] Background scheduler didn't run for a long period (349s), results might be inaccurate\n",
1544
- "[codecarbon INFO @ 11:05:57] Energy consumed for RAM : 0.018069 kWh. RAM Power : 11.755242347717285 W\n",
1545
- "[codecarbon INFO @ 11:05:57] Delta energy consumed for CPU with constant : 0.004122 kWh, power : 42.5 W\n",
1546
- "[codecarbon INFO @ 11:05:57] Energy consumed for All CPU : 0.065327 kWh\n",
1547
- "[codecarbon INFO @ 11:05:57] 0.083395 kWh of electricity used since the beginning.\n"
1548
- ]
1549
- },
1550
- {
1551
- "data": {
1552
- "text/plain": [
1553
- "EmissionsData(timestamp='2025-01-22T11:05:57', project_name='codecarbon', run_id='908f2e7e-4bb2-4991-a0f6-56bf8d7eda21', experiment_id='5b0fa12a-3dd7-45bb-9766-cc326314d9f1', duration=349.19709450000664, emissions=0.0002949120266226386, emissions_rate=8.445461750018632e-07, cpu_power=42.5, gpu_power=0.0, ram_power=11.755242347717285, cpu_energy=0.004122396676597424, gpu_energy=0, ram_energy=0.0011402244733631148, energy_consumed=0.005262621149960539, country_name='France', country_iso_code='FRA', region='île-de-france', cloud_provider='', cloud_region='', os='Windows-11-10.0.22631-SP0', python_version='3.12.7', codecarbon_version='3.0.0_rc0', cpu_count=12, cpu_model='13th Gen Intel(R) Core(TM) i7-1365U', gpu_count=None, gpu_model=None, longitude=2.3494, latitude=48.8558, ram_total_size=31.347312927246094, tracking_mode='machine', on_cloud='N', pue=1.0)"
1554
- ]
1555
- },
1556
- "execution_count": 26,
1557
- "metadata": {},
1558
- "output_type": "execute_result"
1559
- }
1560
- ],
1561
- "source": [
1562
- "# Start tracking emissions\n",
1563
- "tracker.start()\n",
1564
- "tracker.start_task(\"inference\")\n",
1565
- "\n",
1566
- "from tqdm.auto import tqdm\n",
1567
- "predictions = []\n",
1568
- "\n",
1569
- "\n",
1570
- "\n",
1571
- "# Option 1: Simple loop approach\n",
1572
- "\n",
1573
- "for i, text in tqdm(enumerate(test_dataset[\"quote\"])):\n",
1574
- "\n",
1575
- " result = classifier(text, candidate_labels)\n",
1576
- "\n",
1577
- " # Get index of highest scoring label\n",
1578
- "\n",
1579
- " pred_label = candidate_labels.index(result[\"labels\"][0])\n",
1580
- "\n",
1581
- " predictions.append(pred_label)\n",
1582
- " if i == 100:\n",
1583
- " break\n",
1584
- "\n",
1585
- "\n",
1586
- "# Stop tracking emissions\n",
1587
- "emissions_data = tracker.stop_task()\n",
1588
- "emissions_data\n"
1589
- ]
1590
- },
1591
- {
1592
- "cell_type": "code",
1593
- "execution_count": 28,
1594
- "metadata": {},
1595
- "outputs": [
1596
- {
1597
- "data": {
1598
- "text/plain": [
1599
- "0.4"
1600
- ]
1601
- },
1602
- "execution_count": 28,
1603
- "metadata": {},
1604
- "output_type": "execute_result"
1605
- }
1606
- ],
1607
- "source": [
1608
- "# Calculate accuracy\n",
1609
- "accuracy = accuracy_score(true_labels[:100], predictions[:100])\n",
1610
- "accuracy"
1611
- ]
1612
- },
1613
- {
1614
- "cell_type": "code",
1615
- "execution_count": null,
1616
- "metadata": {},
1617
- "outputs": [],
1618
- "source": []
1619
- }
1620
- ],
1621
- "metadata": {
1622
- "kernelspec": {
1623
- "display_name": "base",
1624
- "language": "python",
1625
- "name": "python3"
1626
- },
1627
- "language_info": {
1628
- "codemirror_mode": {
1629
- "name": "ipython",
1630
- "version": 3
1631
- },
1632
- "file_extension": ".py",
1633
- "mimetype": "text/x-python",
1634
- "name": "python",
1635
- "nbconvert_exporter": "python",
1636
- "pygments_lexer": "ipython3",
1637
- "version": "3.12.7"
1638
- }
1639
- },
1640
- "nbformat": 4,
1641
- "nbformat_minor": 2
1642
- }
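Review note on the deleted evaluation cell above: the loop breaks only when `i == 100`, so it classifies 101 quotes while accuracy is then computed on the first 100. The cell also calls `tracker.start()` on a tracker that seems to be running already, which appears to be the cause of the "Already started tracking" warning in its output. The following is a minimal sketch of a bounded loop, not the author's code. The helper name `predict_first_n` and the stand-in `fake_classify` are hypothetical; `classify` is assumed to return a dict whose `"labels"` list is sorted by descending score, as in the recorded pipeline output.

```python
from itertools import islice
from typing import Callable, Iterable, List, Sequence


def predict_first_n(
    texts: Iterable[str],
    candidate_labels: Sequence[str],
    classify: Callable[[str, Sequence[str]], dict],
    n: int = 100,
) -> List[int]:
    """Return predicted label indices for at most the first n texts."""
    predictions = []
    # islice stops after exactly n items, avoiding the off-by-one of `if i == 100: break`
    for text in islice(texts, n):
        result = classify(text, candidate_labels)
        # The top-ranked label comes first; map it back to its integer class id
        predictions.append(candidate_labels.index(result["labels"][0]))
    return predictions


def fake_classify(text, labels):
    # Hypothetical stand-in for the zero-shot pipeline: ranks one label first
    top = labels[len(text) % len(labels)]
    return {"sequence": text, "labels": [top] + [l for l in labels if l != top]}


labels = ["a", "b", "c"]
preds = predict_first_n(["x" * k for k in range(250)], labels, fake_classify, n=100)
```

With the real pipeline, `classify` would be the `classifier` object from the cell above. Wrapping the input in `tqdm(..., total=n)` would also give the progress bar a known length.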
tasks/image.py DELETED
@@ -1,176 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- import numpy as np
- from sklearn.metrics import accuracy_score, precision_score, recall_score
- import random
- import os
-
- from .utils.evaluation import ImageEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
-
- from dotenv import load_dotenv
- load_dotenv()
-
- router = APIRouter()
-
- DESCRIPTION = "Random Baseline"
- ROUTE = "/image"
-
- def parse_boxes(annotation_string):
-     """Parse multiple boxes from a single annotation string.
-     Each box has 5 values: class_id, x_center, y_center, width, height"""
-     values = [float(x) for x in annotation_string.strip().split()]
-     boxes = []
-     # Each box has 5 values
-     for i in range(0, len(values), 5):
-         if i + 5 <= len(values):
-             # Skip class_id (first value) and take the next 4 values
-             box = values[i+1:i+5]
-             boxes.append(box)
-     return boxes
-
- def compute_iou(box1, box2):
-     """Compute Intersection over Union (IoU) between two YOLO format boxes."""
-     # Convert YOLO format (x_center, y_center, width, height) to corners
-     def yolo_to_corners(box):
-         x_center, y_center, width, height = box
-         x1 = x_center - width/2
-         y1 = y_center - height/2
-         x2 = x_center + width/2
-         y2 = y_center + height/2
-         return np.array([x1, y1, x2, y2])
-
-     box1_corners = yolo_to_corners(box1)
-     box2_corners = yolo_to_corners(box2)
-
-     # Calculate intersection
-     x1 = max(box1_corners[0], box2_corners[0])
-     y1 = max(box1_corners[1], box2_corners[1])
-     x2 = min(box1_corners[2], box2_corners[2])
-     y2 = min(box1_corners[3], box2_corners[3])
-
-     intersection = max(0, x2 - x1) * max(0, y2 - y1)
-
-     # Calculate union
-     box1_area = (box1_corners[2] - box1_corners[0]) * (box1_corners[3] - box1_corners[1])
-     box2_area = (box2_corners[2] - box2_corners[0]) * (box2_corners[3] - box2_corners[1])
-     union = box1_area + box2_area - intersection
-
-     return intersection / (union + 1e-6)
-
- def compute_max_iou(true_boxes, pred_box):
-     """Compute maximum IoU between a predicted box and all true boxes"""
-     max_iou = 0
-     for true_box in true_boxes:
-         iou = compute_iou(true_box, pred_box)
-         max_iou = max(max_iou, iou)
-     return max_iou
-
- @router.post(ROUTE, tags=["Image Task"],
-              description=DESCRIPTION)
- async def evaluate_image(request: ImageEvaluationRequest):
-     """
-     Evaluate image classification and object detection for forest fire smoke.
-
-     Current Model: Random Baseline
-     - Makes random predictions for both classification and bounding boxes
-     - Used as a baseline for comparison
-
-     Metrics:
-     - Classification accuracy: Whether an image contains smoke or not
-     - Object Detection accuracy: IoU (Intersection over Union) for smoke bounding boxes
-     """
-     # Get space info
-     username, space_url = get_space_info()
-
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name, token=os.getenv("HF_TOKEN"))
-
-     # Split dataset
-     train_test = dataset["train"].train_test_split(test_size=request.test_size, seed=request.test_seed)
-     test_dataset = train_test["test"]
-
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
-
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline with your model inference
-     #--------------------------------------------------------------------------------------------
-
-     predictions = []
-     true_labels = []
-     pred_boxes = []
-     true_boxes_list = []  # List of lists, each inner list contains boxes for one image
-
-     for example in test_dataset:
-         # Parse true annotation (YOLO format: class_id x_center y_center width height)
-         annotation = example.get("annotations", "").strip()
-         has_smoke = len(annotation) > 0
-         true_labels.append(int(has_smoke))
-
-         # Make random classification prediction
-         pred_has_smoke = random.random() > 0.5
-         predictions.append(int(pred_has_smoke))
-
-         # If there's a true box, parse it and make random box prediction
-         if has_smoke:
-             # Parse all true boxes from the annotation
-             image_true_boxes = parse_boxes(annotation)
-             true_boxes_list.append(image_true_boxes)
-
-             # For baseline, make one random box prediction per image
-             # In a real model, you might want to predict multiple boxes
-             random_box = [
-                 random.random(),  # x_center
-                 random.random(),  # y_center
-                 random.random() * 0.5,  # width (max 0.5)
-                 random.random() * 0.5  # height (max 0.5)
-             ]
-             pred_boxes.append(random_box)
-
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
-
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
-
-     # Calculate classification metrics
-     classification_accuracy = accuracy_score(true_labels, predictions)
-     classification_precision = precision_score(true_labels, predictions)
-     classification_recall = recall_score(true_labels, predictions)
-
-     # Calculate mean IoU for object detection (only for images with smoke)
-     # For each image, we compute the max IoU between the predicted box and all true boxes
-     ious = []
-     for true_boxes, pred_box in zip(true_boxes_list, pred_boxes):
150
- max_iou = compute_max_iou(true_boxes, pred_box)
151
- ious.append(max_iou)
152
-
153
- mean_iou = float(np.mean(ious)) if ious else 0.0
154
-
155
- # Prepare results dictionary
156
- results = {
157
- "username": username,
158
- "space_url": space_url,
159
- "submission_timestamp": datetime.now().isoformat(),
160
- "model_description": DESCRIPTION,
161
- "classification_accuracy": float(classification_accuracy),
162
- "classification_precision": float(classification_precision),
163
- "classification_recall": float(classification_recall),
164
- "mean_iou": mean_iou,
165
- "energy_consumed_wh": emissions_data.energy_consumed * 1000,
166
- "emissions_gco2eq": emissions_data.emissions * 1000,
167
- "emissions_data": clean_emissions_data(emissions_data),
168
- "api_route": ROUTE,
169
- "dataset_config": {
170
- "dataset_name": request.dataset_name,
171
- "test_size": request.test_size,
172
- "test_seed": request.test_seed
173
- }
174
- }
175
-
176
- return results
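
The removed image task scored box predictions by taking, for each image, the maximum IoU between the single predicted box and every true box. A minimal standalone sketch of that metric follows, assuming boxes are normalized YOLO tuples `(x_center, y_center, width, height)` as the annotation comment in the deleted file states. The names `yolo_to_corners`, `iou`, and `max_iou` are hypothetical and are not the helpers from the deleted file.

```python
def yolo_to_corners(box):
    """Convert (x_center, y_center, width, height) to (x1, y1, x2, y2)."""
    x_center, y_center, width, height = box
    return (
        x_center - width / 2,
        y_center - height / 2,
        x_center + width / 2,
        y_center + height / 2,
    )


def iou(box_a, box_b, eps=1e-6):
    """Intersection over Union of two YOLO-format boxes."""
    ax1, ay1, ax2, ay2 = yolo_to_corners(box_a)
    bx1, by1, bx2, by2 = yolo_to_corners(box_b)

    # Overlap rectangle; clamp to zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # eps guards against division by zero for degenerate boxes.
    return intersection / (union + eps)


def max_iou(true_boxes, pred_box):
    """Best IoU between one predicted box and all true boxes (0.0 if none)."""
    return max((iou(t, pred_box) for t in true_boxes), default=0.0)
```

Because of the `eps` term, two identical boxes score slightly below 1.0 rather than exactly 1.0, which matches the `1e-6` guard in the deleted code.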
 
tasks/text.py DELETED
@@ -1,92 +0,0 @@
- from fastapi import APIRouter
- from datetime import datetime
- from datasets import load_dataset
- from sklearn.metrics import accuracy_score
- import random
-
- from .utils.evaluation import TextEvaluationRequest
- from .utils.emissions import tracker, clean_emissions_data, get_space_info
-
- router = APIRouter()
-
- DESCRIPTION = "Random Baseline"
- ROUTE = "/text"
-
- @router.post(ROUTE, tags=["Text Task"],
-              description=DESCRIPTION)
- async def evaluate_text(request: TextEvaluationRequest):
-     """
-     Evaluate text classification for climate disinformation detection.
-
-     Current Model: Random Baseline
-     - Makes random predictions from the label space (0-7)
-     - Used as a baseline for comparison
-     """
-     # Get space info
-     username, space_url = get_space_info()
-
-     # Define the label mapping
-     LABEL_MAPPING = {
-         "0_not_relevant": 0,
-         "1_not_happening": 1,
-         "2_not_human": 2,
-         "3_not_bad": 3,
-         "4_solutions_harmful_unnecessary": 4,
-         "5_science_unreliable": 5,
-         "6_proponents_biased": 6,
-         "7_fossil_fuels_needed": 7
-     }
-
-     # Load and prepare the dataset
-     dataset = load_dataset(request.dataset_name)
-
-     # Convert string labels to integers
-     dataset = dataset.map(lambda x: {"label": LABEL_MAPPING[x["label"]]})
-
-     # Split dataset
-     train_test = dataset["train"].train_test_split(test_size=request.test_size, seed=request.test_seed)
-     test_dataset = train_test["test"]
-
-     # Start tracking emissions
-     tracker.start()
-     tracker.start_task("inference")
-
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE CODE HERE
-     # Update the code below to replace the random baseline by your model inference within the inference pass where the energy consumption and emissions are tracked.
-     #--------------------------------------------------------------------------------------------
-
-     # Make random predictions (placeholder for actual model inference)
-     true_labels = test_dataset["label"]
-     predictions = [random.randint(0, 7) for _ in range(len(true_labels))]
-
-     #--------------------------------------------------------------------------------------------
-     # YOUR MODEL INFERENCE STOPS HERE
-     #--------------------------------------------------------------------------------------------
-
-
-     # Stop tracking emissions
-     emissions_data = tracker.stop_task()
-
-     # Calculate accuracy
-     accuracy = accuracy_score(true_labels, predictions)
-
-     # Prepare results dictionary
-     results = {
-         "username": username,
-         "space_url": space_url,
-         "submission_timestamp": datetime.now().isoformat(),
-         "model_description": DESCRIPTION,
-         "accuracy": float(accuracy),
-         "energy_consumed_wh": emissions_data.energy_consumed * 1000,
-         "emissions_gco2eq": emissions_data.emissions * 1000,
-         "emissions_data": clean_emissions_data(emissions_data),
-         "api_route": ROUTE,
-         "dataset_config": {
-             "dataset_name": request.dataset_name,
-             "test_size": request.test_size,
-             "test_seed": request.test_seed
-         }
-     }
-
-     return results
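
The removed text task drew each prediction uniformly from the eight labels, which is why the README quotes roughly 12.5% accuracy. The sketch below illustrates that floor without any dataset or scikit-learn dependency. The function name `random_baseline_accuracy` and its `seed` parameter are hypothetical additions for reproducibility and are not part of the deleted file.

```python
import random


def random_baseline_accuracy(true_labels, num_classes=8, seed=0):
    """Accuracy of uniform random guessing over labels 0..num_classes-1."""
    rng = random.Random(seed)  # local generator so the global state is untouched
    predictions = [rng.randint(0, num_classes - 1) for _ in true_labels]
    correct = sum(int(p == t) for p, t in zip(predictions, true_labels))
    return correct / len(true_labels)


# Synthetic labels standing in for the test split (an assumption for illustration).
labels = [i % 8 for i in range(20000)]
accuracy = random_baseline_accuracy(labels)
print(f"random baseline accuracy: {accuracy:.3f}")
```

With enough examples the printed value should land close to 1/8, whatever the true label distribution, because each guess is independent of the label.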