JP-SystemsX committed
Commit 318c91b · 1 Parent(s): 163dff6
Updated README
README.md CHANGED
@@ -1,13 +1,114 @@
 ---
-title:
 emoji: 👁
-colorFrom:
 colorTo: red
 sdk: gradio
 sdk_version: 3.9.1
 app_file: app.py
 pinned: false
 license: mit
 ---
 
-
 ---
+title: nDCG
 emoji: 👁
+colorFrom: orange
 colorTo: red
 sdk: gradio
 sdk_version: 3.9.1
 app_file: app.py
 pinned: false
 license: mit
+tags:
+- evaluate
+- metric
+-
+description: >-
+  The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
+  It is used to evaluate information retrieval systems under the following two assumptions:
+  1. Highly relevant documents/labels are more useful when they appear earlier in the results.
+  2. Documents/labels are relevant to different degrees.
+
+  DCG is defined as the sum of the relevance scores of the retrieved documents, each discounted logarithmically according to
+  the position at which it was retrieved.
+  The Normalized DCG (nDCG) divides this value by the best achievable value, yielding a score between
+  0 and 1 such that a perfect ranking achieves an nDCG of 1.
 ---
 
+# Metric Card for nDCG
+
+## Metric Description
+The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
+It is used to evaluate information retrieval systems under two assumptions:
+1. Highly relevant documents/labels are more useful when they appear earlier in the results.
+2. Documents/labels are relevant to different degrees.
+
+DCG is defined as the sum of the relevance scores of the retrieved documents, each discounted logarithmically according to
+the position at which it was retrieved.
+The Normalized DCG (nDCG) divides this value by the best achievable value, yielding a score between
+0 and 1 such that a perfect ranking achieves an nDCG of 1.0.
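The definition above can be made concrete with a minimal pure-Python sketch. It assumes a linear gain (the raw relevance value) and a `log2(rank + 1)` discount, and it ignores ties; under those assumptions it reproduces the scores listed in Examples 1 and 2 of this card. The helper names `dcg` and `ndcg` are hypothetical and are not part of the metric's API.

```python
import math

def dcg(relevances, k=None):
    """Sum of relevances, each divided by log2(rank + 1), over the top-k ranks."""
    top = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(top, start=1))

def ndcg(references, predictions, k=None):
    """nDCG for a single sample, assuming no tied prediction scores."""
    # Order the true relevances by descending predicted score.
    order = sorted(range(len(predictions)), key=lambda i: predictions[i], reverse=True)
    ranked = [references[i] for i in order]
    # The ideal ranking sorts the true relevances themselves in descending order.
    ideal = sorted(references, reverse=True)
    best = dcg(ideal, k)
    return dcg(ranked, k) / best if best > 0 else 0.0

score_all = ndcg([10, 0, 0, 1, 5], [.1, .2, .3, 4, 70])
score_top3 = ndcg([10, 0, 0, 1, 5], [.1, .2, .3, 4, 70], k=3)
print(round(score_all, 4), round(score_top3, 4))  # 0.6957 0.4124
```

The most relevant document (relevance 10) receives the lowest predicted score, so it lands at rank 5 and its gain is divided by `log2(6)`, which is why the score stays well below 1.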
+
+## How to Use
+
+At minimum, this metric takes two inputs, predictions and references, each a `list` of `list`s of `float`s.
+
+```python
+import evaluate
+nDCG_metric = evaluate.load('JP-SystemsX/nDCG')
+results = nDCG_metric.compute(references=[[0, 1]], predictions=[[0, 1]])
+print(results)
+# Output: {'nDCG@2': 1.0}
+```
+
+### Inputs:
+**references** (`list` of `list` of `float`): True relevance scores.
+
+**predictions** (`list` of `list` of `float`): Predicted relevance, probability estimates, or confidence values.
+
+**k** (`int`): If set, only the k highest-scored items in the ranking are considered; otherwise all outputs are considered.
+Defaults to None.
+
+**sample_weight** (`list` of `float`): Sample weights. Defaults to None.
+
+**ignore_ties** (`bool`): If set to True, assumes that there are no ties in the predictions (likely if predictions are continuous),
+which allows a more efficient computation. Defaults to False.
+
+### Output:
+**normalized_discounted_cumulative_gain** (`float`): The nDCG score averaged over all samples.
+The minimum possible value is 0.0 and the maximum possible value is 1.0.
+
+Output Example(s):
+```python
+{'nDCG@5': 1.0}
+```
+This metric outputs a dictionary containing the nDCG score.
+
+
+### Examples:
+Example 1: A simple example.
+>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
+>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]])
+>>> print(results)
+{'nDCG': 0.6956940443813076}
+Example 2: The same as Example 1, except with k set to 3.
+>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
+>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]], k=3)
+>>> print(results)
+{'nDCG@3': 0.4123818817534531}
+Example 3: There is only one relevant label, but two predictions are tied, so the model cannot decide which of them is the relevant one.
+>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
+>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1)
+>>> print(results)
+{'nDCG@1': 0.5}
+>>> # That is, it computes the score for both tied candidates and returns their average.
+Example 4: The same as Example 3, except ignore_ties is set to True.
+>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
+>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1, ignore_ties=True)
+>>> print(results)
+{'nDCG@1': 0.0}
+>>> # Alternative result: {'nDCG@1': 1.0}
+>>> # That is, it picks one of the two tied candidates and computes the score for that one only,
+>>> # so the score may vary depending on which candidate was chosen.
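The tie behaviour in Examples 3 and 4 can be illustrated with a brute-force sketch: enumerate every ordering that is consistent with the predicted scores and average nDCG@k over them. This is only an illustration under the assumption of a linear gain and a `log2(rank + 1)` discount; it is not how the metric is implemented, and the helper names are hypothetical.

```python
import math
from itertools import permutations

def dcg_at_k(ranked_relevances, k):
    """DCG over the first k ranks with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(ranked_relevances[:k], start=1))

def ndcg_tie_averaged(references, predictions, k):
    """Average nDCG@k over every ordering consistent with the predicted scores.

    Assumes at least one reference relevance is positive, so the ideal DCG is non-zero.
    """
    n = len(predictions)
    ideal = dcg_at_k(sorted(references, reverse=True), k)
    scores = []
    for order in permutations(range(n)):
        # Keep only orderings sorted by non-increasing predicted score.
        if all(predictions[order[i]] >= predictions[order[i + 1]] for i in range(n - 1)):
            scores.append(dcg_at_k([references[i] for i in order], k) / ideal)
    return sum(scores) / len(scores)

tie_score = ndcg_tie_averaged([1, 0, 0, 0, 0], [1, 1, 0, 0, 0], k=1)
print(tie_score)  # 0.5
```

Half of the consistent orderings place the relevant item first (score 1.0) and half place the tied irrelevant item first (score 0.0), which gives the 0.5 of Example 3. With `ignore_ties=True`, only one of those orderings is evaluated, hence the 0.0 or 1.0 of Example 4.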
+
+## Citation(s)
+```bibtex
+@article{scikit-learn,
+  title={Scikit-learn: Machine Learning in {P}ython},
+  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
+          and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
+          and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
+          Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
+  journal={Journal of Machine Learning Research},
+  volume={12},
+  pages={2825--2830},
+  year={2011}
+}
nDCG.py
CHANGED
@@ -46,7 +46,7 @@ Args:
 
 sample_weight (`list` of `float`): Sample weights Defaults to None.
 
-ignore_ties ('boolean'): If set to true
+ignore_ties ('boolean'): If set to true assumes that there are no ties (this is likely if predictions are continuous)
 for efficiency gains. Defaults to False.
 
 Returns: