JP-SystemsX committed on
Commit
318c91b
·
1 Parent(s): 163dff6

Updated README

Files changed (2)
  1. README.md +104 -3
  2. nDCG.py +1 -1
README.md CHANGED
@@ -1,13 +1,114 @@
1
  ---
2
- title: NDCG
3
  emoji: 👁
4
- colorFrom: purple
5
  colorTo: red
6
  sdk: gradio
7
  sdk_version: 3.9.1
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: nDCG
3
  emoji: 👁
4
+ colorFrom: orange
5
  colorTo: red
6
  sdk: gradio
7
  sdk_version: 3.9.1
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
+ tags:
12
+ - evaluate
13
+ - metric
14
+ -
15
+ description: >-
16
+ Discounted Cumulative Gain (DCG) is a measure of ranking quality.
17
+ It is used to evaluate information retrieval systems under the following two assumptions:
18
+ 1. Highly relevant documents/labels are more useful when they appear earlier in the results
19
+ 2. Documents/labels are relevant to different degrees
20
+
21
+ It is defined as the sum of the relevances of the retrieved documents, each discounted logarithmically
22
+ according to the position at which it was retrieved.
23
+ Normalized DCG (nDCG) divides this value by the best achievable value, yielding a score between
24
+ 0 and 1, such that a perfect retrieval achieves an nDCG of 1.
25
  ---
26
 
27
+ # Metric Card for nDCG
28
+
29
+ ## Metric Description
30
+ Discounted Cumulative Gain (DCG) is a measure of ranking quality.
31
+ It is used to evaluate information retrieval systems under the following two assumptions:
32
+ 1. Highly relevant documents/labels are more useful when they appear earlier in the results
33
+ 2. Documents/labels are relevant to different degrees
34
+
35
+ It is defined as the sum of the relevances of the retrieved documents, each discounted logarithmically
36
+ according to the position at which it was retrieved.
37
+ Normalized DCG (nDCG) divides this value by the best achievable value, yielding a score between
38
+ 0 and 1, such that a perfect retrieval achieves an nDCG of 1.0.
39
+
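The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind this metric: it assumes a linear gain (the relevance value itself) and a base-2 logarithmic discount `1 / log2(rank + 1)`, it does not treat ties specially, and the helper names `dcg` and `ndcg` are hypothetical. Under those assumptions it reproduces the scores reported for the first two examples in the Examples section of this card.

```python
import math

def dcg(relevances, k=None):
    """Sum of relevances, each divided by log2(rank + 1), over the top k ranks."""
    top = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(top, start=1))

def ndcg(references, predictions, k=None):
    """nDCG for a single sample; ties in `predictions` are not treated specially."""
    # Rank the true relevances by descending predicted score.
    order = sorted(range(len(predictions)), key=lambda i: predictions[i], reverse=True)
    ranked = [references[i] for i in order]
    # The ideal ranking sorts the true relevances themselves.
    ideal = dcg(sorted(references, reverse=True), k)
    return dcg(ranked, k) / ideal if ideal > 0 else 0.0

print(ndcg([10, 0, 0, 1, 5], [.1, .2, .3, 4, 70]))       # approx. 0.6957
print(ndcg([10, 0, 0, 1, 5], [.1, .2, .3, 4, 70], k=3))  # approx. 0.4124
```

With `k=3`, only the three highest-ranked entries contribute to both the achieved and the ideal DCG, which is why the score drops: the most relevant document (relevance 10) was ranked last and falls outside the cutoff.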
40
+ ## How to Use
41
+
42
+ At minimum, this metric takes as input two `list`s of `list`s, each containing `float`s: predictions and references.
43
+
44
+ ```python
45
+ import evaluate
46
+ nDCG_metric = evaluate.load('JP-SystemsX/nDCG')
47
+ results = nDCG_metric.compute(references=[[0, 1]], predictions=[[0, 1]])
48
+ print(results)
49
+ # {'nDCG@2': 1.0}
50
+ ```
51
+
52
+ ### Inputs:
53
+ **references** (`list` of `list` of `float`): True relevance scores, one inner list per sample
54
+
55
+ **predictions** (`list` of `list` of `float`): Predicted relevance, probability estimates, or confidence values, one inner list per sample
56
+
57
+ **k** (`int`): If set, only the `k` highest-ranked entries are considered; otherwise all entries are considered.
58
+ Defaults to None.
59
+
60
+ **sample_weight** (`list` of `float`): Sample weights. Defaults to None.
61
+
62
+ **ignore_ties** (`bool`): If set to True, assumes that there are no ties in the predictions (which is likely when they are continuous),
63
+ which allows a faster computation. Defaults to False.
64
+
65
+ ### Output:
66
+ **normalized_discounted_cumulative_gain** (`float`): The nDCG score averaged over all samples.
67
+ The minimum possible value is 0.0; the maximum possible value is 1.0.
68
+
69
+ Output Example(s):
70
+ ```python
71
+ {'nDCG@5': 1.0}
72
+ ```
73
+ This metric outputs a dictionary containing the nDCG score.
74
+
75
+
76
+ ### Examples:
77
+ Example 1-A simple example
78
+ >>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
79
+ >>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]])
80
+ >>> print(results)
81
+ {'nDCG': 0.6956940443813076}
82
+ Example 2-The same as Example 1, except with k set to 3.
83
+ >>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
84
+ >>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]], k=3)
85
+ >>> print(results)
86
+ {'nDCG@3': 0.4123818817534531}
87
+ Example 3-There is only one relevant label, but the predictions contain a tie, so the model cannot tell which of the two tied candidates is the relevant one.
88
+ >>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
89
+ >>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1)
90
+ >>> print(results)
91
+ {'nDCG@1': 0.5}
92
+ >>> # That is, it computes the score for both tied candidates and returns their average
93
+ Example 4-The same as Example 3, except with ignore_ties set to True.
94
+ >>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
95
+ >>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1, ignore_ties=True)
96
+ >>> print(results)
97
+ {'nDCG@1': 0.0}
98
+ >>> # Alternative result: {'nDCG@1': 1.0}
99
+ >>> # That is, it picks one of the two tied candidates and computes the score for that one only,
100
+ >>> # so the score may vary depending on which candidate was picked
101
+
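The tie behaviour in Examples 3 and 4 can be illustrated with a brute-force sketch: enumerate every ranking that is consistent with the predicted scores (all permutations within each group of tied items) and average the DCG over them. This is only an illustration, assuming a linear gain and a base-2 logarithmic discount; the helper names are hypothetical, and it is not how the metric itself computes the average, since enumeration grows factorially with the size of the tied groups.

```python
import itertools
import math

def dcg(relevances, k=None):
    """Sum of relevances, each divided by log2(rank + 1), over the top k ranks."""
    top = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(top, start=1))

def tie_averaged_ndcg(references, predictions, k=None):
    """Average nDCG over all rankings consistent with the predicted scores."""
    # Group item indices by predicted score, highest score first.
    groups = {}
    for idx, score in enumerate(predictions):
        groups.setdefault(score, []).append(idx)
    ordered_groups = [groups[score] for score in sorted(groups, reverse=True)]

    # Permute items within each tied group and average the resulting DCG values.
    total, count = 0.0, 0
    per_group_orders = (itertools.permutations(group) for group in ordered_groups)
    for combo in itertools.product(*per_group_orders):
        ranking = [idx for group in combo for idx in group]
        total += dcg([references[idx] for idx in ranking], k)
        count += 1

    ideal = dcg(sorted(references, reverse=True), k)
    return (total / count) / ideal if ideal > 0 else 0.0

# Two candidates tie for first place and only one of them is relevant.
print(tie_averaged_ndcg([1, 0, 0, 0, 0], [1, 1, 0, 0, 0], k=1))  # 0.5
```

Ignoring ties corresponds to scoring just one of these rankings, which for this input gives either 0.0 or 1.0 depending on which tied candidate ends up first.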
102
+ ## Citation(s)
103
+ ```bibtex
104
+ @article{scikit-learn,
105
+ title={Scikit-learn: Machine Learning in {P}ython},
106
+ author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
107
+ and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
108
+ and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
109
+ Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
110
+ journal={Journal of Machine Learning Research},
111
+ volume={12},
112
+ pages={2825--2830},
113
+ year={2011}
114
+ }
nDCG.py CHANGED
@@ -46,7 +46,7 @@ Args:
46
 
47
  sample_weight (`list` of `float`): Sample weights Defaults to None.
48
 
49
- ignore_ties ('boolean'): If set to true asumes that there are no ties (this is likely if predictions are continuous)
50
  for efficiency gains. Defaults to False.
51
 
52
  Returns:
 
46
 
47
  sample_weight (`list` of `float`): Sample weights Defaults to None.
48
 
49
+ ignore_ties ('boolean'): If set to true assumes that there are no ties (this is likely if predictions are continuous)
50
  for efficiency gains. Defaults to False.
51
 
52
  Returns: