---
title: nDCG
emoji: 👁
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
pinned: false
license: mit
tags:
- evaluate
- metric
- ranking
description: >-
  The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
  It is used to evaluate Information Retrieval Systems under the following two assumptions:
  1. Highly relevant documents/labels are more useful when appearing earlier in the results
  2. Documents/labels are relevant to different degrees
  It is defined as the sum of the relevances of the retrieved documents, discounted logarithmically
  in proportion to the position at which they were retrieved.
  The Normalized DCG (nDCG) divides this value by the best possible value, yielding a score between
  0 and 1 such that a perfect retrieval achieves an nDCG of 1.
---
# Metric Card for nDCG
## Metric Description
The Discounted Cumulative Gain (DCG) is a measure of ranking quality.
It is used to evaluate Information Retrieval Systems under two assumptions:

1. Highly relevant documents/labels are more useful when appearing earlier in the results
2. Documents/labels are relevant to different degrees

It is defined as the sum of the relevances of the retrieved documents, discounted logarithmically in proportion to
the position at which they were retrieved.
The Normalized DCG (nDCG) divides this value by the best value that can be achieved, yielding a score between
0 and 1 such that a perfect retrieval achieves an nDCG of 1.0.
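
Concretely, writing $rel_i$ for the true relevance of the item ranked at position $i$, the standard log2-discounted form (the one used by scikit-learn's `ndcg_score`, which this card cites) is:

```latex
\mathrm{DCG}@k = \sum_{i=1}^{k} \frac{rel_i}{\log_2(i + 1)},
\qquad
\mathrm{nDCG}@k = \frac{\mathrm{DCG}@k}{\mathrm{IDCG}@k}
```

where $\mathrm{IDCG}@k$ is the $\mathrm{DCG}@k$ of the ideal ranking, i.e. the items sorted by true relevance, so a perfect ranking scores 1.0.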
## How to Use
At minimum, this metric takes as input two `list`s of `list`s, each containing `float`s: predictions and references.

```python
import evaluate
nDCG_metric = evaluate.load('JP-SystemsX/nDCG')
results = nDCG_metric.compute(references=[[0, 1]], predictions=[[0, 1]])
print(results)
{'nDCG': 1.0}
```
### Inputs:
- **references** (`list` of `list` of `float`): True relevance scores.
- **predictions** (`list` of `list` of `float`): Predicted relevance scores, probability estimates, or confidence values.
- **k** (`int`): If set, only the `k` highest scores in the ranking are considered; otherwise all outputs are considered. Defaults to `None`.
- **sample_weight** (`list` of `float`): Sample weights. Defaults to `None`.
- **ignore_ties** (`boolean`): If set to `True`, assumes that there are no ties in the predictions (which is likely if the predictions are continuous) for efficiency gains. Defaults to `False`.
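
As a minimal sketch of how these parameters combine (the input values below are illustrative, not taken from the card):

```python
import evaluate

nDCG_metric = evaluate.load("JP-SystemsX/nDCG")

# Illustrative values: two samples, truncated at rank 2, with the first
# sample weighted twice as heavily as the second.
results = nDCG_metric.compute(
    references=[[3, 2, 0], [1, 0, 2]],
    predictions=[[0.9, 0.4, 0.1], [0.2, 0.1, 0.7]],
    k=2,
    sample_weight=[2.0, 1.0],
    ignore_ties=True,  # safe here: all predicted scores are distinct
)
print(results)  # e.g. {'nDCG@2': ...}
```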
### Output:
**normalized_discounted_cumulative_gain** (`float`): The nDCG score averaged over all samples.
The minimum possible value is 0.0 and the maximum possible value is 1.0.

Output Example(s):
```python
{'nDCG@5': 1.0}
{'nDCG': 0.876}
```

This metric outputs a dictionary containing the nDCG score. The key is `'nDCG'` when `k` is not set and `'nDCG@k'` when it is.
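
For instance, to pull the float out of the returned dictionary (reusing the inputs from Example 2 below):

```python
results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]], k=3)
score = results["nDCG@3"]  # the key carries the k that was used
```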
### Examples:
Example 1: A simple example

>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]])
>>> print(results)
{'nDCG': 0.6956940443813076}

Example 2: The same as Example 1, except with `k` set to 3.

>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[10, 0, 0, 1, 5]], predictions=[[.1, .2, .3, 4, 70]], k=3)
>>> print(results)
{'nDCG@3': 0.4123818817534531}

Example 3: There is only one relevant label, but there is a tie and the model cannot decide which of the two candidates it is.

>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1)
>>> print(results)
{'nDCG@1': 0.5}
>>> # That is, it calculates the score for both tied candidates and returns their average

Example 4: The same as Example 3, except `ignore_ties` is set to `True`.

>>> nDCG_metric = evaluate.load("JP-SystemsX/nDCG")
>>> results = nDCG_metric.compute(references=[[1, 0, 0, 0, 0]], predictions=[[1, 1, 0, 0, 0]], k=1, ignore_ties=True)
>>> print(results)
{'nDCG@1': 0.0}
>>> # Alternative result: {'nDCG@1': 1.0}
>>> # That is, it picks one of the 2 tied candidates and calculates the score for that one only,
>>> # so the score may vary depending on which one was chosen
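
Since this card cites scikit-learn, the values above should be reproducible directly with `sklearn.metrics.ndcg_score`; a sketch under that assumption:

```python
# Cross-check against scikit-learn's ndcg_score, which this metric is
# assumed (per the citation below) to wrap.
import numpy as np
from sklearn.metrics import ndcg_score

references = np.asarray([[10, 0, 0, 1, 5]])
predictions = np.asarray([[.1, .2, .3, 4, 70]])

print(ndcg_score(references, predictions))       # ~0.6957, matches Example 1
print(ndcg_score(references, predictions, k=3))  # ~0.4124, matches Example 2
```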
## Citation(s)
```bibtex
@article{scikit-learn,
  title={Scikit-learn: Machine Learning in {P}ython},
  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
          and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
          and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
          Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
  journal={Journal of Machine Learning Research},
  volume={12},
  pages={2825--2830},
  year={2011}
}
```