metadata
title: Wilcoxon
emoji: 🤗
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 3.0.2
app_file: app.py
pinned: false
tags:
  - evaluate
  - comparison
description: Wilcoxon's test is a signed-rank test for comparing paired samples.

Comparison Card for Wilcoxon

Comparison description

Wilcoxon's test is a non-parametric signed-rank test of whether the distribution of the paired differences is symmetric about zero. It can be used to compare the predictions of two models on the same set of examples.
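
To make the signed-rank statistic concrete, here is a minimal pure-Python sketch. It is not the implementation behind this comparison. It assumes there are no zero or tied differences, and the function names `signed_rank_sums` and `exact_two_sided_p` are made up for illustration. It ranks the absolute differences, sums the ranks of the positive and negative differences separately, and obtains an exact two-sided p-value by enumerating every possible sign assignment, which is only practical for small samples.

```python
from itertools import product


def signed_rank_sums(x, y):
    """Return (w_plus, w_minus) for paired samples.

    Assumes no zero differences and no tied absolute differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    # Rank the absolute differences from smallest (rank 1) to largest.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


def exact_two_sided_p(stat, n):
    """Exact two-sided p-value under H0, by enumerating all 2**n sign patterns."""
    count = 0
    for signs in product([0, 1], repeat=n):
        # Sum of the ranks that receive a positive sign in this pattern.
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if w <= stat:
            count += 1
    return min(1.0, 2 * count / 2 ** n)


# Five illustrative pairs of predictions.
x = [-7, 123.45, 43, 4.91, 5]
y = [1337.12, -9.74, 1, 2, 3.21]

w_plus, w_minus = signed_rank_sums(x, y)
stat = min(w_plus, w_minus)  # the smaller of the two rank sums
p = exact_two_sided_p(stat, len(x))
print(stat, p)  # 5 0.625
```

With only five pairs, even a fairly lopsided split of the ranks (10 versus 5) is far from significant, because just 32 sign patterns are possible.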

How to use

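The Wilcoxon comparison is used to analyze paired ordinal data: the two inputs must have the same length, and the element at each position in predictions1 is paired with the element at the same position in predictions2.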

Inputs

Its arguments are:

predictions1: a list of predictions from the first model.

predictions2: a list of predictions from the second model.

Output values

The Wilcoxon comparison returns a dictionary with two entries:

stat: The Wilcoxon signed-rank statistic.

p: The p-value of the test.

Examples

Example comparison:

import evaluate

wilcoxon = evaluate.load("wilcoxon")
results = wilcoxon.compute(predictions1=[-7, 123.45, 43, 4.91, 5], predictions2=[1337.12, -9.74, 1, 2, 3.21])
print(results)
# Output: {'stat': 5.0, 'p': 0.625}
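
One common way to read this output is to compare p with a significance level chosen in advance. The sketch below illustrates that step. The helper name `interpret` and the default threshold of 0.05 are assumptions for the example, not part of the comparison's API.

```python
def interpret(results, alpha=0.05):
    """Summarize a {'stat': ..., 'p': ...} dict at significance level alpha."""
    if results["p"] < alpha:
        return "reject H0: the paired differences do not look symmetric about zero"
    return "fail to reject H0: no evidence of a systematic difference"


# The dictionary printed by the example comparison.
results = {"stat": 5.0, "p": 0.625}
print(interpret(results))
```

Failing to reject the null hypothesis does not show that the two models are equivalent; with very few pairs the test has little power to detect a difference.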

Limitations and bias

The Wilcoxon test is non-parametric, so it makes relatively few assumptions: chiefly that the pairs are independent of one another, with no requirement that the data be normally distributed. It should be used to analyze paired ordinal data only.
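
Because the test only makes sense for properly paired inputs, it can help to sanity-check the two lists before running the comparison. The following is a minimal sketch of such a check. The helper `check_pairs` is hypothetical and not part of this comparison. It also counts zero and tied differences, since these complicate the ranking step.

```python
def check_pairs(predictions1, predictions2):
    """Validate pairing and report zero and tied differences.

    Hypothetical helper for illustration; uses exact equality,
    so floating-point inputs may need rounding first.
    """
    if len(predictions1) != len(predictions2):
        raise ValueError("predictions1 and predictions2 must have the same length")
    diffs = [p1 - p2 for p1, p2 in zip(predictions1, predictions2)]
    n_zero = sum(1 for d in diffs if d == 0)
    abs_nonzero = [abs(d) for d in diffs if d != 0]
    # Number of nonzero absolute differences that repeat an earlier value.
    n_tied = len(abs_nonzero) - len(set(abs_nonzero))
    return {"n_pairs": len(diffs), "n_zero": n_zero, "n_tied": n_tied}


report = check_pairs([1, 2, 3, 5], [1, 1, 1, 3])
print(report)  # {'n_pairs': 4, 'n_zero': 1, 'n_tied': 1}
```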

Citations

@incollection{wilcoxon1992individual,
  title={Individual comparisons by ranking methods},
  author={Wilcoxon, Frank},
  booktitle={Breakthroughs in statistics},
  pages={196--202},
  year={1992},
  publisher={Springer}
}