Commit 4447566
1 Parent(s): 58bf923
tidy
app.py
CHANGED
@@ -177,11 +177,11 @@ if __name__ == "__main__":
 )
 gr.Markdown(
 """
-One of the major claims of the <a href="https://arxiv.org/abs/2311.00430"> Distil-Whisper paper
+One of the major claims of the <a href="https://arxiv.org/abs/2311.00430"> Distil-Whisper paper</a> is that
 that Distil-Whisper hallucinates less than Whisper on long-form audio. To demonstrate this, we'll analyse the
-transcriptions generated by <a href="https://huggingface.co/openai/whisper-large-v2"> Whisper
-and <a href="https://huggingface.co/distil-whisper/distil-large-v2"> Distil-Whisper
-<a href="https://huggingface.co/datasets/distil-whisper/tedlium-long-form"> TED-LIUM
+transcriptions generated by <a href="https://huggingface.co/openai/whisper-large-v2"> Whisper</a>
+and <a href="https://huggingface.co/distil-whisper/distil-large-v2"> Distil-Whisper</a> on the
+<a href="https://huggingface.co/datasets/distil-whisper/tedlium-long-form"> TED-LIUM</a> validation set.
 
 To quantify the amount of repetition and hallucination in the predicted transcriptions, we measure the number
 of repeated 5-gram word duplicates (5-Dup.) and the insertion error rate (IER). Analysis is performed on the
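
Note on the 5-Dup. metric mentioned in the Markdown text above: the hunk does not show how app.py computes it, so the following is only a minimal sketch under one assumed reading, namely that every occurrence of a 5-word sequence after its first appearance counts as one duplicate. The function name `count_repeated_ngrams` and the lowercase, whitespace-split normalisation are hypothetical choices for illustration, not taken from the repository.

```python
from collections import Counter


def count_repeated_ngrams(text: str, n: int = 5) -> int:
    """Count n-gram occurrences beyond the first (assumed reading of 5-Dup.)."""
    # Assumption: simple lowercase + whitespace tokenisation; the real app
    # may normalise text differently before counting.
    words = text.lower().split()
    # Sliding window of n consecutive words; empty if the text is too short.
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    # Each n-gram seen c times contributes c - 1 duplicates.
    return sum(c - 1 for c in counts.values() if c > 1)


# A looping transcription, typical of a long-form hallucination.
looped = "thank you very much for watching " * 3
clean = "the quick brown fox jumps over the lazy dog"

print(count_repeated_ngrams(looped))
print(count_repeated_ngrams(clean))
```

Under this reading, a transcription that loops on the same phrase scores high while ordinary speech scores at or near zero, which is why the count can serve as a proxy for repetition.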