Hey GPT, check yourself...
Here is a black-box method for hallucination detection that shows strong correlation with human annotations. 🔥
💡 The idea is the following: ask GPT, or any other capable LLM, to sample several answers to the same prompt, then ask it whether each sampled answer supports the statements in the original output. Constrain the reply to yes/no and measure how often the sampled answers support each statement (a minimal sketch is at the end of this post).
This method is called SelfCheckGPT with Prompt and shows very nice results. 👀
The downside: we have to make many LLM calls just to evaluate a single generated paragraph... 🙃
More details and variations of this method are in the paper: SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models (2303.08896)
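For concreteness, here is a minimal Python sketch of the scoring loop. The `ask_llm` helper and the check prompt are placeholders of mine, not the paper's exact setup, and sentence splitting is left to you.

```python
# Minimal sketch of the SelfCheckGPT-with-Prompt idea.
# Assumption: `ask_llm(prompt) -> str` is a placeholder for whatever
# chat-completion call you use; it is NOT a real library function.

def ask_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to your LLM of choice and return its reply."""
    raise NotImplementedError("Wire this up to your own LLM client.")

# Illustrative check prompt (not the exact wording from the paper).
CHECK_TEMPLATE = (
    "Context: {sample}\n"
    "Sentence: {sentence}\n"
    "Is the sentence supported by the context above? Answer Yes or No:"
)

def hallucination_scores(sentences, samples):
    """For each sentence of the original output, return the fraction of
    sampled answers that do NOT support it (closer to 1.0 = more likely
    hallucinated)."""
    scores = []
    for sentence in sentences:
        unsupported = 0
        for sample in samples:
            reply = ask_llm(CHECK_TEMPLATE.format(sample=sample, sentence=sentence))
            # Map the free-form reply to a yes/no decision; anything that is
            # not a clear "yes" counts as "not supported".
            if not reply.strip().lower().startswith("yes"):
                unsupported += 1
        scores.append(unsupported / len(samples))
    return scores

# Usage (hypothetical): sample N extra answers for the same prompt, then
# score the sentences of the original answer against them.
# original_sentences = split_into_sentences(original_answer)  # your own splitter
# samples = [ask_llm(user_prompt) for _ in range(5)]
# scores = hallucination_scores(original_sentences, samples)
```

Note the cost: scoring a paragraph of S sentences against N samples takes S × N extra LLM calls on top of sampling the N answers, which is exactly the downside mentioned above.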