kd-shared/fineweb-CC-MAIN-2023-50-and-CC-MAIN-2024-10-meta-llama_Llama-2-7b-hf • Updated May 19, 2024 • 7
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token • Paper • 2412.06676 • Published Dec 9, 2024 • 9
konstantindobler/mistral7b-de-tokenizer-swap-pure-bf16-v2-anneal-ablation • Text Generation • Updated Aug 23, 2024 • 8
konstantindobler/mistral7b-ar-tokenizer-swap-pure-bf16-anneal-ablation • Text Generation • Updated Aug 23, 2024 • 16