
ExLlamaV2 4.65bpw quantization of CausalLM-RP-34B from NeverSleep, quantized with the default calibration dataset.

Fits in 24 GB of VRAM with 32k+ context. Make sure to enable the 4-bit cache option, or you'll run into OOM errors; a loading sketch follows below.
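
For reference, here is a minimal loading sketch using the exllamav2 Python API with its Q4 (4-bit) KV cache, following the library's published examples. The model path is a placeholder, and the exact API surface may differ between exllamav2 versions; in front ends such as text-generation-webui, the equivalent is the 4-bit cache option on the ExLlamaV2 loader.

```python
# Sketch: load the EXL2 quant with a 4-bit (Q4) KV cache so 32k context
# fits in 24 GB VRAM. The path is a placeholder.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Config,
    ExLlamaV2Cache_Q4,
    ExLlamaV2Tokenizer,
)

config = ExLlamaV2Config()
config.model_dir = "/path/to/CausalLM-RP-34B-exl2-4.65bpw"  # placeholder path
config.prepare()
config.max_seq_len = 32768  # 32k context

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # 4-bit cache keeps long context in 24 GB
model.load_autosplit(cache)                  # load weights, filling GPU memory around the cache

tokenizer = ExLlamaV2Tokenizer(config)
```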


Original Card

Description

This repo contains fp16 files of CausalLM-RP-34B, a finetune of CausalLM-34B Beta on multiple RP datasets.

Model used

CausalLM-34B Beta
Prompt template: ChatML

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>
```
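
To make the template concrete, here is a minimal sketch of assembling a ChatML prompt in Python. The helper name and example strings are hypothetical illustrations, not part of the original card.

```python
# Hypothetical helper: fills the ChatML template above and leaves the
# assistant turn open, so the model generates the reply after it.
def build_chatml_prompt(system_prompt: str, prompt: str) -> str:
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example usage (strings are illustrative only).
text = build_chatml_prompt(
    "You are a helpful roleplay assistant.",
    "Describe the tavern as I walk in.",
)
```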