Honkware/Wizard-Vicuna-13B-Uncensored-SpQR


Wizard-Vicuna-13B-Uncensored-SpQR

Overview

This model is an SpQR 4-bit quantization of the original Wizard-Vicuna-13B-Uncensored-HF model.

Quantization Specifications

Quantization: 4-bit weights, group size of 16, per-channel, with scales and zero-points each stored in 3 bits.
Outliers: Threshold set at 0.2.
Permutation Order: act_order.
Dampening: Set at 1e0.
Sampling: 128 calibration samples.
Logging: Via Weights & Biases.
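
To make the group-size and bit-width settings above concrete, here is a minimal sketch of plain group-wise min-max quantization in pure Python. It is not the SpQR implementation: it leaves out outlier handling, act_order permutation, dampening, and the 3-bit compression of the scales and zero-points, and it stores each group's minimum as a floating-point offset in place of an integer zero-point. All function names are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Group = Tuple[List[int], float, float]  # (codes, scale, offset)


def quantize_group(weights: List[float], bits: int = 4) -> Group:
    """Min-max quantize one group of weights to `bits`-bit integer codes."""
    levels = (1 << bits) - 1  # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant group, where the range would be zero.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [min(levels, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes: List[int], scale: float, offset: float) -> List[float]:
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale + offset for c in codes]


def quantize_row(row: List[float], group_size: int = 16, bits: int = 4) -> List[Group]:
    """Split one weight row into consecutive groups and quantize each separately."""
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


def approx_bits_per_weight(bits: int = 4, group_size: int = 16, stat_bits: int = 3) -> float:
    """Rough storage cost: codes plus one scale and one zero-point per group.

    Assumption: ignores outliers and any second-level statistics.
    """
    return bits + 2 * stat_bits / group_size


if __name__ == "__main__":
    row = [((i * 37) % 101) / 50.0 - 1.0 for i in range(32)]
    for codes, scale, offset in quantize_row(row):
        print(codes, round(scale, 4), round(offset, 4))
    print(approx_bits_per_weight())
```

Under these simplifying assumptions, each weight is reconstructed to within half a quantization step of its group, and a small group size such as 16 keeps that step small at the cost of storing more per-group statistics.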

Evaluation Metrics

The following self-reported perplexity scores were obtained on three datasets (lower is better):

Dataset	Perplexity
c4	7.354
wikitext2	5.685
ptb	20.822
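
For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token. The sketch below shows that definition only; it is not the evaluation script used for the scores above, and the function name and the input format (a list of natural-log token probabilities) are assumptions made for illustration.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative log-likelihood over all tokens.

    `token_log_probs` holds the natural-log probability the model assigned
    to each reference token (hypothetical input format).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token has perplexity ~4.
    print(perplexity([math.log(0.25)] * 8))
```

Read the other way, a perplexity of 7.354 on c4 corresponds to an average negative log-likelihood of roughly 2.0 nats per token.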


