Update README.md
This repo includes two types of quantized models, **GGUF** and **AWQ**, for our Octopus V2 model at [NexaAIDev/Octopus-v2](https://huggingface.co/NexaAIDev/Octopus-v2).

<p align="center" width="100%">
<a><img src="Octopus-logo.jpeg" alt="nexa-octopus" style="width: 40%; min-width: 300px; display: block; margin: auto;"></a>
</p>

# GGUF Quantization

Run with [Ollama](https://github.com/ollama/ollama):

```bash
ollama run NexaAIDev/octopus-v2-Q4_K_M
```
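The same GGUF model can also be queried programmatically over Ollama's local REST API. A minimal sketch, assuming a running Ollama server on its default port; `build_request` is a hypothetical helper, not part of this repo, and uses only the Python standard library:

```python
import json
import urllib.request

def build_request(prompt, model="NexaAIDev/octopus-v2-Q4_K_M",
                  host="http://localhost:11434"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("Take a selfie for me with the front camera")
# Sending the request requires a running Ollama server:
# resp = urllib.request.urlopen(req)
# print(json.loads(resp.read())["response"])
```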
# AWQ Quantization

Python example:

```python
from awq import AutoAWQForCausalLM
# …
print("avg throughput:", np.mean(avg_throughput))
```
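The elided middle of the example loops over prompts and records per-prompt generation speed, which the final line averages with NumPy. A minimal sketch of that bookkeeping, with made-up numbers (the `runs` list is illustrative, not a measurement):

```python
import numpy as np

# Throughput per prompt is tokens generated divided by wall-clock seconds;
# the benchmark's closing line averages these over all prompts.
runs = [(128, 2.0), (256, 4.0), (64, 1.0)]  # (tokens, seconds) - made-up values
avg_throughput = [tokens / seconds for tokens, seconds in runs]
print("avg throughput:", np.mean(avg_throughput))  # → avg throughput: 64.0
```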
# Quantized GGUF & AWQ Models Benchmark

| Name | Quant method | Bits | Size | Response (t/s) | Use Cases |
| ---------------------- | ------------ | ---- | -------- | -------------- | ----------------------------------- |