notdiamond
/

notdiamond-0001

Text Classification

Inference Endpoints

Model card Files Files and versions Community

r0ymanesco commited on Dec 12, 2023

Commit

15bc23d

•

1 Parent(s): 0bcc6dd

update readme

Files changed (1) hide show

README.md +53 -0

README.md CHANGED Viewed

@@ -1,3 +1,56 @@
 ---
 license: gpl-3.0
 ---

 ---
 license: gpl-3.0
 ---
+# notdiamond-0001
+notdiamond-0001 automatically determines whether to send queries to GPT-3.5 or GPT-4, depending on which model is best-suited for your task. We've trained notdiamond-0001 on hundreds of thousands of data points from robust, cross-domain evaluation benchmarks.
+The model is a classifier and will return either GPT-3.5 or GPT-4. You determine which version of each model you want to use and make the calls client-side with your own keys.
+To use notdiamond-0001, format your queries using the following prompt with your query appended at the end
+``` python
+    query = "Can you write a function that counts from 1 to 10?"
+    formatted_prompt = f"""I would like you to determine whether a given query should be sent to GPT-3.5 or GPT-4.
+        In general, the following types of queries should get sent to GPT-3.5:
+        Explanation
+        Summarization
+        Writing
+        Informal conversation
+        History, government, economics, literature, social studies
+        Simple math questions
+        Simple coding questions
+        Simple science questions
+        In general, the following types of queries should get sent to GPT-4:
+        Advanced coding questions
+        Advanced math questions
+        Advanced science questions
+        Legal questions
+        Medical questions
+        Sensitive/inappropriate queries
+        Your job is to determine whether the following query should be sent to GPT-3.5 or GPT-4.
+        Query:
+        {query}"""
+```
+You can then determine the model to call as follows
+``` python
+    import torch
+    from transformers import AutoTokenizer, AutoModelForSequenceClassification
+    id2label = {0: 'gpt-3.5', 1: 'gpt-4'}
+    tokenizer = AutoTokenizer.from_pretrained("r0ymanesco/notdiamond-0001")
+    model = AutoModelForSequenceClassification.from_pretrained("r0ymanesco/notdiamond-0001")
+    max_length = self._get_max_length(model)
+    inputs = tokenizer(formatted_prompt, truncation=True, max_length=max_length, return_tensors="pt")
+    logits = model(**inputs).logits
+    model_id = logits.argmax().item()
+    model_to_call = id2label[model_id]
+```
+For more details on how you can integrate this into your techstack and have notdiamond-0001 help you reduce latency and cost, check out our [documentation](https://notdiamond.readme.io/reference/introduction-1).