This is not working with the Inference API: the request times out after 120 seconds.
import sys
import requests
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
YOUR_API_KEY = user_secrets.get_secret("YOUR_API_KEY")
if YOUR_API_KEY == "":
    sys.exit("API key not found in secrets.")

API_URL = "https://api-inference.huggingface.co/models/Salesforce/codegen-16B-mono"
headers = {"Authorization": f"Bearer {YOUR_API_KEY}"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

prompt = """def download_file(url, directory):
    """
    This function downloads a file from a URL and saves it to a user-specified directory using the filename from the end of the URL.
    Args:
        url (str): The URL of the file to be downloaded.
        directory (str): The directory where the file should be saved. If the directory does not exist, it will be created.
    Raises:
        ValueError: If the URL is not valid.
    Returns:
        None
    """"""

pre_prompt = """Q:\n\nComplete the code of the following function:\n\n"""
post_prompt = "\n\nA:\n\n"

output = query({
    "inputs": pre_prompt + prompt + post_prompt,
    "parameters": {
        "temperature": 0.1,
        "repetition_penalty": 1.1,
        "max_new_tokens": 250,
        "max_time": 120,
        "return_full_text": False,
        "num_return_sequences": 1,
        "do_sample": True,
    },
    "options": {
        "use_cache": False,
        "wait_for_model": True,
    },
})

if type(output) == list:
    generated_text = output[0]['generated_text']
else:
    sys.exit(output['error'])

stop_seq = '\n\n\n'
stop_idx = generated_text.find(stop_seq)
if stop_idx != -1:
    generated_text = generated_text[:stop_idx].strip()
else:
    generated_text = generated_text.strip()

print(post_prompt + generated_text)
Your code is a bit unformatted, but there might be an error where you define prompt: after

"""def download_file(url, directory):

you have an additional """ on the next line, which closes the string. Thus the following lines are interpreted as Python code.
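One way to fix it, just as a sketch, is to use single-quote delimiters for the outer string so the inner """ of the docstring no longer terminates it early (the shortened docstring text here is only illustrative):

```python
# Outer string uses ''' so the docstring's """ inside stays literal text
# instead of closing the prompt string prematurely.
prompt = '''def download_file(url, directory):
    """
    This function downloads a file from a URL and saves it to a
    user-specified directory using the filename from the end of the URL.
    """
'''
```

Alternatively you could escape the inner quotes as \"\"\" and keep the outer """ delimiters.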
Other than that, I also get a timeout: I specified "options": {"wait_for_model": True}
in the API request, and after some time the function returns, but response.json()[0]['generated_text']
contains the following output:
'Error: Model Salesforce/codegen-16B-mono time out'
I suppose the model is too large for the inference API, see https://discuss.huggingface.co/t/cannot-run-large-models-using-api-token/31844/2
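If the 16B checkpoint really is too large for the free Inference API, one workaround sketch is to point the same request at a smaller checkpoint from the same family; Salesforce/codegen-350M-mono is my assumption here, not something confirmed in that thread. The payload is unchanged from your code, only the model in the URL differs:

```python
# Assumed workaround: a smaller CodeGen checkpoint the hosted API can load.
API_URL = "https://api-inference.huggingface.co/models/Salesforce/codegen-350M-mono"

def build_payload(prompt: str) -> dict:
    # Same JSON body as the original query(), only the target model changed.
    return {
        "inputs": prompt,
        "parameters": {
            "temperature": 0.1,
            "repetition_penalty": 1.1,
            "max_new_tokens": 250,
            "max_time": 120,
            "return_full_text": False,
            "num_return_sequences": 1,
            "do_sample": True,
        },
        "options": {"use_cache": False, "wait_for_model": True},
    }

# The POST itself stays as in your code:
#   requests.post(API_URL, headers=headers, json=build_payload(prompt))
```

Quality will of course be worse than 16B, but it should at least respond instead of timing out.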