I want to use LLaMA and OPT via the Hugging Face Inference API. However, I always get one of the same two error messages. Either
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Or
{'error': 'Model decapoda-research/llama-65b-hf time out'}
This is my code:
import requests
API_URL = "https://api-inference.huggingface.co/models/decapoda-research/llama-65b-hf"
headers = {"Authorization": "Bearer ###"}  # API token redacted
def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()
output = query({"inputs": "Once upon a time ", "options": {"wait_for_model": True}})
print(output)
What am I doing wrong or what can I do to actually access the models?
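In case it helps narrow things down, here is a more defensive variant of the request. It is only a minimal sketch and not a confirmed fix: it retries when the connection is dropped or when the API returns a dict with an "error" key, and it raises with the last error if every attempt fails. The names query_with_retry, post, retries, backoff and timeout are my own and purely illustrative, and the default values are arbitrary assumptions.

```python
import time

API_URL = "https://api-inference.huggingface.co/models/decapoda-research/llama-65b-hf"


def query_with_retry(payload, headers, post=None, retries=3, backoff=2.0, timeout=120):
    """POST the payload, retrying on dropped connections and error responses."""
    if post is None:
        # Imported lazily so a stand-in `post` can be used without the network.
        import requests
        post = requests.post

    last_error = None
    for attempt in range(retries):
        try:
            response = post(API_URL, headers=headers, json=payload, timeout=timeout)
            result = response.json()
        except (OSError, ValueError) as exc:
            # requests' exceptions derive from OSError (ConnectionError, timeouts);
            # ValueError covers a response body that is not valid JSON.
            last_error = repr(exc)
        else:
            # Failures such as the model timeout come back as {"error": ...}.
            if not (isinstance(result, dict) and "error" in result):
                return result
            last_error = result["error"]
        if attempt < retries - 1:
            time.sleep(backoff * (attempt + 1))  # linear backoff between attempts
    raise RuntimeError(f"Giving up after {retries} attempts: {last_error}")
```

It would be called like the original query, for example query_with_retry({"inputs": "Once upon a time ", "options": {"wait_for_model": True}}, headers). If it still gives up after every attempt, the message in the RuntimeError at least shows which of the two errors came last.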