
Create
POST /chat/completions

Generate a chat completion for the given messages using the specified model.

Body Parameters
messages (array of Message)

List of messages in the conversation.

model (string)

The identifier of the model to use.

max_completion_tokens (number)
optional

The maximum number of tokens to generate.

minimum: 1
repetition_penalty (number)
optional

Controls the likelihood of generating repetitive responses.

minimum: 1
maximum: 2
response_format (union)
optional

An object specifying the format that the model must output. Setting it to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model's response will match your supplied JSON schema. If not specified, the default is {"type": "text"}, and the model will return a free-form text response.

One of the following 2 object variants:
JsonSchema (object)

Configuration for JSON schema-guided response generation.

json_schema (object)

The JSON schema the response should conform to.

name (string)

The name of the response format.

schema (unknown)

The JSON schema the response should conform to. In a Python SDK, this is often a pydantic model.

type (enum)
"json_schema"

The type of response format being defined. Always json_schema.

Text (object)

Configuration for text-guided response generation.

type (enum)
"text"

The type of response format being defined. Always text.

stream (boolean)
optional

If true, the response is generated as a server-sent events (SSE) stream. Defaults to false.
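When streaming, each event typically arrives as a "data:" line. A minimal parsing sketch follows; the chunk field name "delta" and the "[DONE]" terminating sentinel are assumptions common to SSE chat APIs, not confirmed by this reference:

```python
import json

def parse_sse_chunks(raw: str) -> list:
    """Parse SSE 'data:' lines into JSON payloads.

    Assumption: a '[DONE]' sentinel ends the stream, as in some chat APIs;
    this API may use a different terminator.
    """
    events = []
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data and data != "[DONE]":
                events.append(json.loads(data))
    return events

# Hypothetical sample stream for illustration.
sample = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n'
chunks = parse_sse_chunks(sample)
print("".join(c["delta"] for c in chunks))  # prints "Hello"
```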

temperature (number)
optional

Controls the randomness of the response. Higher values lead to more creative responses; lower values make the response more focused and deterministic.

minimum: 0
maximum: 1
tool_choice (union)
optional
"none" OR "auto" OR "required" OR object

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

none is the default when no tools are present. auto is the default if tools are present.

UnionMember0 (enum)

none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

"none"
"auto"
"required"
ChatCompletionNamedToolChoice (object)

Specifies a tool the model should use. Use to force the model to call a specific function.

function (object)
name (string)

The name of the function to call.

type (enum)
"function"

The type of the tool. Currently, only function is supported.

tools (array of object)
optional

List of tool definitions available to the model.

function (object)
name (string)

The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

description (string)
optional

A description of what the function does, used by the model to choose when and how to call the function.

parameters (map)
optional

The parameters the function accepts, described as a JSON Schema object. Omitting parameters defines a function with an empty parameter list.

strict (boolean)
optional

Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.

type (enum)
"function"

The type of the tool. Currently, only function is supported.

top_k (number)
optional

Only sample from the top K options for each subsequent token.

minimum: 0
top_p (number)
optional

Controls diversity of the response by setting a probability threshold when choosing the next token.

minimum: 0
maximum: 1
user (string)
optional

A unique identifier representing your application's end-user, used to monitor for abuse.

Returns
CreateChatCompletionResponse (object)

completion_message (CompletionMessage)

id (string)

metrics (array of object)
curl https://api.llama.com/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_API_KEY" \
    -d '{
      "messages": [
        {
          "content": "string",
          "role": "user"
        }
      ],
      "model": "model"
    }'
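The same request can be built with Python's standard library. This sketch constructs (but does not send) the request, reading the key from LLAMA_API_KEY as the curl example does:

```python
import json
import os
import urllib.request

# Build the same request body as the curl example.
payload = {
    "messages": [{"content": "string", "role": "user"}],
    "model": "model",
}
req = urllib.request.Request(
    "https://api.llama.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('LLAMA_API_KEY', '')}",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # uncomment to actually send
```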
200 Example
{
  "completion_message": {
    "role": "assistant",
    "content": "string",
    "stop_reason": "stop",
    "tool_calls": [
      {
        "id": "id",
        "function": {
          "arguments": "arguments",
          "name": "name"
        }
      }
    ]
  },
  "id": "id",
  "metrics": [
    {
      "metric": "metric",
      "value": 0,
      "unit": "unit"
    }
  ]
}