
Create
POST /chat/completions

Generate a chat completion for the given messages using the specified model.

Body Parameters
messages (array of Message)

List of messages in the conversation.

model (string)

The identifier of the model to use.

max_completion_tokens (number)
optional

The maximum number of tokens to generate.

minimum: 1
repetition_penalty (number)
optional

Controls the likelihood of generating repetitive responses.

minimum: 1
maximum: 2
response_format (union)
optional

An object specifying the format that the model must output. Setting it to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model's response will match your supplied JSON schema. If not specified, the default is {"type": "text"}, and the model will return a free-form text response.

One of the following 2 object variants:
JsonSchema (object)

Configuration for JSON schema-guided response generation.

json_schema (object)

The JSON schema the response should conform to.

name (string)

The name of the response format.

schema (unknown)

The JSON schema the response should conform to. In a Python SDK, this is often a pydantic model.

type (enum)
"json_schema"

The type of response format being defined. Always json_schema.

Text (object)

Configuration for text-guided response generation.

type (enum)
"text"

The type of response format being defined. Always text.

stream (boolean)
optional

If true, the response is generated as a server-sent events (SSE) stream. Defaults to false.
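When streaming, each event typically arrives as a "data:" line. A minimal parsing sketch follows; the chunk field name "delta" and the "[DONE]" terminating sentinel are assumptions common to SSE chat APIs, not confirmed by this reference:

```python
import json

def parse_sse_chunks(raw: str) -> list:
    """Parse SSE 'data:' lines into JSON payloads.

    Assumption: a '[DONE]' sentinel ends the stream, as in some chat APIs;
    this API may use a different terminator.
    """
    events = []
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data and data != "[DONE]":
                events.append(json.loads(data))
    return events

# Hypothetical sample stream for illustration.
sample = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n'
chunks = parse_sse_chunks(sample)
print("".join(c["delta"] for c in chunks))  # prints "Hello"
```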

temperature (number)
optional

Controls the randomness of the response. Higher values lead to more creative responses; lower values make the response more focused and deterministic.

minimum: 0
maximum: 1
tool_choice (union)
optional
"none" OR "auto" OR "required" OR object

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

none is the default when no tools are present. auto is the default if tools are present.

UnionMember0 (enum)

none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

"none"
"auto"
"required"
ChatCompletionNamedToolChoice (object)

Specifies a tool the model should use. Use to force the model to call a specific function.

function (object)
name (string)

The name of the function to call.

type (enum)
"function"

The type of the tool. Currently, only function is supported.

tools (array of object)
optional

List of tool definitions available to the model.

function (object)
name (string)

The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

description (string)
optional

A description of what the function does, used by the model to choose when and how to call the function.

parameters (map)
optional

The parameters the function accepts, described as a JSON Schema object. Omitting parameters defines a function with an empty parameter list.

strict (boolean)
optional

Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.

type (enum)
"function"

The type of the tool. Currently, only function is supported.

top_k (number)
optional

Only sample from the top K options for each subsequent token.

minimum: 0
top_p (number)
optional

Controls diversity of the response by setting a probability threshold when choosing the next token.

minimum: 0
maximum: 1
user (string)
optional

A unique identifier representing your application's end-user, used to monitor for abuse.

Returns
CreateChatCompletionResponse (object)

completion_message (CompletionMessage)

id (string)

metrics (array of object)
curl https://api.llama.com/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_API_KEY" \
    -d '{
      "messages": [
        {
          "content": "string",
          "role": "user"
        }
      ],
      "model": "model"
    }'
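The same request can be built with Python's standard library. This sketch constructs (but does not send) the request, reading the key from LLAMA_API_KEY as the curl example does:

```python
import json
import os
import urllib.request

# Build the same request body as the curl example.
payload = {
    "messages": [{"content": "string", "role": "user"}],
    "model": "model",
}
req = urllib.request.Request(
    "https://api.llama.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('LLAMA_API_KEY', '')}",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # uncomment to actually send
```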
200 Example
{
  "completion_message": {
    "role": "assistant",
    "content": "string",
    "stop_reason": "stop",
    "tool_calls": [
      {
        "id": "id",
        "function": {
          "arguments": "arguments",
          "name": "name"
        }
      }
    ]
  },
  "id": "id",
  "metrics": [
    {
      "metric": "metric",
      "value": 0,
      "unit": "unit"
    }
  ]
}