frequency_penalty (Optional)
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency.

logit_bias (Optional)
JSON object that maps tokens to an associated bias value from -100 to 100.
max_tokens (Optional)
The maximum number of tokens to generate in the chat completion.
messages
A list of messages comprising the conversation so far. Each message has:
- content: The content of the message.
- role: The role of the sender (e.g., 'user' or 'assistant'). Possible values:
  - user
  - assistant
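Read together, these fields imply a message shape like the TypeScript sketch below; the type names are illustrative only and not part of the documented API:

```typescript
// Illustrative types only; the SDK's exported names may differ.
type Role = 'user' | 'assistant';

interface Message {
  role: Role;      // who sent the message
  content: string; // the message text
}

// The conversation so far, oldest message first.
const messages: Message[] = [
  { role: 'user', content: 'What does top_p do?' },
  { role: 'assistant', content: 'It enables nucleus sampling as an alternative to temperature.' },
  { role: 'user', content: 'When should I change it?' },
];
```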
model (Optional)
ID of the model to use. See the model endpoint compatibility table for details.
presence_penalty (Optional)
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.

project_id
The ID of the project to use.
repositories (Optional)
Options for Retrieval Augmented Generation (RAG). Will override launched model settings.
- ids?: number[] (Optional): The IDs of the repositories to use.
- limit?: number (Optional)
- similarity_threshold?: number (Optional)
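For illustration, a repositories object assembled from the fields above might look like this; the IDs and numeric values are placeholders, and the meanings of limit and similarity_threshold are assumptions since they are not described here:

```typescript
// Placeholder values; repository IDs come from your own project.
const repositories = {
  ids: [101, 102],           // the repositories to retrieve from
  limit: 3,                  // assumed: cap on retrieved documents
  similarity_threshold: 0.7, // assumed: minimum match score to include a document
};
```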
response_format (Optional)
An object specifying the format that the model must output.
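The exact shape of this object is not spelled out here; a common convention among chat APIs, assumed below, is a type discriminator requesting JSON output:

```typescript
// Assumed convention; verify the accepted values against the full API reference.
const response_format = { type: 'json_object' };
```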
seed (Optional)
This feature is in Beta. If specified, our system will make a best effort to sample deterministically.
session_id (Optional)
The ID of the session to use. It helps to track the chat history.
stop (Optional)
Up to 4 sequences where the API will stop generating further tokens.
stream (Optional)
If set, partial message deltas will be sent, like in ChatGPT.
system_prompt (Optional)
The system prompt to use.
temperature (Optional)
What sampling temperature to use, between 0 and 2.
tools (Optional)
A list of tools the model may call. Currently, only functions are supported as a tool.
top_p (Optional)
An alternative to sampling with temperature, called nucleus sampling.
user (Optional)
A unique identifier representing your end-user.
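Putting several of these parameters together, a complete non-streaming call could look like the sketch below; the client shape, method path, and all values are placeholders rather than the documented API:

```typescript
// Hypothetical client; your SDK's import and method names may differ.
declare const client: {
  chat: { completions: { create(params: object): Promise<unknown> } };
};

const response = await client.chat.completions.create({
  project_id: 42,            // placeholder: the project to use
  model: 'example-model-id', // placeholder model ID
  messages: [{ role: 'user', content: 'Summarize nucleus sampling in one line.' }],
  system_prompt: 'You are a concise assistant.',
  max_tokens: 128,           // cap on generated tokens
  temperature: 0.7,          // between 0 and 2
  frequency_penalty: 0.2,    // between -2.0 and 2.0
  stop: ['\n\n'],            // up to 4 stop sequences
  user: 'end-user-1234',     // stable end-user identifier
});

console.log(response);
```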