Optional
language
The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. en) will improve accuracy and latency.
Optional
model
The model to use for transcription. Current options are gpt-4o-transcribe, gpt-4o-mini-transcribe, and whisper-1.
Optional
prompt
An optional text to guide the model's style or continue a previous audio segment. For whisper-1, the prompt is a list of keywords. For gpt-4o-transcribe models, the prompt is a free text string, for example "expect words related to technology".
Configuration for input audio transcription. Transcription is off by default and, once enabled, can be turned off again by setting this configuration to null. Input audio transcription is not native to the model, since the model consumes audio directly. Transcription runs asynchronously through the /audio/transcriptions endpoint and should be treated as guidance on the input audio content rather than a precise record of what the model heard. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.
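
As an illustration of the fields described above, here is a minimal Python sketch that builds a transcription configuration and serializes it, including the null case used to turn transcription off. The helper names build_transcription_config and build_session_update are hypothetical. The enclosing event shape (a "session.update" event with an "input_audio_transcription" field) is an assumption that may not match every API version. The language check only verifies the two-letter shape, not membership in the ISO-639-1 code list.

```python
import json
from typing import Optional

# Model names taken from the parameter description above.
KNOWN_MODELS = {"gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"}


def build_transcription_config(
    model: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble the transcription configuration object."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown transcription model: {model!r}")

    config = {"model": model}

    if language is not None:
        # Shape check only: two lowercase ASCII letters, e.g. "en".
        # This does not verify that the code is an assigned ISO-639-1 code.
        looks_valid = (
            len(language) == 2
            and language.isascii()
            and language.isalpha()
            and language.islower()
        )
        if not looks_valid:
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        config["language"] = language

    if prompt is not None:
        config["prompt"] = prompt

    return config


def build_session_update(transcription: Optional[dict]) -> str:
    """Hypothetical helper: wrap the configuration in a session update event.

    ASSUMPTION: the event type and field name below are illustrative and
    should be checked against the API version in use.
    Passing None serializes the field as JSON null, which turns
    transcription off once it has been enabled.
    """
    event = {
        "type": "session.update",
        "session": {"input_audio_transcription": transcription},
    }
    return json.dumps(event)


# Enable transcription with a free-text prompt (gpt-4o-transcribe models).
enable_event = build_session_update(
    build_transcription_config(
        "gpt-4o-transcribe",
        language="en",
        prompt="expect words related to technology",
    )
)

# Turn transcription off again by sending null.
disable_event = build_session_update(None)
```

Omitting language and prompt leaves them out of the payload entirely, so the transcription service receives only the guidance the client chose to supply.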