Configuration for input audio transcription. Defaults to off; once enabled, it can be set to null to turn it off again. Input audio transcription is not native to the model, since the model consumes audio directly. Transcription runs asynchronously through OpenAI Whisper and should be treated as rough guidance rather than the representation the model actually understood. The client can optionally set the language and prompt for transcription; these fields are passed through to the Whisper API.

interface InputAudioTranscription {
    language?: string;
    model?: string;
    prompt?: string;
}

Properties

language?: string

The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. en) will improve accuracy and latency.

model?: string

The model to use for transcription; whisper-1 is currently the only supported model.

prompt?: string

Optional text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.
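As a minimal sketch of how these properties fit together, the following builds a transcription configuration and embeds it in a session update payload. The `session.update` event shape shown here is an assumption based on the Realtime API's session-configuration pattern; the transcription fields themselves mirror the interface above.

```typescript
// Mirrors the InputAudioTranscription interface documented above.
interface InputAudioTranscription {
  language?: string;
  model?: string;
  prompt?: string;
}

const transcription: InputAudioTranscription = {
  model: "whisper-1", // currently the only supported model
  language: "en", // ISO-639-1 code; improves accuracy and latency
  prompt: "A technical discussion about audio streaming.", // should match the audio language
};

// Assumed session.update shape: the configuration is sent under
// input_audio_transcription. Setting that field to null would turn
// transcription off again once it has been enabled.
const sessionUpdate = {
  type: "session.update",
  session: {
    input_audio_transcription: transcription,
  },
};

console.log(JSON.stringify(sessionUpdate.session.input_audio_transcription));
```

Because transcription runs asynchronously, the resulting transcript arrives in separate events after the audio itself and may diverge from what the model heard.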