Configuration for input audio transcription. The client can optionally set the language and prompt for transcription; both offer additional guidance to the transcription service.

interface InputAudioTranscription {
    language?: string;
    model?: "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe";
    prompt?: string;
}

Properties

language?: string

The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. en) will improve accuracy and latency.
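
As a rough illustration, a client might check the shape of a language code before sending it. The helper name `isLikelyIso6391` is hypothetical and not part of any SDK; it only tests for two lowercase letters, which is the form of ISO-639-1 codes, and does not confirm that the code is actually assigned.

```typescript
// Hypothetical helper: checks only the *shape* of an ISO-639-1 code
// (exactly two lowercase ASCII letters). It does not verify that the
// code is actually assigned to a language.
function isLikelyIso6391(code: string): boolean {
    return /^[a-z]{2}$/.test(code);
}

// A two-letter code passes; three-letter codes and region-tagged
// locales such as "en-US" do not.
const candidates: string[] = ["en", "eng", "en-US"];
const accepted: string[] = candidates.filter(isLikelyIso6391);
```

Whether the service tolerates other formats is not stated here, so a failed check is best treated as a warning rather than a hard error.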

model?: "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe"

The model to use for transcription. Current options are gpt-4o-transcribe, gpt-4o-mini-transcribe, and whisper-1.

prompt?: string

Optional text to guide the model's style or continue a previous audio segment. For whisper-1, the prompt is a list of keywords. For gpt-4o-transcribe models, the prompt is a free-text string, for example "expect words related to technology".
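
The difference in prompt style between the models can be sketched as follows. This is a minimal example under stated assumptions: the interface is repeated locally so the block is self-contained, `buildTranscriptionConfig` is a hypothetical helper rather than part of any SDK, and the comma-separated keyword format used for whisper-1 is an assumption, since the text above only says "a list of keywords".

```typescript
// Local copy of the interface documented above, so the sketch is self-contained.
interface InputAudioTranscription {
    language?: string;
    model?: "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe";
    prompt?: string;
}

type TranscriptionModel = NonNullable<InputAudioTranscription["model"]>;

// Hypothetical helper: shapes the prompt according to the chosen model.
function buildTranscriptionConfig(
    model: TranscriptionModel,
    language: string,
    hints: string[],
): InputAudioTranscription {
    // With no hints, omit the prompt entirely; it is optional.
    if (hints.length === 0) {
        return { model, language };
    }
    const prompt =
        model === "whisper-1"
            ? hints.join(", ") // assumption: keywords joined with commas
            : `expect words related to ${hints.join(", ")}`; // free-text guidance
    return { model, language, prompt };
}

const whisperConfig = buildTranscriptionConfig("whisper-1", "en", ["Kubernetes", "gRPC"]);
const gpt4oConfig = buildTranscriptionConfig("gpt-4o-transcribe", "en", ["technology"]);
```

Here `gpt4oConfig.prompt` reproduces the free-text example given above, while `whisperConfig.prompt` carries only the keywords.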