Interface TurnDetection

Configuration for turn detection. Set to null to turn it off. With server VAD, the model detects the start and end of speech based on audio volume and responds when the user stops speaking.

interface TurnDetection {
    prefix_padding_ms?: number;
    silence_duration_ms?: number;
    threshold?: number;
    type?: string;
}
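As a minimal sketch of how this interface might be used, the block below builds one configuration with explicit values and one that disables turn detection with null. The `SessionOptions` wrapper and its `turn_detection` field are hypothetical names introduced only for illustration; the source does not say where a `TurnDetection` value is attached.

```typescript
// Interface as documented above, repeated so the example is self-contained.
interface TurnDetection {
  prefix_padding_ms?: number;
  silence_duration_ms?: number;
  threshold?: number;
  type?: string;
}

// Hypothetical container, not part of the documented API.
interface SessionOptions {
  turn_detection: TurnDetection | null;
}

// Server VAD with a longer silence window, so the model is less likely
// to respond during short pauses.
const patient: SessionOptions = {
  turn_detection: {
    type: "server_vad",
    silence_duration_ms: 800,
  },
};

// Turn detection switched off entirely.
const manual: SessionOptions = {
  turn_detection: null,
};
```

Every property is optional, so fields left out (here `threshold` and `prefix_padding_ms`) are expected to fall back to the documented defaults.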

Index

Properties

prefix_padding_ms? silence_duration_ms? threshold? type?

Properties

`Optional` prefix_padding_ms

prefix_padding_ms?: number

Amount of audio to include before the speech detected by VAD begins (in milliseconds). Defaults to 300ms.

`Optional` silence_duration_ms

silence_duration_ms?: number

Duration of silence required to detect the end of speech (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.
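The trade-off described above can be illustrated with a toy end-of-speech detector. This is only a sketch under stated assumptions, not the server's actual VAD: `findSpeechStop`, the per-frame boolean flags, and the fixed frame length are all hypothetical.

```typescript
// Hypothetical helper, not part of the documented API. Given per-frame
// speech flags, returns the index of the frame at which the accumulated
// trailing silence first reaches silenceDurationMs, or null if it never does.
function findSpeechStop(
  frames: boolean[],        // true = frame classified as speech
  frameMs: number,          // duration of each frame in milliseconds
  silenceDurationMs = 500   // documented default for silence_duration_ms
): number | null {
  let heardSpeech = false;
  let silentMs = 0;
  for (let i = 0; i < frames.length; i++) {
    if (frames[i]) {
      // Any speech frame resets the silence counter.
      heardSpeech = true;
      silentMs = 0;
    } else if (heardSpeech) {
      // Silence only counts once speech has started.
      silentMs += frameMs;
      if (silentMs >= silenceDurationMs) return i;
    }
  }
  return null;
}

// 300ms of speech, a 200ms pause, 200ms more speech, then 600ms of silence.
const T = true;
const F = false;
const frames = [T, T, T, F, F, T, T, F, F, F, F, F, F];

const stopAtDefault = findSpeechStop(frames, 100);      // waits through the short pause
const stopAtShort = findSpeechStop(frames, 100, 200);   // cuts in during the short pause
```

With the 500ms default the short pause is ignored and the stop is declared only in the final silence; with a 200ms window the same pause is treated as the end of the turn.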

`Optional` threshold

threshold?: number

Activation threshold for VAD (0.0 to 1.0). Defaults to 0.5. A higher threshold requires louder audio to activate the model, and thus might perform better in noisy environments.

`Optional` type

type?: string

Type of turn detection. Only server_vad is currently supported.
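The documented defaults and constraints can be gathered into one client-side helper. The following is a sketch, not part of the library: `resolveTurnDetection` is a hypothetical function, and treating `"server_vad"` as the value used when `type` is omitted is an assumption based on it being the only supported type.

```typescript
// Interface as documented above, repeated so the example is self-contained.
interface TurnDetection {
  prefix_padding_ms?: number;
  silence_duration_ms?: number;
  threshold?: number;
  type?: string;
}

// Hypothetical helper: fills in documented defaults and checks the
// documented constraints before a config is sent anywhere.
function resolveTurnDetection(
  input: TurnDetection | null
): Required<TurnDetection> | null {
  // null means turn detection is turned off; pass it through unchanged.
  if (input === null) return null;

  const resolved: Required<TurnDetection> = {
    prefix_padding_ms: input.prefix_padding_ms ?? 300,     // documented default
    silence_duration_ms: input.silence_duration_ms ?? 500, // documented default
    threshold: input.threshold ?? 0.5,                     // documented default
    type: input.type ?? "server_vad",                      // assumed default
  };

  if (resolved.threshold < 0 || resolved.threshold > 1) {
    throw new RangeError("threshold must be between 0.0 and 1.0");
  }
  if (resolved.type !== "server_vad") {
    throw new Error("only server_vad is currently supported");
  }
  return resolved;
}

// A noisy-environment config: raise the threshold, keep other defaults.
const noisy = resolveTurnDetection({ threshold: 0.8 });
```

Validating locally like this surfaces an out-of-range threshold or an unsupported type before the configuration leaves the client.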
