The duration of the input audio.
The language of the output translation. The value is always english.
Optional. Segments of the translated text and their corresponding details.
The translated text.
The duration of the input audio.
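
As a rough illustration of how a client might hold these fields, here is a minimal Python sketch. The field names (duration, language, text, segments), the TranslationResult class, and the parse_translation helper are all assumptions made for this example, since the descriptions above do not name the fields. The contents of each segment are not specified here, so segments are kept as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class TranslationResult:
    """Hypothetical container; field names are assumed, not taken from the source."""
    duration: float          # duration of the input audio
    language: str            # language of the output translation (always "english")
    text: str                # the translated text
    segments: Optional[List[Dict[str, Any]]] = None  # optional per-segment details


def parse_translation(payload: Dict[str, Any]) -> TranslationResult:
    """Build a TranslationResult from a decoded response dictionary."""
    language = payload["language"]
    # The description says the output language is always english,
    # so anything else is treated as a malformed response in this sketch.
    if language != "english":
        raise ValueError(f"unexpected output language: {language!r}")
    return TranslationResult(
        duration=float(payload["duration"]),
        language=language,
        text=payload["text"],
        segments=payload.get("segments"),  # stays None when the field is absent
    )
```

Calling parse_translation on a dictionary without a segments entry leaves that attribute as None, which mirrors the field being optional.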