A detailed breakdown of the input tokens used for the image generation.
The number of image tokens in the input prompt.
The number of text tokens in the input prompt.