Interface VertexAIInput

Input to a Google Vertex AI LLM class.

interface VertexAIInput {
    apiConfig?: GoogleAIAPIConfig;
    apiKey?: string;
    apiName?: string;
    apiVersion?: string;
    authOptions?: GoogleAuthOptions<AnyAuthClient>;
    cache?: boolean | BaseCache<Generation[]>;
    callbackManager?: CallbackManager;
    callbacks?: Callbacks;
    concurrency?: number;
    convertSystemMessageToHumanContent?: boolean;
    endpoint?: string;
    frequencyPenalty?: number;
    labels?: Record<string, string>;
    location?: string;
    logprobs?: boolean;
    maxConcurrency?: number;
    maxOutputTokens?: number;
    maxReasoningTokens?: number;
    maxRetries?: number;
    metadata?: Record<string, unknown>;
    model?: string;
    modelName?: string;
    onFailedAttempt?: FailedAttemptHandler;
    platformType?: GooglePlatformType;
    presencePenalty?: number;
    reasoningEffort?: "low" | "medium" | "high";
    responseMimeType?: GoogleAIResponseMimeType;
    responseModalities?: string[];
    safetyHandler?: GoogleAISafetyHandler;
    safetySettings?: GoogleAISafetySetting[];
    seed?: number;
    speechConfig?: GoogleSpeechConfig | GoogleSpeechConfigSimplified;
    stopSequences?: string[];
    streaming?: boolean;
    tags?: string[];
    temperature?: number;
    thinkingBudget?: number;
    topK?: number;
    topLogprobs?: number;
    topP?: number;
    verbose?: boolean;
    vertexai?: boolean;
}

Hierarchy

GoogleLLMInput
- VertexAIInput

Properties

`Optional`apiConfig

apiConfig?: GoogleAIAPIConfig

`Optional`apiKey

apiKey?: string

Some APIs accept an API key instead of other credentials.

`Optional`apiName

apiName?: string

`Optional`apiVersion

apiVersion?: string

The version of the API functions. Part of the path.

`Optional`authOptions

authOptions?: GoogleAuthOptions<AnyAuthClient>

`Optional`cache

cache?: boolean | BaseCache<Generation[]>

`Optional`callbackManager

callbackManager?: CallbackManager

Deprecated

Use callbacks instead

`Optional`callbacks

callbacks?: Callbacks

`Optional`concurrency

concurrency?: number

Deprecated

Use maxConcurrency instead

`Optional`convertSystemMessageToHumanContent

convertSystemMessageToHumanContent?: boolean

`Optional`endpoint

endpoint?: string

Hostname for the API call (if this is running on GCP)

`Optional`frequencyPenalty

frequencyPenalty?: number

Frequency penalty applied to the next token's logprobs, multiplied by the number of times each token has been seen in the response so far. A positive penalty discourages the use of tokens that have already been used, in proportion to the number of times each token has been used: the more a token is used, the more difficult it is for the model to use that token again, which increases the vocabulary of responses. Caution: A negative penalty encourages the model to reuse tokens, in proportion to the number of times each token has been used. Small negative values reduce the vocabulary of a response. Larger negative values cause the model to start repeating a common token until it hits the maxOutputTokens limit.
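The scaling described above can be sketched as follows. This is only an illustration: the helper name `applyFrequencyPenalty` and the plain-object representation of logprobs are assumptions made for this example and are not part of this library; the real penalty is applied by the service, not by client code.

```typescript
// Hypothetical helper, not part of @langchain/google-vertexai.
// Subtracts `penalty * count` from the logprob of every token already seen.
function applyFrequencyPenalty(
  logprobs: Record<string, number>,
  seenCounts: Record<string, number>,
  penalty: number
): Record<string, number> {
  const adjusted: Record<string, number> = {};
  for (const token of Object.keys(logprobs)) {
    const count = seenCounts[token] ?? 0; // tokens never seen are not penalised
    adjusted[token] = logprobs[token] - penalty * count;
  }
  return adjusted;
}

// "the" has been used three times and "cat" once, so with a positive penalty
// "the" is pushed down three times as far as "cat"; "sat" is untouched.
const frequencyAdjusted = applyFrequencyPenalty(
  { the: 2.0, cat: 1.0, sat: 0.5 },
  { the: 3, cat: 1 },
  0.5
);
console.log(frequencyAdjusted);
```

Passing a negative `penalty` to the same sketch raises the score of repeated tokens instead, which is the repetition risk the caution above describes.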

`Optional`labels

labels?: Record<string, string>

Custom metadata labels to associate with the request. Only supported on Vertex AI (Google Cloud Platform). Labels are key-value pairs where both keys and values must be strings.

Example:

{
  labels: {
    "team": "research",
    "component": "frontend",
    "environment": "production"
  }
}

`Optional`location

location?: string

Region where the LLM is stored (if this is running on GCP)

`Optional`logprobs

logprobs?: boolean

Whether to return log probabilities of the output tokens. If true, returns the log probability of each output token in the content of the message.

`Optional`maxConcurrency

maxConcurrency?: number

The maximum number of concurrent calls that can be made. Defaults to Infinity, which means no limit.

`Optional`maxOutputTokens

maxOutputTokens?: number

Maximum number of tokens to generate in the completion. This may include reasoning tokens (for backwards compatibility).

`Optional`maxReasoningTokens

maxReasoningTokens?: number

The maximum number of output tokens that will be used for the "thinking" or "reasoning" stages.

`Optional`maxRetries

maxRetries?: number

The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 6.
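To give a feel for what exponential backoff means for the default of 6 retries, here is a minimal sketch of a delay schedule. The base delay of 1000 ms and the growth factor of 2 are assumptions chosen for illustration; this page does not state the values the library actually uses.

```typescript
// Hypothetical illustration: baseMs and factor are assumed values,
// not taken from the library.
function backoffDelaysMs(maxRetries: number, baseMs = 1000, factor = 2): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Each retry waits `factor` times longer than the previous one.
    delays.push(baseMs * Math.pow(factor, attempt));
  }
  return delays;
}

const retrySchedule = backoffDelaysMs(6);
console.log(retrySchedule); // [1000, 2000, 4000, 8000, 16000, 32000]
```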

`Optional`metadata

metadata?: Record<string, unknown>

`Optional`model

model?: string

Model to use

`Optional`modelName

modelName?: string

Model to use. Alias for model.

`Optional`onFailedAttempt

onFailedAttempt?: FailedAttemptHandler

Custom handler to handle failed attempts. Takes the originally thrown error object as input, and should itself throw an error if the input error is not retryable.
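A handler following the contract above might look like the sketch below. The local `FailedAttemptHandlerSketch` type stands in for the library's `FailedAttemptHandler` so that the example is self-contained, and both the list of non-retryable status codes and the `status` property on the error are assumptions for illustration.

```typescript
// Local stand-in for the library's FailedAttemptHandler type (an assumption).
type FailedAttemptHandlerSketch = (error: unknown) => void;

// Hypothetical policy: these HTTP statuses will not succeed on retry.
const NON_RETRYABLE_STATUSES = [400, 401, 403, 404];

const onFailedAttemptSketch: FailedAttemptHandlerSketch = (error) => {
  // Assumes the thrown error may carry a numeric `status` property.
  const status = (error as { status?: number } | null)?.status;
  if (status !== undefined && NON_RETRYABLE_STATUSES.indexOf(status) !== -1) {
    // Throwing signals that the error is not retryable.
    throw error;
  }
  // Returning normally leaves the error eligible for another attempt.
};
```

Such a function would be passed as the `onFailedAttempt` property of the input object.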

`Optional`platformType

platformType?: GooglePlatformType

What platform to run the service on. If not specified, the class should determine this by other means. Either way, the platform actually used will be available from the "platform" getter.

`Optional`presencePenalty

presencePenalty?: number

Presence penalty applied to the next token's logprobs if the token has already been seen in the response. This penalty is binary on/off and not dependent on the number of times the token is used (after the first). Use frequencyPenalty for a penalty that increases with each use. A positive penalty will discourage the use of tokens that have already been used in the response, increasing the vocabulary. A negative penalty will encourage the use of tokens that have already been used in the response, decreasing the vocabulary.
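The on/off behaviour can be sketched as below. The helper name `applyPresencePenalty` and the plain-object logprobs are assumptions made for this example and are not part of this library; the real penalty is applied by the service.

```typescript
// Hypothetical helper, not part of @langchain/google-vertexai.
// Subtracts a flat `penalty` from every token seen at least once,
// regardless of how many times it was seen.
function applyPresencePenalty(
  logprobs: Record<string, number>,
  seenCounts: Record<string, number>,
  penalty: number
): Record<string, number> {
  const adjusted: Record<string, number> = {};
  for (const token of Object.keys(logprobs)) {
    const seen = (seenCounts[token] ?? 0) > 0;
    adjusted[token] = logprobs[token] - (seen ? penalty : 0);
  }
  return adjusted;
}

// "the" (seen 3 times) and "cat" (seen once) receive the same flat penalty.
const presenceAdjusted = applyPresencePenalty(
  { the: 2.0, cat: 1.0, sat: 0.5 },
  { the: 3, cat: 1 },
  0.5
);
console.log(presenceAdjusted);
```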

`Optional`reasoningEffort

reasoningEffort?: "low" | "medium" | "high"

An OpenAI-compatible parameter that maps to "maxReasoningTokens"

`Optional`responseMimeType

responseMimeType?: GoogleAIResponseMimeType

Available for gemini-1.5-pro. The output format of the generated candidate text. Supported MIME types:

text/plain: Text output.
application/json: JSON response in the candidates.

Default

"text/plain"

`Optional`responseModalities

responseModalities?: string[]

The modalities of the response.

`Optional`safetyHandler

safetyHandler?: GoogleAISafetyHandler

`Optional`safetySettings

safetySettings?: GoogleAISafetySetting[]

`Optional`seed

seed?: number

Seed used in decoding. If not set, the request uses a randomly generated seed.

`Optional`speechConfig

speechConfig?: GoogleSpeechConfig | GoogleSpeechConfigSimplified

Speech generation configuration. You can use either Google's definition of the speech configuration, or a simplified version we've defined (which can be as simple as the name of a pre-defined voice).

`Optional`stopSequences

stopSequences?: string[]

`Optional`streaming

streaming?: boolean

Whether or not to stream.

Default

false

`Optional`tags

tags?: string[]

`Optional`temperature

temperature?: number

Sampling temperature to use

`Optional`thinkingBudget

thinkingBudget?: number

An alias for "maxReasoningTokens"

`Optional`topK

topK?: number

Top-k changes how the model selects tokens for output.

A top-k of 1 means the selected token is the most probable among all tokens in the model’s vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature).
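As a minimal sketch of the selection rule above, the following helper keeps only the k most probable tokens and renormalises their probabilities. The function `topKFilter` and the example probabilities are hypothetical; the actual filtering happens inside the model service.

```typescript
// Hypothetical helper, not part of the library.
// Keeps the k most probable tokens and renormalises them to sum to 1.
function topKFilter(probs: Record<string, number>, k: number): Record<string, number> {
  const kept = Object.keys(probs)
    .sort((a, b) => probs[b] - probs[a]) // most probable first
    .slice(0, k);
  let total = 0;
  for (const token of kept) total += probs[token];
  const filtered: Record<string, number> = {};
  for (const token of kept) filtered[token] = probs[token] / total;
  return filtered;
}

const vocabulary = { A: 0.4, B: 0.3, C: 0.2, D: 0.1 };
const greedyChoice = topKFilter(vocabulary, 1); // only "A" remains (greedy decoding)
const topThree = topKFilter(vocabulary, 3); // "A", "B" and "C" remain
console.log(Object.keys(greedyChoice), Object.keys(topThree));
```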

`Optional`topLogprobs

topLogprobs?: number

An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
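The two constraints above (the 0 to 5 range, and the dependency on logprobs) could be checked before a request is made, as in this sketch. The `validateLogprobOptions` helper and its messages are hypothetical; the library is not stated to perform this check itself.

```typescript
// Hypothetical validator, not part of the library.
interface LogprobOptions {
  logprobs?: boolean;
  topLogprobs?: number;
}

function validateLogprobOptions(opts: LogprobOptions): string[] {
  const problems: string[] = [];
  if (opts.topLogprobs !== undefined) {
    const n = opts.topLogprobs;
    if (!Number.isInteger(n) || n < 0 || n > 5) {
      problems.push("topLogprobs must be an integer between 0 and 5");
    }
    if (opts.logprobs !== true) {
      problems.push("logprobs must be true when topLogprobs is set");
    }
  }
  return problems; // an empty array means the combination is acceptable
}

console.log(validateLogprobOptions({ logprobs: true, topLogprobs: 3 })); // []
console.log(validateLogprobOptions({ topLogprobs: 3 }));
```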

`Optional`topP

topP?: number

Top-p changes how the model selects tokens for output.

Tokens are selected from most probable to least until the sum of their probabilities equals the top-p value.

For example, if tokens A, B, and C have probabilities of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model will select either A or B as the next token (using temperature).
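The A, B, C example can be reproduced with a short sketch. The helper `topPCandidates` is hypothetical and only shows which tokens remain eligible; the sampling itself is done by the model service.

```typescript
// Hypothetical helper, not part of the library.
// Returns the most probable tokens, in order, until their cumulative
// probability reaches topP.
function topPCandidates(probs: Record<string, number>, topP: number): string[] {
  const sorted = Object.keys(probs).sort((a, b) => probs[b] - probs[a]);
  const kept: string[] = [];
  let cumulative = 0;
  for (const token of sorted) {
    kept.push(token);
    cumulative += probs[token];
    // Small tolerance guards against floating-point rounding.
    if (cumulative >= topP - 1e-9) break;
  }
  return kept;
}

const nucleus = topPCandidates({ A: 0.3, B: 0.2, C: 0.1 }, 0.5);
console.log(nucleus); // ["A", "B"]
```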

`Optional`verbose

verbose?: boolean

`Optional`vertexai

vertexai?: boolean

For compatibility with Google's libraries, should this use Vertex? The "platformType" parameter takes precedence.
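The precedence rule can be sketched as follows. Everything here is an assumption for illustration: the identifiers "gai" and "gcp" are assumed values of GooglePlatformType, the `resolvePlatform` helper is hypothetical, and returning undefined when neither option decides the matter simply models the class falling back to other means.

```typescript
// Assumption: these are the platform identifiers; check GooglePlatformType.
type PlatformSketch = "gai" | "gcp";

// Hypothetical helper, not part of the library.
function resolvePlatform(opts: {
  platformType?: PlatformSketch;
  vertexai?: boolean;
}): PlatformSketch | undefined {
  // An explicit platformType always wins over the vertexai flag.
  if (opts.platformType !== undefined) return opts.platformType;
  if (opts.vertexai === true) return "gcp";
  // Neither option decides it; the class would determine it by other means.
  return undefined;
}

console.log(resolvePlatform({ platformType: "gai", vertexai: true })); // "gai"
console.log(resolvePlatform({ vertexai: true })); // "gcp"
```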
