An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, such as:

  • Improve the quality of my chatbot
  • See how well my chatbot handles customer support
  • Check if o3-mini is better at my use case than gpt-4o
interface EvalListResponse {
    created_at: number;
    data_source_config: EvalCustomDataSourceConfig | EvalStoredCompletionsDataSourceConfig;
    id: string;
    metadata: null | Metadata;
    name: string;
    object: "eval";
    share_with_openai: boolean;
    testing_criteria: (EvalLabelModelGrader | EvalStringCheckGrader | EvalTextSimilarityGrader)[];
}
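For illustration, a value shaped like `EvalListResponse` might look like the following sketch. The contents of `data_source_config` and `testing_criteria` are placeholder assumptions, not shapes documented here; only the top-level fields come from the interface above.

```typescript
// Illustrative only: a plain object with the top-level fields of
// EvalListResponse. The data_source_config and testing_criteria
// payloads are hypothetical placeholders.
const evalItem = {
  created_at: 1718900000,                  // Unix timestamp in seconds
  data_source_config: { type: "custom" },  // assumed shape
  id: "eval_abc123",                       // hypothetical identifier
  metadata: null,
  name: "Chatbot support quality",
  object: "eval" as const,
  share_with_openai: false,
  testing_criteria: [{ type: "string_check" }], // assumed shape
};

console.log(evalItem.object); // "eval"
```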

Properties

created_at: number

The Unix timestamp (in seconds) for when the eval was created.

data_source_config: EvalCustomDataSourceConfig | EvalStoredCompletionsDataSourceConfig

Configuration of data sources used in runs of the evaluation.

id: string

Unique identifier for the evaluation.

metadata: null | Metadata

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
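These limits can be checked client-side before sending a request. The helper below is a sketch, not part of the API; it applies the documented constraints (at most 16 pairs, keys up to 64 characters, values up to 512 characters):

```typescript
// Sketch: validate metadata against the documented limits --
// at most 16 key-value pairs, keys <= 64 chars, values <= 512 chars.
function isValidMetadata(metadata: Record<string, string>): boolean {
  const entries = Object.entries(metadata);
  if (entries.length > 16) return false;
  return entries.every(
    ([key, value]) => key.length <= 64 && value.length <= 512
  );
}

console.log(isValidMetadata({ team: "support" }));       // true
console.log(isValidMetadata({ ["k".repeat(65)]: "ok" })); // false (key too long)
```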

name: string

The name of the evaluation.

object: "eval"

The object type.

share_with_openai: boolean

Indicates whether the evaluation is shared with OpenAI.

testing_criteria: (EvalLabelModelGrader | EvalStringCheckGrader | EvalTextSimilarityGrader)[]

A list of testing criteria.