The method used for fine-tuning.

interface Method {
    dpo?: DpoMethod;
    reinforcement?: ReinforcementMethod;
    supervised?: SupervisedMethod;
    type: "supervised" | "dpo" | "reinforcement";
}

Properties

dpo?: DpoMethod

Configuration for the DPO (direct preference optimization) fine-tuning method.

reinforcement?: ReinforcementMethod

Configuration for the reinforcement fine-tuning method.

supervised?: SupervisedMethod

Configuration for the supervised fine-tuning method.

type: "supervised" | "dpo" | "reinforcement"

The type of method: supervised, dpo, or reinforcement.
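
As an illustration of how these properties fit together, the following minimal sketch builds two Method values and looks up the configuration named by type. The DpoMethod, ReinforcementMethod, and SupervisedMethod shapes are not defined in this section, so they are stubbed as open records here, and selectedConfig is a hypothetical helper rather than part of the library.

```typescript
// Placeholder shapes: the real DpoMethod, ReinforcementMethod, and
// SupervisedMethod definitions are not shown in this section, so these
// open record types are assumptions made only to keep the sketch compilable.
type DpoMethod = Record<string, unknown>;
type ReinforcementMethod = Record<string, unknown>;
type SupervisedMethod = Record<string, unknown>;

// Mirrors the interface documented above.
interface Method {
    dpo?: DpoMethod;
    reinforcement?: ReinforcementMethod;
    supervised?: SupervisedMethod;
    type: "supervised" | "dpo" | "reinforcement";
}

// Hypothetical helper: returns the configuration object whose property name
// equals `type`, or undefined when that optional property was not supplied.
function selectedConfig(method: Method): Record<string, unknown> | undefined {
    return method[method.type];
}

// A supervised method with an (empty) supervised configuration attached.
const supervisedMethod: Method = { type: "supervised", supervised: {} };

// A dpo method with no configuration object; all three are optional.
const dpoMethod: Method = { type: "dpo" };

console.log(selectedConfig(supervisedMethod) !== undefined); // true
console.log(selectedConfig(dpoMethod) !== undefined); // false
```

Because the three configuration properties are optional, the interface alone does not force the object matching type to be present; a lookup like the one above makes that case explicit for the caller.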