Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Query Parameters
Database ID to use for evaluation
Table to use for evaluation
Body
Type of evaluation to run (accuracy, performance, or reliability)
Available options: accuracy, performance, reliability
Input text/query for the evaluation
Minimum length: 1
Agent ID to evaluate
Team ID to evaluate
Model ID to use for evaluation
Model provider name
Additional guidelines for the evaluation
Additional context for the evaluation
Number of times to run the evaluation
Required range: 1 <= x <= 100
Name for this evaluation run
Expected output for accuracy evaluation
Number of warmup runs before measuring performance
Required range: 0 <= x <= 10
Expected tool calls for reliability evaluation
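As a rough illustration of how the authorization header, query parameters, and body described above fit together, here is a minimal Python sketch that builds (but does not send) the request. This page does not show the endpoint URL or the JSON field names, so `endpoint_url` and every key used here (`db_id`, `table`, `eval_type`, `input`, `num_iterations`, `warmup_runs`) are assumptions made for illustration; substitute the names from the actual schema.

```python
import json
import urllib.parse
import urllib.request

# Allowed evaluation types, as listed in the Body section.
EVAL_TYPES = {"accuracy", "performance", "reliability"}


def build_eval_request(endpoint_url, token, eval_type, input_text, *,
                       db_id=None, table=None,
                       num_iterations=None, warmup_runs=None, **extra):
    """Build a POST request for an evaluation run without sending it.

    All query/body key names are hypothetical; only the constraints
    (allowed types, non-empty input, numeric ranges) come from the reference.
    """
    if eval_type not in EVAL_TYPES:
        raise ValueError(f"eval_type must be one of {sorted(EVAL_TYPES)}")
    if not input_text:
        raise ValueError("input text must be at least 1 character long")
    if num_iterations is not None and not 1 <= num_iterations <= 100:
        raise ValueError("number of runs must satisfy 1 <= x <= 100")
    if warmup_runs is not None and not 0 <= warmup_runs <= 10:
        raise ValueError("warmup runs must satisfy 0 <= x <= 10")

    # Optional query parameters: database ID and table.
    query = {k: v for k, v in {"db_id": db_id, "table": table}.items()
             if v is not None}
    url = endpoint_url
    if query:
        url += "?" + urllib.parse.urlencode(query)

    # Body: required fields plus whichever optional fields were supplied.
    body = {"eval_type": eval_type, "input": input_text, **extra}
    if num_iterations is not None:
        body["num_iterations"] = num_iterations
    if warmup_runs is not None:
        body["warmup_runs"] = warmup_runs

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # Bearer <token>
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the URL and key names have been checked against the real API. Validating the ranges client-side is a convenience only; the server remains the authority on what it accepts.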
Response
Evaluation executed successfully
Unique identifier for the evaluation run
Type of evaluation (accuracy, performance, or reliability)
Available options: accuracy, performance, reliability
Evaluation results and metrics
Agent ID that was evaluated
Model ID used in evaluation
Model provider name
Team ID that was evaluated
Workflow ID that was evaluated
Name of the evaluation run
Name of the evaluated component
Input parameters used for the evaluation
Timestamp when evaluation was created
Timestamp when evaluation was last updated
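The response fields listed above could be read into a small typed container along the following lines. This is a sketch only: the reference gives descriptions but not the JSON key names or the timestamp format, so every attribute name below is hypothetical and timestamps are kept as whatever raw value the server returns.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class EvalRunResult:
    # Attribute names are assumed for illustration, not taken from the API.
    id: str                                  # unique identifier for the run
    eval_type: str                           # accuracy, performance, or reliability
    eval_data: dict                          # evaluation results and metrics
    agent_id: Optional[str] = None
    model_id: Optional[str] = None
    model_provider: Optional[str] = None
    team_id: Optional[str] = None
    workflow_id: Optional[str] = None
    name: Optional[str] = None               # name of the evaluation run
    evaluated_component_name: Optional[str] = None
    eval_input: Optional[dict] = None        # input parameters used
    created_at: Any = None                   # raw timestamp, format unspecified
    updated_at: Any = None                   # raw timestamp, format unspecified


def parse_eval_response(raw: str) -> EvalRunResult:
    """Parse a JSON response body, ignoring keys this sketch does not know."""
    payload = json.loads(raw)
    known = {f.name for f in fields(EvalRunResult)}
    return EvalRunResult(**{k: v for k, v in payload.items() if k in known})
```

Ignoring unknown keys keeps the parser tolerant of fields added later; a missing `id`, `eval_type`, or `eval_data` raises `TypeError`, which surfaces a schema mismatch early.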