GET /eval-runs/{eval_run_id}

Example response:
{
  "id": "a03fa2f4-900d-482d-afe0-470d4cd8d1f4",
  "agent_id": "basic-agent",
  "model_id": "gpt-4o",
  "model_provider": "OpenAI",
  "name": "Test ",
  "eval_type": "reliability",
  "eval_data": {
    "eval_status": "PASSED",
    "failed_tool_calls": [],
    "passed_tool_calls": [
      "multiply"
    ]
  },
  "eval_input": {
    "expected_tool_calls": [
      "multiply"
    ]
  },
  "created_at": "2025-08-27T15:41:59Z",
  "updated_at": "2025-08-27T15:41:59Z"
}
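
The example response above can be consumed with nothing beyond the standard library. A minimal Python sketch (the JSON is copied from the example; the pass/fail check is an assumption based on the `eval_data` fields, not a documented rule):

```python
import json

# Example response body, copied verbatim from the docs above.
body = '''
{
  "id": "a03fa2f4-900d-482d-afe0-470d4cd8d1f4",
  "agent_id": "basic-agent",
  "model_id": "gpt-4o",
  "model_provider": "OpenAI",
  "name": "Test ",
  "eval_type": "reliability",
  "eval_data": {
    "eval_status": "PASSED",
    "failed_tool_calls": [],
    "passed_tool_calls": ["multiply"]
  },
  "eval_input": {
    "expected_tool_calls": ["multiply"]
  },
  "created_at": "2025-08-27T15:41:59Z",
  "updated_at": "2025-08-27T15:41:59Z"
}
'''

run = json.loads(body)

# Assumed pass condition: status is PASSED and no tool calls failed.
passed = (run["eval_data"]["eval_status"] == "PASSED"
          and not run["eval_data"]["failed_tool_calls"])
print(passed)  # True
```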

Authorizations

Authorization
string, header, required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
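
A sketch of building the authenticated request with the standard library. The base URL and token are placeholders (the docs above do not state the host):

```python
import urllib.request

API_BASE = "https://api.example.com"  # placeholder: the docs do not state the host
TOKEN = "YOUR_AUTH_TOKEN"             # placeholder: substitute your real auth token

eval_run_id = "a03fa2f4-900d-482d-afe0-470d4cd8d1f4"

# Build the GET request with the Bearer authentication header described above.
req = urllib.request.Request(
    f"{API_BASE}/eval-runs/{eval_run_id}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    method="GET",
)

# Once the placeholders are real, send it with:
#   with urllib.request.urlopen(req) as resp:
#       data = resp.read()
print(req.get_header("Authorization"))
```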

Path Parameters

eval_run_id
string, required

The unique identifier of the evaluation run to retrieve

Query Parameters

db_id
string | null

The ID of the database to use

table
string | null

Table to query the eval run from
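
Since both query parameters are nullable, a small helper can encode only the ones that are set. A sketch using the standard library (the parameter names come from the docs above; the example values are hypothetical):

```python
from urllib.parse import urlencode

def build_query(db_id=None, table=None):
    """Encode the optional query parameters, omitting any that are null."""
    params = {"db_id": db_id, "table": table}
    return urlencode({k: v for k, v in params.items() if v is not None})

print(build_query())                   # ''
print(build_query(table="eval_runs"))  # 'table=eval_runs'
```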

Response

Evaluation run details retrieved successfully

id
string, required

Unique identifier for the evaluation run

eval_type
enum<string>, required

Type of evaluation (accuracy, performance, or reliability)

Available options: accuracy, performance, reliability

eval_data
object, required

Evaluation results and metrics

agent_id
string | null

Agent ID that was evaluated

model_id
string | null

Model ID used in evaluation

model_provider
string | null

Model provider name

team_id
string | null

Team ID that was evaluated

workflow_id
string | null

Workflow ID that was evaluated

name
string | null

Name of the evaluation run

evaluated_component_name
string | null

Name of the evaluated component

eval_input
object | null

Input parameters used for the evaluation

created_at
string<date-time> | null

Timestamp when evaluation was created

updated_at
string<date-time> | null

Timestamp when evaluation was last updated
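
Taken together, the response fields above map naturally onto a typed model. A hedged Python sketch (the class, enum, and `from_json` helper are illustrative, not an official client; required fields have no default, nullable fields default to None):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EvalType(str, Enum):
    ACCURACY = "accuracy"
    PERFORMANCE = "performance"
    RELIABILITY = "reliability"

@dataclass
class EvalRun:
    # Required fields
    id: str
    eval_type: EvalType
    eval_data: dict
    # Nullable fields
    agent_id: Optional[str] = None
    model_id: Optional[str] = None
    model_provider: Optional[str] = None
    team_id: Optional[str] = None
    workflow_id: Optional[str] = None
    name: Optional[str] = None
    evaluated_component_name: Optional[str] = None
    eval_input: Optional[dict] = None
    created_at: Optional[str] = None
    updated_at: Optional[str] = None

    @classmethod
    def from_json(cls, payload: dict) -> "EvalRun":
        """Build an EvalRun from a decoded response body, ignoring unknown keys."""
        known = set(cls.__dataclass_fields__)
        kwargs = {k: v for k, v in payload.items() if k in known}
        kwargs["eval_type"] = EvalType(kwargs["eval_type"])
        return cls(**kwargs)

run = EvalRun.from_json({
    "id": "a03fa2f4-900d-482d-afe0-470d4cd8d1f4",
    "eval_type": "reliability",
    "eval_data": {"eval_status": "PASSED"},
    "agent_id": "basic-agent",
})
```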