List evaluations

Authorizations

bearerAuth
cookieAuth
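The two schemes above could be turned into request headers along these lines. This is a minimal sketch, not the service's documented behavior: it assumes `bearerAuth` uses the conventional `Authorization: Bearer <token>` header, and the cookie name `session` is a hypothetical placeholder because the source does not name the cookie.

```python
from typing import Optional


def auth_headers(
    token: Optional[str] = None,
    session_cookie: Optional[str] = None,
    cookie_name: str = "session",  # hypothetical cookie name, not from the source
) -> dict[str, str]:
    """Build headers for one of the two listed schemes, preferring the bearer token."""
    if token:
        # Assumed: standard bearer-token header format.
        return {"Authorization": f"Bearer {token}"}
    if session_cookie:
        return {"Cookie": f"{cookie_name}={session_cookie}"}
    raise ValueError("provide a bearer token or a session cookie")
```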

Query Parameters

evaluationIds

Comma-separated list of evaluation IDs to filter by

string

nullable

teamIds

Comma-separated list of team IDs to filter by

string

nullable

pipelineIds

Comma-separated list of pipeline IDs to filter by

string

nullable

exampleSetIds

Comma-separated list of example set IDs to filter by

string

nullable

statuses

Comma-separated list of statuses to filter by (pending, running, succeeded, failed, cancelled)

string

nullable

createdBy

Filter by the ID of the user who created the evaluation

string

createdAfter

Filter to evaluations created after this date

string (format: date-time)

nullable

createdBefore

Filter to evaluations created before this date

string (format: date-time)

nullable

page

Page number for pagination (1-indexed)

number

>= 1

pageSize

Number of results per page (1-100, default: 20)

number

>= 1, <= 100

orderBy

Field to sort results by

string

Allowed values: createdAt, completedAt, status, aggregateScore

orderDirection

Sort direction: ascending or descending

string

Allowed values: asc, desc
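The query parameters above can be assembled into a query string as sketched below. The function name `build_query` and its Python argument names are illustrative assumptions; only the parameter names, ranges, and allowed values come from this reference. The sketch covers a subset of the filters, and the other comma-separated ID filters would follow the same pattern as `teamIds`.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional
from urllib.parse import urlencode

ALLOWED_ORDER_BY = {"createdAt", "completedAt", "status", "aggregateScore"}
ALLOWED_DIRECTIONS = {"asc", "desc"}


def build_query(
    team_ids: Optional[Iterable[str]] = None,
    statuses: Optional[Iterable[str]] = None,
    created_after: Optional[datetime] = None,
    page: int = 1,
    page_size: int = 20,
    order_by: Optional[str] = None,
    order_direction: Optional[str] = None,
) -> str:
    """Validate the documented constraints and return a URL-encoded query string."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    if order_by is not None and order_by not in ALLOWED_ORDER_BY:
        raise ValueError(f"unsupported orderBy: {order_by}")
    if order_direction is not None and order_direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unsupported orderDirection: {order_direction}")

    params: dict[str, object] = {"page": page, "pageSize": page_size}
    if team_ids:
        params["teamIds"] = ",".join(team_ids)  # comma-separated list
    if statuses:
        params["statuses"] = ",".join(statuses)
    if created_after is not None:
        # Pass a timezone-aware datetime so the ISO 8601 string carries an offset.
        params["createdAfter"] = created_after.isoformat()
    if order_by is not None:
        params["orderBy"] = order_by
    if order_direction is not None:
        params["orderDirection"] = order_direction
    return urlencode(params)
```

The resulting string is appended to the endpoint URL after a `?`. Note that `urlencode` percent-encodes the commas, which servers normally decode transparently.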

200

Evaluations retrieved successfully

object

evaluations

required

Array of evaluations matching the query

Array<object>

object

evaluationId

required

Unique identifier for the evaluation (nanoid format)

string

teamId

required

ID of the team that owns this evaluation

string

pipelineId

required

ID of the pipeline being evaluated

string

pipelineConfigurationId

required

Pipeline configuration snapshot at evaluation time

string

exampleSetId

required

ID of the example set used for evaluation

string

exampleSetConfigurationId

required

Example set configuration snapshot at evaluation time

string

evaluatorType

required

Evaluation method: ‘llm_judge’, ‘exact_match’, or ‘semantic’

string

Allowed values: llm_judge, exact_match, semantic

evaluatorConfig

Evaluator-specific configuration

nullable

mappingType

required

How inputs and outputs are mapped; the only allowed value is ‘explicit’

string

Allowed values: explicit

mappingConfig

required

Slot-to-slot mappings between example set and pipeline

object

inputMappings

required

Maps example input slots to pipeline input slots

Array<object>

object

exampleSlotId

required

Slot ID from the example set schema

string

pipelineSlotId

required

Slot ID from the pipeline schema

string

outputMappings

required

Maps example output slots to pipeline output slots for comparison

Array<object>

object

exampleSlotId

required

Slot ID from the example set schema

string

pipelineSlotId

required

Slot ID from the pipeline schema

string
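To illustrate the `mappingConfig` shape described above, the following sketch turns one mapping array into a lookup from example slot to pipeline slot. The helper name `slot_lookup` and the slot IDs (`question`, `prompt`, `expected`, `answer`) are made up for illustration; only the field names come from this reference.

```python
def slot_lookup(mappings: list[dict[str, str]]) -> dict[str, str]:
    """Return {exampleSlotId: pipelineSlotId} for one mapping array."""
    return {m["exampleSlotId"]: m["pipelineSlotId"] for m in mappings}


# Hypothetical mappingConfig value with one input and one output mapping.
mapping_config = {
    "inputMappings": [
        {"exampleSlotId": "question", "pipelineSlotId": "prompt"},
    ],
    "outputMappings": [
        {"exampleSlotId": "expected", "pipelineSlotId": "answer"},
    ],
}

input_slots = slot_lookup(mapping_config["inputMappings"])
output_slots = slot_lookup(mapping_config["outputMappings"])
```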

status

required

Status: ‘pending’, ‘running’, ‘succeeded’, ‘failed’, or ‘cancelled’

string

Allowed values: pending, running, succeeded, failed, cancelled
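A client might tally a page of results by `status`. The sketch below assumes each evaluation is a plain dict parsed from the JSON response; the helper name `count_by_status` is illustrative, and rejecting unknown values is a design choice of this sketch rather than documented behavior.

```python
from collections import Counter

STATUSES = ("pending", "running", "succeeded", "failed", "cancelled")


def count_by_status(evaluations: list[dict]) -> dict[str, int]:
    """Count evaluations per documented status, rejecting unknown values."""
    counts = Counter(e["status"] for e in evaluations)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected status values: {sorted(unknown)}")
    # Include every documented status so callers always see all five keys.
    return {s: counts.get(s, 0) for s in STATUSES}
```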

totalExamples

required

Total number of examples in the evaluation

number

nullable

passedCount

required

Number of passed examples

number

nullable

failedCount

required

Number of failed examples

number

nullable

errorCount

required

Number of examples with execution errors

number

nullable

skippedCount

required

Number of skipped examples

number

nullable

aggregateScore

required

Overall evaluation score (0.0 to 1.0)

number

nullable
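Because the count fields and `aggregateScore` are all nullable, client code needs to guard against missing values. The sketch below derives a simple pass rate from `passedCount` and `totalExamples`. Whether this ratio equals `aggregateScore` is not stated in this reference, so treat it as a separate, client-side figure; the helper name `pass_rate` is illustrative.

```python
from typing import Optional


def pass_rate(evaluation: dict) -> Optional[float]:
    """Return passedCount / totalExamples, or None when either is unavailable."""
    total = evaluation.get("totalExamples")
    passed = evaluation.get("passedCount")
    if total is None or passed is None or total == 0:
        # Counts may be null, for example before the evaluation has produced results.
        return None
    return passed / total
```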

summary

required

LLM-generated summary (deferred feature)

string

nullable

jobId

required

ID of the job processing this evaluation

string

nullable

errorMessage

required

Error message if evaluation failed

string

nullable

startedAt

required

When evaluation started (ISO 8601)

string (format: date-time)

nullable

completedAt

required

When evaluation completed (ISO 8601)

string (format: date-time)

nullable
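Since `startedAt` and `completedAt` are nullable ISO 8601 strings, a run duration can only be computed when both are present. A minimal sketch, assuming timestamps such as `2024-01-01T00:00:00Z`; the helper names are illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp, passing None through unchanged."""
    if value is None:
        return None
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_duration(evaluation: dict) -> Optional[timedelta]:
    """Return completedAt - startedAt, or None if either timestamp is null."""
    started = parse_timestamp(evaluation.get("startedAt"))
    completed = parse_timestamp(evaluation.get("completedAt"))
    if started is None or completed is None:
        return None
    return completed - started
```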

createdAt

required

When the evaluation was created (ISO 8601)

string (format: date-time)

createdBy

required

ID of the user who triggered the evaluation

string

nullable

total

required

Total number of evaluations matching the query (before pagination)

number

page

required

Current page number

number

pageSize

required

Number of results per page

number
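The `total`, `page`, and `pageSize` fields are enough to walk every page of results. The sketch below takes a caller-supplied `fetch_page(page, page_size)` function that returns the parsed 200 response body. That callable and the in-memory stub are assumptions made so the example runs without a network call.

```python
import math
from typing import Callable, Iterator


def iter_evaluations(
    fetch_page: Callable[[int, int], dict], page_size: int = 20
) -> Iterator[dict]:
    """Yield every evaluation by requesting pages until `total` is exhausted."""
    page = 1  # pages are 1-indexed
    while True:
        body = fetch_page(page, page_size)
        yield from body["evaluations"]
        total_pages = math.ceil(body["total"] / body["pageSize"])
        if page >= total_pages:
            break
        page += 1


def fake_fetch(page: int, page_size: int) -> dict:
    """In-memory stand-in for the HTTP call, holding five evaluations."""
    data = [{"evaluationId": str(i)} for i in range(5)]
    start = (page - 1) * page_size
    return {
        "evaluations": data[start:start + page_size],
        "total": len(data),
        "page": page,
        "pageSize": page_size,
    }


all_ids = [e["evaluationId"] for e in iter_evaluations(fake_fetch, page_size=2)]
```

With five items and a page size of 2, the loop requests three pages and then stops.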

400

Bad Request - Validation error or invalid input

object

error

required

string

code

string

details

nullable

retryable

boolean

401

Unauthorized - Authentication required or invalid token

object

error

required

string

code

string

details

nullable

retryable

boolean

403

Forbidden - Insufficient permissions

object

error

required

string

code

string

details

nullable

retryable

boolean
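The 400, 401, and 403 responses share one body shape: a required `error` string plus optional `code`, `details`, and `retryable` fields. A client could normalize them as sketched below. The `ApiError` class is illustrative, and treating a missing `retryable` as `False` is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiError:
    status: int
    error: str
    code: Optional[str] = None
    details: Any = None
    retryable: bool = False


def parse_error(status: int, body: dict) -> ApiError:
    """Convert an error response body into an ApiError record."""
    return ApiError(
        status=status,
        error=body["error"],  # the only required field
        code=body.get("code"),
        details=body.get("details"),
        # Assumption: an absent `retryable` flag means "do not retry".
        retryable=bool(body.get("retryable", False)),
    )
```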
