Get evaluation by ID

GET
/evaluations/{evaluationId}
evaluationId
required

Unique identifier for the evaluation to retrieve

string
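
As a rough illustration of the path parameter above, the request could be built as follows. This is a minimal sketch: `BASE_URL` is a placeholder host, `build_get_evaluation_request` is a hypothetical helper name, and the Bearer-token header is an assumption, since this page does not state the host or the authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL; this page does not state the real host.
BASE_URL = "https://api.example.com"


def build_get_evaluation_request(evaluation_id: str, token: str) -> Request:
    """Build a GET request for /evaluations/{evaluationId}."""
    if not evaluation_id:
        raise ValueError("evaluationId is required")
    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{BASE_URL}/evaluations/{quote(evaluation_id, safe='')}"
    return Request(
        url,
        method="GET",
        headers={
            "Accept": "application/json",
            # Assumption: Bearer-token auth. The page only says a token is required.
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response body is expected to be the JSON object described in the response schema.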

Evaluation found

object
evaluationId
required

Unique identifier for the evaluation (nanoid format)

string
teamId
required

ID of the team that owns this evaluation

string
pipelineId
required

ID of the pipeline being evaluated

string
pipelineConfigurationId
required

Pipeline configuration snapshot at evaluation time

string
exampleSetId
required

ID of the example set used for evaluation

string
exampleSetConfigurationId
required

Example set configuration snapshot at evaluation time

string
evaluatorType
required

Evaluation method: ‘llm_judge’, ‘exact_match’, or ‘semantic’

string
Allowed values: llm_judge exact_match semantic
evaluatorConfig

Evaluator-specific configuration

nullable
mappingType
required

How inputs/outputs are mapped: ‘explicit’

string
Allowed values: explicit
mappingConfig
required

Slot-to-slot mappings between example set and pipeline

object
inputMappings
required

Maps example input slots to pipeline input slots

Array<object>
object
exampleSlotId
required

Slot ID from the example set schema

string
pipelineSlotId
required

Slot ID from the pipeline schema

string
outputMappings
required

Maps example output slots to pipeline output slots for comparison

Array<object>
object
exampleSlotId
required

Slot ID from the example set schema

string
pipelineSlotId
required

Slot ID from the pipeline schema

string
status
required

Status: ‘pending’, ‘running’, ‘succeeded’, ‘failed’, or ‘cancelled’

string
Allowed values: pending running succeeded failed cancelled
totalExamples
required

Total number of examples in the evaluation

number
nullable
passedCount
required

Number of passed examples

number
nullable
failedCount
required

Number of failed examples

number
nullable
errorCount
required

Number of examples with execution errors

number
nullable
skippedCount
required

Number of skipped examples

number
nullable
aggregateScore
required

Overall evaluation score (0.0 to 1.0)

number
nullable
summary
required

LLM-generated summary (deferred feature)

string
nullable
jobId
required

ID of the job processing this evaluation

string
nullable
errorMessage
required

Error message if evaluation failed

string
nullable
startedAt
required

When evaluation started (ISO 8601)

string format: date-time
nullable
completedAt
required

When evaluation completed (ISO 8601)

string format: date-time
nullable
createdAt
required

When the evaluation was created (ISO 8601)

string format: date-time
createdBy
required

ID of the user who triggered the evaluation

string
nullable
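
Because most result fields in this object are nullable, a client probably needs to guard against `None` before doing arithmetic on them. The sketch below shows one way to read a decoded response. Treating `succeeded`, `failed`, and `cancelled` as terminal statuses is an assumption based on their names, and `pass_rate` is a hypothetical derived value, not the API's `aggregateScore`.

```python
from typing import Any, Optional

# Assumption: these statuses mean the evaluation will not change further.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def is_finished(evaluation: dict[str, Any]) -> bool:
    """Return True once the evaluation has reached a terminal status."""
    return evaluation["status"] in TERMINAL_STATUSES


def pass_rate(evaluation: dict[str, Any]) -> Optional[float]:
    """Return passedCount / totalExamples, or None if either is missing or zero."""
    passed = evaluation.get("passedCount")
    total = evaluation.get("totalExamples")
    if passed is None or not total:
        return None
    return passed / total
```

For example, a response with `status` set to `running` and null counts yields `is_finished(...) == False` and `pass_rate(...) is None`, so a caller can keep waiting without special-casing the missing numbers.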

Bad Request - Validation error or invalid input

object
error
required
string
code
string
details
nullable
retryable
boolean

Unauthorized - Authentication required or invalid token

object
error
required
string
code
string
details
nullable
retryable
boolean

Forbidden - Insufficient permissions

object
error
required
string
code
string
details
nullable
retryable
boolean

Not Found - Resource does not exist

object
error
required
string
code
string
details
nullable
retryable
boolean
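
All four error responses share the same body shape, so a single parser can cover them. The following is a minimal sketch under that reading: `ApiError` and `parse_error` are hypothetical names, only `error` is treated as required, and a missing `retryable` flag is assumed to mean "do not retry".

```python
import json
from typing import Any, Optional


class ApiError(Exception):
    """Error body returned by the endpoint, plus the HTTP status code."""

    def __init__(
        self,
        status: int,
        error: str,
        code: Optional[str] = None,
        details: Any = None,
        retryable: bool = False,
    ) -> None:
        super().__init__(f"{status}: {error}")
        self.status = status
        self.error = error
        self.code = code
        self.details = details
        self.retryable = retryable


def parse_error(status: int, body: bytes) -> ApiError:
    """Decode a JSON error body into an ApiError."""
    payload = json.loads(body)
    return ApiError(
        status=status,
        error=payload["error"],  # the only field marked required
        code=payload.get("code"),
        details=payload.get("details"),
        # Assumption: an absent flag is treated as not retryable.
        retryable=bool(payload.get("retryable", False)),
    )
```

A caller could raise the returned exception and consult `retryable` before deciding whether to repeat the request.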