Promoting Executions to Examples

This guide explains how to build high-quality example sets by promoting successful pipeline executions to ground truth examples.

Promoting executions offers several advantages:

  • Real-world data - Examples reflect actual production usage
  • Verified outputs - Only promote executions you’ve confirmed are correct
  • Efficient workflow - Build examples during normal review processes
  • Natural coverage - Organically captures the diversity of your use cases

Before you begin, you need:

  • A pipeline with execution history
  • A review process where experts verify outputs

The overall flow is:

Execute Pipeline → Expert Reviews → Verified Correct? → Promote to Example

Find executions that represent correct behavior:

List successful executions

curl "https://api.catalyzed.ai/pipeline-executions?pipelineId=$PIPELINE_ID&status=succeeded" \
  -H "Authorization: Bearer $API_TOKEN"
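
The same query can be issued from TypeScript. This is a minimal sketch that assumes only the endpoint and the two query parameters from the curl command above; the helper names `buildSucceededExecutionsUrl` and `listSucceededExecutions` are hypothetical, and because this guide does not show the response shape, the result is returned untyped.

```typescript
// Hypothetical helpers: only the endpoint, pipelineId, and status come from this guide.
const API_BASE = "https://api.catalyzed.ai";

// Builds the list URL, URL-encoding the pipeline ID.
function buildSucceededExecutionsUrl(pipelineId: string): string {
  const params = new URLSearchParams({ pipelineId, status: "succeeded" });
  return `${API_BASE}/pipeline-executions?${params.toString()}`;
}

// Fetches succeeded executions. The response shape is not documented here,
// so the parsed JSON is returned as unknown for the caller to inspect.
async function listSucceededExecutions(
  pipelineId: string,
  apiToken: string
): Promise<unknown> {
  const response = await fetch(buildSucceededExecutionsUrl(pipelineId), {
    headers: { Authorization: `Bearer ${apiToken}` },
  });
  if (!response.ok) {
    throw new Error(`Listing executions failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the URL construction in its own function makes it easy to check the query string without calling the API.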

If you already have an example set:

Promote to existing set

curl -X POST "https://api.catalyzed.ai/pipeline-executions/$EXECUTION_ID/promote-to-example" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "existingExampleSetId": "KjR8I6rHBms3W4Qfa2-FN",
    "exampleName": "Financial report - Q4 2024",
    "rationale": "Correctly captured all key metrics in expected format"
  }'

If this is your first example or you want a new set:

Promote with new set

curl -X POST "https://api.catalyzed.ai/pipeline-executions/$EXECUTION_ID/promote-to-example" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "newExampleSet": {
      "name": "Production Examples - Document Summarizer",
      "description": "Ground truth from verified production executions"
    },
    "exampleName": "First production example",
    "rationale": "Verified by senior analyst"
  }'

Response:

{
  "example": {
    "exampleId": "ExR8I6rHBms3W4Qfa2-FN",
    "exampleSetId": "KjR8I6rHBms3W4Qfa2-FN",
    "name": "Financial report - Q4 2024",
    "input": { ... },
    "expectedOutput": { ... },
    "rationale": "Correctly captured all key metrics in expected format"
  },
  "exampleSet": {
    "exampleSetId": "KjR8I6rHBms3W4Qfa2-FN",
    "name": "Production Examples - Document Summarizer"
  },
  "exampleSetCreated": true
}

Sometimes the execution output is close but not perfect. In that case, include expectedOutput in the request body to override the output stored with the example:

{
  "existingExampleSetId": "KjR8I6rHBms3W4Qfa2-FN",
  "exampleName": "Financial report - corrected",
  "expectedOutput": {
    "summary": "Q4 2024: Revenue up 15% YoY ($2.3M), stable churn at 3.2%."
  },
  "rationale": "Corrected missing churn metric from original output"
}
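
The request bodies shown so far differ only in which example set they target and whether they override the output. A small builder can make that explicit. This is a sketch under stated assumptions: the type and function names are hypothetical, and treating exampleName, rationale, and expectedOutput as optional is an assumption, since this guide does not say which fields the API requires.

```typescript
// Hypothetical types mirroring the request fields shown in this guide.
type ExampleSetTarget =
  | { existingExampleSetId: string }
  | { newExampleSet: { name: string; description?: string } };

interface PromoteOptions {
  exampleName?: string;
  rationale?: string;
  // Set only when the execution output needs correcting.
  expectedOutput?: Record<string, unknown>;
}

// Serializes a promote-to-example request body.
// JSON.stringify omits properties whose value is undefined,
// so unset options never reach the API.
function buildPromoteBody(
  target: ExampleSetTarget,
  options: PromoteOptions = {}
): string {
  return JSON.stringify({ ...target, ...options });
}

// Usage: promote into an existing set with a corrected output.
const correctedBody = buildPromoteBody(
  { existingExampleSetId: "KjR8I6rHBms3W4Qfa2-FN" },
  {
    exampleName: "Financial report - corrected",
    expectedOutput: { summary: "Q4 2024: Revenue up 15% YoY ($2.3M), stable churn at 3.2%." },
    rationale: "Corrected missing churn metric from original output",
  }
);
```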

To promote multiple executions at once:

// Assumes the API token is provided in the API_TOKEN environment variable
const apiToken = process.env.API_TOKEN;

// Promotes executions one at a time and records a per-execution result,
// so a single failure does not abort the rest of the batch
async function batchPromote(executionIds: string[], exampleSetId: string) {
  const results = [];
  for (const executionId of executionIds) {
    try {
      const response = await fetch(
        `https://api.catalyzed.ai/pipeline-executions/${executionId}/promote-to-example`,
        {
          method: "POST",
          headers: {
            Authorization: `Bearer ${apiToken}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({
            existingExampleSetId: exampleSetId,
            rationale: "Batch promoted - verified correct",
          }),
        }
      );
      if (response.ok) {
        const result = await response.json();
        results.push({ executionId, success: true, exampleId: result.example.exampleId });
      } else {
        const error = await response.json();
        results.push({ executionId, success: false, error: error.message });
      }
    } catch (error) {
      // Network failures and invalid JSON responses end up here
      results.push({ executionId, success: false, error: String(error) });
    }
  }
  return results;
}

// Usage
const exampleSetId = "KjR8I6rHBms3W4Qfa2-FN"; // ID of an existing example set
const verifiedExecutions = ["exec1", "exec2", "exec3"];
const results = await batchPromote(verifiedExecutions, exampleSetId);
console.log(`Promoted ${results.filter(r => r.success).length}/${results.length} executions`);

Create a clear workflow for reviewing and promoting:

1. Expert reviews execution output
2. Marks as "correct" or "needs improvement"
3. If correct → Promote to example with rationale
4. If incorrect → Create signal with feedback
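
The decision in steps 2 to 4 can be sketched as a small pure function. The names `Verdict`, `NextAction`, and `nextAction` are hypothetical, and requiring a non-empty note is an assumption that follows this guide's advice to always record a rationale. The sketch only decides what to do; it does not call the API, and signal creation is not shown in this guide.

```typescript
// Hypothetical types modelling the review workflow above.
type Verdict = "correct" | "needs improvement";

type NextAction =
  | { kind: "promote"; rationale: string } // step 3: promote with rationale
  | { kind: "signal"; feedback: string };  // step 4: create signal with feedback

// Maps an expert's verdict and note to the follow-up action.
// Throws when the note is blank, so every decision stays explainable.
function nextAction(verdict: Verdict, note: string): NextAction {
  const trimmed = note.trim();
  if (trimmed === "") {
    throw new Error("A rationale or feedback note is required");
  }
  return verdict === "correct"
    ? { kind: "promote", rationale: trimmed }
    : { kind: "signal", feedback: trimmed };
}
```

Separating the decision from the API calls keeps the review rules easy to check on their own.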

Good example names help identify issues later:

Bad                   Good
Example 1             Financial report - Q4 2024 earnings
Test case             Edge case - empty document handling
Promoted execution    Customer support ticket - billing inquiry

Always include why the output is correct:

{
  "rationale": "Correctly identified all 3 key action items. Used proper formatting with bullet points. Excluded irrelevant meeting chitchat."
}

Actively seek diverse examples:

  • Different input types - Various document formats, lengths
  • Edge cases - Empty inputs, unusual formats
  • Error handling - Graceful failures
  • Boundary conditions - Maximum/minimum values

When batch promoting, spot-check a sample first:

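// Get a random sample of 5 to review before batch promotion.
// A Fisher-Yates shuffle on a copy gives an unbiased sample and leaves
// `executions` untouched (sorting with a random comparator does neither).
const shuffled = [...executions];
for (let i = shuffled.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1));
  [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
}
const sample = shuffled.slice(0, 5);
for (const exec of sample) {
  console.log("Review this execution:");
  console.log(`Input: ${JSON.stringify(exec.input, null, 2)}`);
  console.log(`Output: ${JSON.stringify(exec.output, null, 2)}`);
  // Prompt for confirmation before including in batch
}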

Common errors when promoting an execution:

Error              Cause                         Solution
409 Conflict       Execution hasn’t succeeded    Only promote succeeded executions
400 Bad Request    Missing required fields       Include either existingExampleSetId or newExampleSet
403 Forbidden      No access to example set      Verify team membership
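
When promoting from code, these status codes can be translated into actionable messages. The following is a minimal sketch: the function name `explainPromotionError` is hypothetical, and the hints restate only the causes and solutions listed above.

```typescript
// Hypothetical helper: maps the HTTP status of a failed promotion
// to the remediation listed in this guide.
function explainPromotionError(status: number): string {
  switch (status) {
    case 400:
      return "Missing required fields: include either existingExampleSetId or newExampleSet.";
    case 403:
      return "No access to the example set: verify team membership.";
    case 409:
      return "Execution has not succeeded: only promote succeeded executions.";
    default:
      // Statuses not covered by this guide are reported as-is.
      return `Unexpected HTTP status ${status}.`;
  }
}
```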
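
Promotion can also be triggered from chat, for example when an expert approves an execution with a Slack reaction:

// When expert reacts with :white_check_mark: to an execution notification.
// promoteExecution, sendSlackMessage, and exampleSetId are assumed to be
// defined elsewhere in your integration.
async function handleApprovalReaction(executionId: string, userId: string) {
  const result = await promoteExecution(executionId, exampleSetId);
  await sendSlackMessage(userId, `Promoted to example: ${result.example.exampleId}`);
}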

Build a simple review UI:

  1. Show recent executions with input/output
  2. “Approve” button → Promote to example
  3. “Reject” button → Create signal with feedback
  4. Track approval rate over time
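
Step 4 can be sketched as a per-day aggregation over review decisions. The `ReviewDecision` shape and the function name are hypothetical; how your review UI stores decisions is not specified in this guide. Days are grouped by UTC calendar date.

```typescript
// Hypothetical record of one Approve/Reject click in the review UI.
interface ReviewDecision {
  executionId: string;
  approved: boolean;
  reviewedAt: Date;
}

// Returns the fraction of approved executions for each UTC day (YYYY-MM-DD).
function approvalRateByDay(decisions: ReviewDecision[]): Map<string, number> {
  const counts = new Map<string, { approved: number; total: number }>();
  for (const decision of decisions) {
    const day = decision.reviewedAt.toISOString().slice(0, 10);
    const entry = counts.get(day) ?? { approved: 0, total: 0 };
    entry.total += 1;
    if (decision.approved) entry.approved += 1;
    counts.set(day, entry);
  }
  const rates = new Map<string, number>();
  counts.forEach((entry, day) => rates.set(day, entry.approved / entry.total));
  return rates;
}
```

A falling approval rate is a hint that pipeline quality is drifting and that recent executions deserve closer review before promotion.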