
Datasets

Datasets are logical containers that group related tables. They help you organize data by domain, project, or use case. A dataset sits between the team and its tables in the following hierarchy:

  • Team: The organizational boundary (billing, access control)
  • Dataset: Groups related tables (e.g., “Sales Data”, “Customer Analytics”)
  • Table: Contains data with a defined schema
  • Rows: Individual records within a table

Create a dataset

curl -X POST https://api.catalyzed.ai/datasets \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"teamId": "ZkoDMyjZZsXo4VAO_nJLk",
"name": "Sales Analytics",
"description": "Sales data for analytics and reporting"
}'

Response:

{
"datasetId": "HoIEJNIPiQIy6TjVRxjwz",
"teamId": "ZkoDMyjZZsXo4VAO_nJLk",
"name": "Sales Analytics",
"description": "Sales data for analytics and reporting",
"tags": [],
"metadata": {},
"managed": false,
"createdAt": "2024-01-15T10:30:00Z",
"updatedAt": "2024-01-15T10:30:00Z"
}
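Later requests need the `datasetId` from this response. The following is a minimal sketch of pulling it out in a shell script, assuming the response has the shape shown above. The helper name `extract_dataset_id` is hypothetical, and the `sed` pattern is only a quick approach for simple string values; a real JSON parser is the safer choice for production scripts.

```shell
# Hypothetical helper: read a JSON response on stdin and print the first
# "datasetId" string value found. Assumes the value contains no escaped quotes.
extract_dataset_id() {
  sed -n 's/.*"datasetId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Demonstration using the sample response from this page (no network call).
sample_response='{"datasetId": "HoIEJNIPiQIy6TjVRxjwz", "teamId": "ZkoDMyjZZsXo4VAO_nJLk", "name": "Sales Analytics"}'
DATASET_ID=$(printf '%s\n' "$sample_response" | extract_dataset_id)
echo "$DATASET_ID"
```

In a real script, the output of the `curl -X POST` call would be piped into the helper in place of `sample_response`.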

List datasets in a team

curl "https://api.catalyzed.ai/datasets?teamIds=ZkoDMyjZZsXo4VAO_nJLk" \
-H "Authorization: Bearer $API_TOKEN"

The list endpoint supports filtering, pagination, and sorting:

Parameter       Type     Description
--------------  -------  -----------------------------------------------
teamIds         string   Comma-separated team IDs to filter by
datasetIds      string   Comma-separated dataset IDs to filter by
name            string   Filter by name (partial match)
tags            string   Comma-separated tags to filter by
managed         boolean  Filter by managed status (true or false)
page            number   Page number (starts at 1, default: 1)
pageSize        number   Results per page (1-100, default: 20)
orderBy         string   Sort by: createdAt, name, updatedAt, description
orderDirection  string   Sort direction: asc or desc

For example, to request the first page of 10 datasets, sorted by name in ascending order:

curl "https://api.catalyzed.ai/datasets?teamIds=ZkoDMyjZZsXo4VAO_nJLk&page=1&pageSize=10&orderBy=name&orderDirection=asc" \
-H "Authorization: Bearer $API_TOKEN"
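When a script issues several list requests, it can help to assemble the query string in one place. The sketch below is one way to do that with the documented parameters. The function name `build_list_url` and its argument order are hypothetical, and it assumes the values are already URL-safe (as the IDs on this page are), so it does no URL encoding.

```shell
# Hypothetical helper: build a list URL from a team ID plus optional paging
# and sorting values.
# Usage: build_list_url TEAM_IDS [PAGE] [PAGE_SIZE] [ORDER_BY] [ORDER_DIRECTION]
build_list_url() {
  team_ids=$1
  page=${2:-1}         # documented default: 1
  page_size=${3:-20}   # documented default: 20
  order_by=${4:-}
  order_direction=${5:-}

  # Reject page sizes outside the documented 1-100 range before calling the API.
  if [ "$page_size" -lt 1 ] || [ "$page_size" -gt 100 ]; then
    echo "pageSize must be between 1 and 100" >&2
    return 1
  fi

  url="https://api.catalyzed.ai/datasets?teamIds=${team_ids}&page=${page}&pageSize=${page_size}"
  # Sorting parameters are appended only when the caller supplies them.
  if [ -n "$order_by" ]; then
    url="${url}&orderBy=${order_by}"
  fi
  if [ -n "$order_direction" ]; then
    url="${url}&orderDirection=${order_direction}"
  fi
  printf '%s\n' "$url"
}

# Example call (the curl line is commented out so the sketch runs offline):
# curl "$(build_list_url ZkoDMyjZZsXo4VAO_nJLk 1 10 name asc)" \
#   -H "Authorization: Bearer $API_TOKEN"
build_list_url ZkoDMyjZZsXo4VAO_nJLk 1 10 name asc
```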

Get dataset by ID

curl https://api.catalyzed.ai/datasets/HoIEJNIPiQIy6TjVRxjwz \
-H "Authorization: Bearer $API_TOKEN"

Update dataset

curl -X PUT https://api.catalyzed.ai/datasets/HoIEJNIPiQIy6TjVRxjwz \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Sales Analytics v2",
"description": "Updated sales data with 2024 metrics"
}'

Delete dataset

curl -X DELETE https://api.catalyzed.ai/datasets/HoIEJNIPiQIy6TjVRxjwz \
-H "Authorization: Bearer $API_TOKEN"

Each dataset object contains the following fields:

Field        Type           Description
-----------  -------------  ------------------------------------------
datasetId    string         Unique identifier
teamId       string         Team that owns this dataset
name         string         Human-readable name (1-255 characters)
description  string | null  Optional description; may be null
tags         string[]       Tags for organization (defaults to [])
metadata     object         Custom key-value metadata (defaults to {})
managed      boolean        Whether the dataset is system-managed
createdAt    string         ISO 8601 timestamp of creation
updatedAt    string         ISO 8601 timestamp of last modification
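The documented 1-255 character limit on `name` can be checked locally before a create or update request is sent. The following is a minimal sketch of such a pre-flight check. The function name `valid_dataset_name` is hypothetical, and the length is whatever the shell reports for the string, which may count bytes instead of characters for multi-byte text in some shells.

```shell
# Hypothetical pre-flight check: succeed only if the name length is within
# the documented 1-255 range.
valid_dataset_name() {
  name_len=${#1}
  [ "$name_len" -ge 1 ] && [ "$name_len" -le 255 ]
}

if valid_dataset_name "Sales Analytics"; then
  echo "name ok"
else
  echo "name must be 1-255 characters" >&2
fi
```

The API remains the authority on what it accepts; a check like this only avoids an unnecessary round trip for obviously invalid names.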

Once you have a dataset, you can create tables within it. See Tables for details on:

  • Defining table schemas
  • Inserting and querying data
  • Managing indexes
  • Schema migrations

You can organize datasets by business domain or by environment.

By business domain: group tables by domain to keep related data together. This approach works well when different teams or functions own distinct data areas.

By environment: use a separate dataset per environment to isolate production data from development and staging. Each environment's dataset contains the same table structure, which makes it easy to promote changes through your deployment pipeline.
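If you split datasets by environment, a consistent naming scheme keeps the parallel datasets easy to tell apart. The sketch below shows one possible convention. The "Name (environment)" format, the three environment names, and the helper `dataset_name_for_env` are all assumptions made for illustration, not something the API requires.

```shell
# Hypothetical naming convention: "<base name> (<environment>)".
dataset_name_for_env() {
  base_name=$1
  environment=$2
  case "$environment" in
    development|staging|production)
      printf '%s (%s)\n' "$base_name" "$environment"
      ;;
    *)
      echo "unknown environment: $environment" >&2
      return 1
      ;;
  esac
}

# Print the dataset name that would be used for each environment.
for environment_name in development staging production; do
  dataset_name_for_env "Sales Analytics" "$environment_name"
done
```

Each generated name could then be passed as the `name` field of a create request, one request per environment.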