Pipelines

Pipelines are automated workflows that transform, enrich, and process your data. They allow you to build reproducible data processing logic that runs on demand or on a schedule.

A pipeline is a sequence of steps that:

  • Read data from one or more sources
  • Apply transformations and business logic
  • Write results to a destination

Pipelines are defined declaratively and executed by Catalyzed’s distributed computation infrastructure.

Pipelines consist of ordered steps, each performing a specific operation:

  • SQL Transforms - Filter, join, and aggregate data using SQL
  • Python Transforms - Custom Python code for complex logic
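This page doesn't show the definition format, but conceptually a pipeline combining both step types might look like the following sketch. All field names here are illustrative assumptions, not Catalyzed's actual schema; see the Pipelines API for the real format.

```yaml
# Hypothetical pipeline definition -- field names are illustrative only.
name: my-pipeline
steps:
  - type: sql              # SQL Transform: filter, join, aggregate
    query: |
      SELECT customer_id, SUM(amount) AS total
      FROM source.orders
      GROUP BY customer_id
  - type: python           # Python Transform: custom logic
    code: |
      def transform(rows):
          # enrich each row with a derived field
          for row in rows:
              row["tier"] = "gold" if row["total"] > 1000 else "standard"
          return rows
```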

Pipelines can be executed:

  • Manually - On-demand via UI or API
  • Scheduled - Run on a cron schedule
  • Event-driven - Triggered by file uploads or other events
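As a rough illustration, the scheduled and event-driven modes might be expressed in a pipeline's configuration like this. The trigger field names below are assumptions for the sketch, not the documented schema:

```yaml
# Hypothetical trigger configuration -- names are illustrative only.
triggers:
  - type: schedule
    cron: "0 2 * * *"      # standard cron syntax: run daily at 02:00
  - type: event
    on: file_upload        # fire when a new file lands in a source
# Manual runs need no trigger entry; use the UI or the trigger endpoint.
```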
Create a pipeline via the API:

```shell
curl -X POST https://api.catalyzed.ai/pipelines \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "teamId": "YOUR_TEAM_ID",
    "name": "my-pipeline",
    "description": "Transform and enrich data"
  }'
```
Trigger a run on demand:

```shell
curl -X POST https://api.catalyzed.ai/pipelines/{pipelineId}/trigger \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```

Track pipeline runs through the executions API:

```shell
curl "https://api.catalyzed.ai/pipeline-executions?pipelineId={pipelineId}" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
```
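Assuming the executions endpoint returns a JSON array of run objects that include a `status` field (a hypothetical response shape; check the Pipelines API reference for the actual fields), the response can be filtered locally with `jq`:

```shell
# Hypothetical example response from the executions endpoint;
# the actual field names may differ -- see the Pipelines API docs.
response='[
  {"id": "exec-1", "status": "succeeded", "startedAt": "2024-05-01T10:00:00Z"},
  {"id": "exec-2", "status": "running",   "startedAt": "2024-05-01T11:00:00Z"}
]'

# Count runs that are still in progress
printf '%s' "$response" | jq '[.[] | select(.status == "running")] | length'
# → 1
```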

See the Pipelines API for complete endpoint documentation.