Appearance
Pipelines: How to choose a source
A pipeline in Agentcy is a long-lived configuration that streams records into the Context Engine. Every pipeline has a source (where records come from) and a target table (context_nodes, context_edges, or context_events). Records are written through the same engine the rest of the platform reads from, so they show up in the Explore page, in agent tool calls, and in structured queries the moment they land.
This page is the decision guide. The deep dives are linked at the bottom of each section.
Pick a source
| You want to … | Pick this | Latency | Setup |
|---|---|---|---|
| POST events from your service or a log shipper (Vector, Fluent Bit, Datadog Agent's HTTP output, Loki Promtail, your own backend) | HTTP Webhook | seconds | issue a token, copy the URL |
| Drop newline-delimited JSON files into a bucket and have them ingested automatically | File Drop | minutes (poll interval) | configure bucket + prefix |
| Already have a connector configured (GitHub, AWS, Slack, …) and want its scheduled sync to flow through the pipelines runner so its runs are visible in Explore | Connector source | per connector schedule | reference the connector ID |
| Send OpenTelemetry signals (logs / metrics / traces) | Telemetry — coming soon | seconds | OTLP endpoint per pipeline |
| Stream from Kafka topics | Telemetry — coming soon | seconds | topic → pipeline binding |
If your data source isn't listed, HTTP Webhook is almost always the right answer. It accepts NDJSON over plain POST and works with every log shipper, application framework, and bash script ever written.
Anatomy of a pipeline
┌─────────────────┐ POST/sync ┌──────────────┐ ingest_records ┌────────────────┐
│ Your source │ ─────────────────►│ Pipeline │ ────────────────────►│ Context Engine │
│ (app / file / │ │ (runner) │ │ (Basic / │
│ connector) │ │ │ │ Advanced) │
└─────────────────┘ └──────┬───────┘ └────────────────┘
│
▼
run history,
row counts,
errorsEvery record pipelines write inherits the pipeline's realm (data namespace) and target_table (which physical table it lands in). Pipeline runs are recorded with rows_in, rows_out, status, and error so you can see end-to-end ingestion health on the Pipelines tab in Explore.
What lives where
- Frontend — Explore → Pipelines tab (
/explore?tab=pipelines) - REST API —
/api/v1/context/pipelines/* - Public ingest —
POST /api/v1/context/ingest/webhook/{token}(no JWT — token is the auth) - Crate —
agentcy-pipelines(config + repo + runner + filedrop watcher) - Migration —
052_context_pipelines.sql
Two engines, same surface
Pipelines work on both the Basic and Advanced Context Engines. The runner calls ContextEngine::ingest_records() which the active provider implements:
- Basic — translates each NDJSON record to a graph node/edge/event and writes through the embedded graph store.
- Advanced — POSTs the batch to kyma's
/v1/ingest(NDJSON, group-commit + idempotency ledger). Inherits kyma's exact-once semantics for free.
The pipeline's wire shape and UX are identical on both. Where realm and dedupe are enforced differs, but you don't have to think about it.
Roadmap
| Source | Status |
|---|---|
| HTTP Webhook | ✅ Shipped |
| Object-store filedrop | ✅ Shipped |
| Connector wrap | ✅ Shipped |
| OTLP gRPC (logs) | ✅ Native — every Cloud / Advanced instance exposes a TCP-proxied OTLP endpoint. See the telemetry guide. |
| OTLP gRPC (metrics, traces) | 🟡 On the kyma roadmap — receiver work-in-progress upstream. |
| Kafka consumer groups | ✅ Native (env-configured topic mapping per instance). |
| Per-pipeline OTLP routing | 🟡 Workarounds today; native x-pipeline-token header on the kyma roadmap. |
| Server-Sent Events / WebSocket push | 🔴 Considered |