Pipelines: How to choose a source

A pipeline in Agentcy is a long-lived configuration that streams records into the Context Engine. Every pipeline has a source (where records come from) and a target table (context_nodes, context_edges, or context_events). Records are written through the same engine the rest of the platform reads from, so they show up in the Explore page, in agent tool calls, and in structured queries the moment they land.

This page is the decision guide. The deep dives are linked at the bottom of each section.

Pick a source

You want to …	Pick this	Latency	Setup
POST events from your service or a log shipper (Vector, Fluent Bit, Datadog Agent's HTTP output, Loki Promtail, your own backend)	HTTP Webhook	seconds	issue a token, copy the URL
Drop newline-delimited JSON files into a bucket and have them ingested automatically	File Drop	minutes (poll interval)	configure bucket + prefix
Already have a connector configured (GitHub, AWS, Slack, …) and want its scheduled sync to flow through the pipelines runner so its runs are visible in Explore	Connector source	per connector schedule	reference the connector ID
Send OpenTelemetry signals (logs / metrics / traces)	Telemetry — coming soon	seconds	OTLP endpoint per pipeline
Stream from Kafka topics	Telemetry — coming soon	seconds	topic → pipeline binding

If your data source isn't listed, HTTP Webhook is almost always the right answer. It accepts NDJSON over plain POST and works with every log shipper, application framework, and bash script ever written.

Anatomy of a pipeline

┌─────────────────┐     POST/sync     ┌──────────────┐    ingest_records    ┌────────────────┐
│   Your source   │ ─────────────────►│   Pipeline   │ ────────────────────►│ Context Engine │
│ (app / file /   │                   │   (runner)   │                      │  (Basic /      │
│  connector)     │                   │              │                      │   Advanced)    │
└─────────────────┘                   └──────┬───────┘                      └────────────────┘
                                             │
                                             ▼
                                       run history,
                                       row counts,
                                       errors

Every record pipelines write inherits the pipeline's realm (data namespace) and target_table (which physical table it lands in). Pipeline runs are recorded with rows_in, rows_out, status, and error so you can see end-to-end ingestion health on the Pipelines tab in Explore.

What lives where

Frontend — Explore → Pipelines tab (/explore?tab=pipelines)
REST API — /api/v1/context/pipelines/*
Public ingest — POST /api/v1/context/ingest/webhook/{token} (no JWT — token is the auth)
Crate — agentcy-pipelines (config + repo + runner + filedrop watcher)
Migration — 052_context_pipelines.sql

Two engines, same surface

Pipelines work on both the Basic and Advanced Context Engines. The runner calls ContextEngine::ingest_records() which the active provider implements:

Basic — translates each NDJSON record to a graph node/edge/event and writes through the embedded graph store.
Advanced — POSTs the batch to kyma's /v1/ingest (NDJSON, group-commit + idempotency ledger). Inherits kyma's exact-once semantics for free.

The pipeline's wire shape and UX are identical on both. Where realm and dedupe are enforced differs, but you don't have to think about it.

Roadmap

Source	Status
HTTP Webhook	✅ Shipped
Object-store filedrop	✅ Shipped
Connector wrap	✅ Shipped
OTLP gRPC (logs)	✅ Native — every Cloud / Advanced instance exposes a TCP-proxied OTLP endpoint. See the telemetry guide.
OTLP gRPC (metrics, traces)	🟡 On the kyma roadmap — receiver work-in-progress upstream.
Kafka consumer groups	✅ Native (env-configured topic mapping per instance).
Per-pipeline OTLP routing	🟡 Workarounds today; native `x-pipeline-token` header on the kyma roadmap.
Server-Sent Events / WebSocket push	🔴 Considered

Pipelines: How to choose a source ​

Pick a source ​

Anatomy of a pipeline ​

What lives where ​

Two engines, same surface ​

Roadmap ​