AWS — EKS
Use this when your org has standardized on Kubernetes. The base setup follows the generic Kubernetes guide; this page covers the AWS-specific extras: ALB ingress, IRSA, External DNS, EBS-backed PVs, and how to wire managed databases.
Reference Architecture
Cluster Prerequisites
These add-ons make life easier — install them once per cluster:
| Add-on | Purpose | Install |
|---|---|---|
| AWS Load Balancer Controller | Provisions ALBs from Ingress resources | helm install ... aws-load-balancer-controller |
| EBS CSI Driver | Backs PVCs for self-hosted Postgres / Neo4j / Redis | EKS managed add-on |
| External DNS | Auto-creates Route 53 records from Ingress | helm install external-dns |
| External Secrets Operator | Pulls secrets from AWS Secrets Manager | helm install external-secrets |
| cert-manager | TLS certs for any ingress not on ALB | helm install cert-manager |
| Karpenter (optional) | Bin-packs nodes for cost | helm install karpenter |
Service Account → IAM (IRSA)
Agentcy uses IRSA so the API pod can read secrets and (optionally) talk to AWS-only connectors (S3, Bedrock, etc.) without long-lived keys.
```bash
eksctl create iamserviceaccount \
  --cluster agentcy \
  --namespace agentcy \
  --name agentcy-api \
  --attach-policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite \
  --approve
```

For finer-grained access, attach a custom policy listing the exact secret ARNs and any connector resources.
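As a sketch of that finer-grained option, eksctl also accepts a config file in which the service account carries an inline policy. The file name, the account ID, and the agentcy/* secret prefix are placeholders (the prefix is inferred from the secret names used in the Helm values on this page); swap in your own ARNs.

```yaml
# irsa.yaml (hypothetical file name)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: agentcy
  region: us-east-1
iam:
  withOIDC: true            # IRSA relies on the cluster's OIDC provider
  serviceAccounts:
    - metadata:
        name: agentcy-api
        namespace: agentcy
      attachPolicy:          # inline policy instead of SecretsManagerReadWrite
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - secretsmanager:GetSecretValue
              - secretsmanager:DescribeSecret
            Resource:
              # placeholder account ID; the trailing wildcard also covers the
              # random suffix Secrets Manager appends to secret ARNs
              - arn:aws:secretsmanager:us-east-1:123:secret:agentcy/*
```

Running eksctl create iamserviceaccount --config-file irsa.yaml --approve should then take the place of the flag-based command.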
Managed Databases
RDS Postgres + pgvector
Provision RDS Postgres 16 in a private subnet group. After creation:
```sql
-- via psql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```

Open port 5432 to the EKS node security group, or add an inbound rule on the RDS security group that references the security group EKS attaches to the nodes.
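If the security groups are managed as code, the rule can be written as a single ingress resource. The CloudFormation fragment below is only an illustration (this page does not prescribe a tool), and both parameter names are hypothetical.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  RdsSecurityGroupId:         # hypothetical: SG attached to the RDS instance
    Type: AWS::EC2::SecurityGroup::Id
  EksNodeSecurityGroupId:     # hypothetical: node (or cluster) SG the pods egress from
    Type: AWS::EC2::SecurityGroup::Id
Resources:
  PostgresFromEks:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref RdsSecurityGroupId
      SourceSecurityGroupId: !Ref EksNodeSecurityGroupId
      IpProtocol: tcp
      FromPort: 5432
      ToPort: 5432
      Description: Postgres from EKS nodes
```

Referencing the source security group rather than a CIDR range keeps the rule valid as nodes are replaced.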
ElastiCache Redis
A cache.t4g.small single-node cluster is plenty for Team-tier traffic. For HA pick a Multi-AZ replication group.
Context Engine — Basic (Neo4j / AuraDB)
Create an AuraDB project and instance in a region peered to your VPC. Pass the neo4j+s:// URI through to the API pods.
If you must keep the Basic graph in-cluster, the chart's bundled Neo4j subchart supports EBS-backed persistence. Don't run Community-edition Neo4j across multiple replicas — it isn't designed for that.
Context Engine — Advanced (kyma)
The Helm chart ships a kyma subchart you enable with kyma.enabled: true. It runs as a StatefulSet (or Deployment, since compute is stateless) and writes Arrow extents to an S3 bucket that the chart can provision or that you create out-of-band. Pods reach the bucket through the IRSA role set in objectStore.irsaRoleArn.
```yaml
contextEngine: advanced
kyma:
  enabled: true
  image:
    repository: ghcr.io/agentcylabs/kyma
    tag: latest
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi
  objectStore:
    type: s3
    bucket: agentcy-kyma-prod
    region: us-east-1
    irsaRoleArn: arn:aws:iam::123:role/agentcy-kyma-s3
  catalog:
    # kyma catalog lives in the same RDS instance, separate database
    databaseSecretRef:
      name: agentcy-kyma-catalog
      key: url
  otlp:
    enabled: true
    nlb: true   # exposes :4317 via an NLB so emitters can ship OTLP directly
```

The API pod gets CONTEXT_ENGINE=advanced and points at http://kyma:8080. Reads stream over Arrow Flight gRPC; KQL/SQL/Cypher all work against the same engine. See getkyma.dev for OTLP/Kafka/file-drop ingest paths.
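The catalog.databaseSecretRef setting in the values above expects a Kubernetes Secret named agentcy-kyma-catalog with a url key. One way to produce it, assuming External Secrets Operator and the aws-secrets ClusterSecretStore described in the Helm Values section of this page, is an ExternalSecret along these lines. The Secrets Manager entry name agentcy/kyma-catalog-url is hypothetical.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: agentcy-kyma-catalog
  namespace: agentcy
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
    kind: ClusterSecretStore
  target:
    name: agentcy-kyma-catalog          # Secret name referenced by databaseSecretRef.name
  data:
    - secretKey: url                    # key referenced by databaseSecretRef.key
      remoteRef:
        key: agentcy/kyma-catalog-url   # hypothetical Secrets Manager entry
```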
Helm Values for EKS
```yaml
ingress:
  enabled: true
  className: alb
  annotations:
    # deprecated annotation; className above is the current form
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443},{"HTTP":80}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123:certificate/xxxxx
    external-dns.alpha.kubernetes.io/hostname: agentcy.example.com
  hosts:
    - host: agentcy.example.com
      paths:
        - path: /api
          backend: api
        - path: /
          backend: frontend

api:
  serviceAccount:
    create: false
    name: agentcy-api   # IRSA-bound SA from eksctl above
  podAnnotations:
    eks.amazonaws.com/skip-containers: "envoy"

# Bundled databases off; using managed services
postgresql: { enabled: false }
neo4j: { enabled: false }
redis: { enabled: false }

externalSecrets:
  enabled: true
  secretStoreRef:
    name: aws-secrets
    kind: ClusterSecretStore
  remoteSecrets:
    databaseUrl:   { remoteRef: agentcy/postgres-url }
    redisUrl:      { remoteRef: agentcy/redis-url }
    neo4jUri:      { remoteRef: agentcy/neo4j, property: uri }
    neo4jUser:     { remoteRef: agentcy/neo4j, property: username }
    neo4jPassword: { remoteRef: agentcy/neo4j, property: password }
    llmApiKey:     { remoteRef: agentcy/llm-api-key }
    jwtSecret:     { remoteRef: agentcy/jwt }
```

Define the ClusterSecretStore once per cluster:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
```

Install
```bash
helm repo add agentcy https://charts.agentcylabs.com
helm repo update
kubectl create namespace agentcy
helm install agentcy agentcy/agentcy \
  --namespace agentcy \
  --values eks-values.yaml
```

Auto-scaling
The chart enables HPA on the API by default. On EKS, add Karpenter (or the Cluster Autoscaler) so new nodes appear when HPA scales up:
```yaml
api:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 12
    targetCPUUtilizationPercentage: 65
    targetMemoryUtilizationPercentage: 75
```

TLS, Domains, Multi-tenant
- One domain per environment: simplest. Set ingress.hosts[0].host and let External DNS create the Route 53 record.
- Wildcard for white-label: use an ACM wildcard cert (*.app.example.com), set tls.hosts: ["*.app.example.com"], and the API's tenant_resolver matches the host header.
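For the wildcard case, the overrides might look like the sketch below. It reuses the ingress keys from the Helm values shown earlier; the placement of tls.hosts as a top-level key is an assumption taken from how the bullet spells it, and the certificate ARN is a placeholder, so check both against the chart's values reference.

```yaml
ingress:
  annotations:
    # ACM wildcard certificate covering *.app.example.com (placeholder ARN)
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123:certificate/xxxxx
    external-dns.alpha.kubernetes.io/hostname: "*.app.example.com"
  hosts:
    - host: "*.app.example.com"
      paths:
        - path: /api
          backend: api
        - path: /
          backend: frontend
tls:
  hosts: ["*.app.example.com"]   # assumed top-level key; nesting not confirmed by this page
```

Quoting the wildcard matters: a bare leading asterisk is read by YAML as an alias and fails to parse.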
CI/CD
GitOps with Argo CD or Flux pointed at a manifests repo is the cleanest pattern. Generate manifests via helm template and commit them, or let Argo CD render the chart through its native Helm support.
For pull-based image updates, use Flux Image Automation or Keel. Flux commits the new image tag to your manifests repo; Keel updates the running workload directly when a new version is pushed.
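As an illustration of the Argo CD route, an Application can point straight at the chart repository and carry the EKS overrides inline. This is a minimal sketch: the chart version is hypothetical, the argocd namespace assumes a default Argo CD install, and only a fragment of the values is shown.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: agentcy
  namespace: argocd              # assumes Argo CD runs in its default namespace
spec:
  project: default
  source:
    repoURL: https://charts.agentcylabs.com
    chart: agentcy
    targetRevision: 1.0.0        # hypothetical chart version; pin to a real release
    helm:
      values: |
        ingress:
          enabled: true
          className: alb
  destination:
    server: https://kubernetes.default.svc
    namespace: agentcy
  syncPolicy:
    automated:
      prune: true                # remove resources dropped from the chart
      selfHeal: true             # revert manual drift in the cluster
    syncOptions:
      - CreateNamespace=true
```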
Cost Notes
| Component | Indicative monthly cost (us-east-1) |
|---|---|
| EKS control plane | $73 |
| 3 × m6g.large nodes (24/7) | ~$135 |
| ALB | ~$20 |
| RDS db.t4g.medium | ~$60 |
| ElastiCache cache.t4g.small | ~$25 |
| AuraDB Pro (8 GB) | ~$65 |
| Floor | ~$380/mo |
Karpenter typically cuts node cost 30–50% for bursty agent workloads.
Troubleshooting
Ingress provisions but ALB target health is unhealthy
The AWS LB Controller hits the pod IPs directly via target-type: ip. Confirm the pod readiness probe passes (kubectl get pods -n agentcy) and that the node SG allows traffic from the ALB SG.
403 from EBS CSI when creating PVCs
Attach the AmazonEBSCSIDriverPolicy to the node group's IAM role, or install the EBS CSI as a managed add-on with its own IRSA role.
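If the cluster is managed with eksctl, the managed add-on and its IRSA role can be declared together. The fragment below is a sketch under that assumption; the cluster name and region follow the examples on this page.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: agentcy
  region: us-east-1
iam:
  withOIDC: true                 # required for the add-on's IRSA role
addons:
  - name: aws-ebs-csi-driver
    wellKnownPolicies:
      ebsCSIController: true     # eksctl creates a role with the EBS CSI permissions
```

Saved under a file name of your choice, it can be applied with eksctl create addon --config-file followed by that file.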
Pod has unbound immediate PersistentVolumeClaims
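Set a storageClassName that exists in the cluster (e.g., gp3). EKS clusters may not have a default StorageClass (recent versions no longer mark the built-in gp2 class as default), and a gp3 class exists only if you or an add-on created it.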
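A gp3 class backed by the EBS CSI driver can be defined as follows. The name gp3 matches the example above; marking it as the cluster default is optional and shown here as one possible choice.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # optional: PVCs without a class use this one
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer   # create the volume in the AZ where the pod is scheduled
allowVolumeExpansion: true
```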
Secrets Manager AccessDenied
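The agentcy-api ServiceAccount needs secretsmanager:GetSecretValue on the specific secret ARNs. Check the API pod logs with kubectl logs, and kubectl describe externalsecret if the secrets are synced through External Secrets Operator; the AWS error names the secret it failed to read.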
Next Steps
- Architecture & Tech Stack — what each managed service replaces and why
- Kubernetes — generic — values reference
- AWS ECS — if EKS is overkill