Google Cloud Platform
Two paths on GCP:
| Path | When to pick it |
|---|---|
| Cloud Run + managed data | Smallest ops surface. Frontend and API run as Cloud Run services. Cold-start is acceptable for internal tools. |
| GKE + Helm | When you need pods, sidecars, finer-grained scaling, or you already run a GKE platform. |
Architecture — Cloud Run
Provision Data Layer
bash
# Cloud SQL Postgres 16 with pgvector flag
gcloud sql instances create agentcy-pg \
--database-version=POSTGRES_16 \
--tier=db-custom-2-8192 \
--region=us-central1 \
--network=default \
--no-assign-ip \
--database-flags=cloudsql.enable_pgvector=on
gcloud sql databases create agentcy --instance=agentcy-pg
gcloud sql users create agentcy --instance=agentcy-pg --password='replace-me'
# Then via psql:
# CREATE EXTENSION vector;
# CREATE EXTENSION pg_trgm;
# Memorystore Redis
gcloud redis instances create agentcy-redis \
--size=1 --region=us-central1 \
--redis-version=redis_7_0 \
--network=default
Choose a Context Engine
The default Cloud Run setup runs the Basic Context Engine against an external AuraDB instance — pass the neo4j+s:// URI as NEO4J_URI. Free tier covers most teams; upgrade to AuraDB Pro (with a GCP-region peer for low latency) when you outgrow it.
For the Advanced Context Engine (KQL / SQL / time-series, columnar storage), deploy kyma as a third Cloud Run service:
bash
gcloud run deploy kyma \
--image=ghcr.io/agentcylabs/kyma:latest \
--region=us-central1 \
--no-allow-unauthenticated \
--vpc-connector=agentcy-vpc \
--service-account=agentcy-kyma@PROJECT.iam.gserviceaccount.com \
--cpu=2 --memory=4Gi --concurrency=40 --min-instances=1 --max-instances=4 \
--set-env-vars=KYMA_OBJECT_STORE_URL=gs://agentcy-kyma-prod,KYMA_OTLP_ADDR=0.0.0.0:4317 \
--update-secrets=KYMA_CATALOG_DATABASE_URL=agentcy-kyma-catalog:latest,KYMA_TOKEN=agentcy-kyma-token:latest
The kyma service account needs roles/storage.objectAdmin on the bucket. Then, on the API service, set CONTEXT_ENGINE=advanced, KYMA_BASE_URL=http://kyma.<region>.run.app, KYMA_TOKEN=<token>, and KYMA_DATABASE=kyma, and drop the NEO4J_* vars.
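The two follow-up steps above (the bucket role and the API settings) could be scripted roughly as follows. This is a sketch under assumptions, not the project's official procedure: the `run` wrapper and the `switch_to_advanced_engine` function are hypothetical helpers, the script only prints the commands unless `DRY_RUN=0` is set, and `KYMA_BASE_URL` defaults to the placeholder from the text, so replace it with the real kyma service URL before running anything.

```bash
# Sketch only. Dry-run by default: each command is printed, not executed.
# Set DRY_RUN=0 to run for real (needs an authenticated gcloud).
# The printed form drops shell quoting, so it is for review, not for pasting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

PROJECT="${PROJECT:-PROJECT}"      # placeholder project ID, as in the commands above
REGION="${REGION:-us-central1}"
KYMA_BASE_URL="${KYMA_BASE_URL:-http://kyma.<region>.run.app}"  # placeholder from the text

switch_to_advanced_engine() {
  # Let the kyma service account read and write objects in its bucket.
  run gcloud storage buckets add-iam-policy-binding gs://agentcy-kyma-prod \
    "--member=serviceAccount:agentcy-kyma@${PROJECT}.iam.gserviceaccount.com" \
    --role=roles/storage.objectAdmin

  # Point the API at kyma and drop the Neo4j secrets it no longer needs.
  run gcloud run services update agentcy-api \
    "--region=${REGION}" \
    "--update-env-vars=CONTEXT_ENGINE=advanced,KYMA_BASE_URL=${KYMA_BASE_URL},KYMA_DATABASE=kyma" \
    --update-secrets=KYMA_TOKEN=agentcy-kyma-token:latest \
    --remove-secrets=NEO4J_URI,NEO4J_USER,NEO4J_PASSWORD
}

switch_to_advanced_engine
```

KYMA_TOKEN is wired from the same Secret Manager entry the kyma service uses, so the token never appears as a plain environment variable.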
Deploy Cloud Run Services
bash
# Backend
gcloud run deploy agentcy-api \
--image=ghcr.io/agentcy/backend:latest \
--region=us-central1 \
--no-allow-unauthenticated \
--vpc-connector=agentcy-vpc \
--vpc-egress=private-ranges-only \
--service-account=agentcy-api@PROJECT.iam.gserviceaccount.com \
--cpu=1 --memory=1Gi --concurrency=80 --min-instances=1 --max-instances=10 \
--set-env-vars=BIND_ADDR=0.0.0.0:8080,LLM_PROVIDER=anthropic,AUTH_PROVIDER=local \
--update-secrets=DATABASE_URL=agentcy-db-url:latest,REDIS_URL=agentcy-redis-url:latest,NEO4J_URI=agentcy-neo4j-uri:latest,NEO4J_USER=agentcy-neo4j-user:latest,NEO4J_PASSWORD=agentcy-neo4j-password:latest,LLM_API_KEY=agentcy-llm:latest,JWT_SECRET=agentcy-jwt:latest
# Frontend
gcloud run deploy agentcy-frontend \
--image=ghcr.io/agentcy/frontend:latest \
--region=us-central1 \
--allow-unauthenticated \
--cpu=1 --memory=512Mi \
--set-env-vars=NEXT_PUBLIC_API_URL=https://agentcy.example.com/api/v1
HTTPS Load Balancer
A single global HTTPS LB fronts both services with a path matcher:
| URL pattern | Backend |
|---|---|
| /api/* | Serverless NEG → agentcy-api |
| /* | Serverless NEG → agentcy-frontend |
This keeps both services on the same origin (no CORS), and gets you Cloud CDN, Cloud Armor, and IAP if you want zero-trust auth in front.
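A rough sketch of the routing half of that load balancer is below. The resource names (`agentcy-api-neg`, `agentcy-api-bs`, `agentcy-lb`, `agentcy-paths`, and their frontend counterparts) are illustrative assumptions, as is the hypothetical `run` dry-run wrapper; the host name is taken from the frontend's NEXT_PUBLIC_API_URL. The script only prints the commands unless `DRY_RUN=0` is set.

```bash
# Sketch only. Dry-run by default: each command is printed, not executed.
# Set DRY_RUN=0 to run for real (needs an authenticated gcloud).
# The printed form drops shell quoting, so it is for review, not for pasting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

REGION="${REGION:-us-central1}"

wire_load_balancer() {
  # One serverless NEG and one global backend service per Cloud Run service.
  for svc in agentcy-api agentcy-frontend; do
    run gcloud compute network-endpoint-groups create "${svc}-neg" \
      "--region=${REGION}" \
      --network-endpoint-type=serverless \
      "--cloud-run-service=${svc}"
    run gcloud compute backend-services create "${svc}-bs" \
      --global --load-balancing-scheme=EXTERNAL_MANAGED
    run gcloud compute backend-services add-backend "${svc}-bs" \
      --global \
      "--network-endpoint-group=${svc}-neg" \
      "--network-endpoint-group-region=${REGION}"
  done

  # The frontend is the default backend; /api/* is routed to the API.
  run gcloud compute url-maps create agentcy-lb \
    --default-service=agentcy-frontend-bs
  run gcloud compute url-maps add-path-matcher agentcy-lb \
    --path-matcher-name=agentcy-paths \
    --default-service=agentcy-frontend-bs \
    --new-hosts=agentcy.example.com \
    '--path-rules=/api/*=agentcy-api-bs'
}

wire_load_balancer
```

The sketch stops at the URL map. A certificate, a target HTTPS proxy, and a global forwarding rule are still needed before the load balancer serves traffic.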
When to leave Cloud Run
Cloud Run is great until you need:
- Sub-100 ms cold start guarantees
- More than 80 concurrent requests per instance with heavy WebSocket / streaming traffic
- Sidecars (e.g. dedicated metric exporters) — Cloud Run supports them but management is awkward
At that point, switch to GKE.
Architecture — GKE
GKE deployments follow the generic Kubernetes guide with these GCP-specific notes:
| Topic | Approach |
|---|---|
| Ingress | kubernetes.io/ingress.class: gce for Google's Ingress, or install ingress-nginx if you prefer it |
| TLS | Google-managed cert via ManagedCertificate CR |
| DNS | gcloud dns record-sets or external-dns |
| Secrets | External Secrets Operator → Secret Manager |
| IAM | Workload Identity instead of node-bound SAs |
| Autoscaling | HPA + Cluster Autoscaler (GKE Standard) or just Autopilot |
Workload Identity Binding
bash
gcloud iam service-accounts add-iam-policy-binding \
agentcy-api@PROJECT.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT.svc.id.goog[agentcy/agentcy-api]"
kubectl annotate serviceaccount agentcy-api \
--namespace agentcy \
iam.gke.io/gcp-service-account=agentcy-api@PROJECT.iam.gserviceaccount.com
The API pod can now reach Secret Manager and authenticate the Cloud SQL Auth Proxy without service account keys.
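The `--member` string in the binding is easy to get wrong, because it combines the project's workload pool with the Kubernetes namespace and ServiceAccount name. As a small illustration, the hypothetical helper `wi_member` below builds it from those three parts; `my-project` is an assumed project ID.

```bash
# Build the Workload Identity member string used in the IAM binding.
# $1 = GCP project ID, $2 = Kubernetes namespace, $3 = Kubernetes ServiceAccount
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

wi_member my-project agentcy agentcy-api
# prints: serviceAccount:my-project.svc.id.goog[agentcy/agentcy-api]
```

The namespace and ServiceAccount name must match the ones passed to kubectl annotate, otherwise the pod falls back to the node's identity.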
Cloud SQL Auth Proxy Sidecar
Either mount the proxy as a sidecar in the API deployment, or run it cluster-wide as a Deployment and point the API at its ClusterIP. The Helm chart supports both:
yaml
api:
  cloudSqlProxy:
    enabled: true
    instance: my-project:us-central1:agentcy-pg
Cost Estimate (Cloud Run path)
| Item | Approx. monthly |
|---|---|
| Cloud Run (idle minimums + traffic) | $20–60 |
| HTTPS Load Balancer | $20 |
| Cloud SQL db-custom-2-8192 | ~$95 |
| Memorystore Redis 1 GB | ~$50 |
| AuraDB Free | $0 |
| Secret Manager | ~$1 |
| Total | ~$190–230/mo |
GKE Standard with 3 small nodes adds ~$120/mo for the cluster + nodes vs. Cloud Run.
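To sanity-check the Cloud Run estimate, the line items in the table can be summed with Cloud Run at the low and high end of its range. This is plain arithmetic on the approximate figures above, not a pricing calculation.

```bash
# Fixed items: load balancer + Cloud SQL + Memorystore + AuraDB Free + Secret Manager
fixed=$((20 + 95 + 50 + 0 + 1))

# Cloud Run itself is listed as a $20 to $60 range.
low=$((fixed + 20))
high=$((fixed + 60))

printf 'Cloud Run path: $%d to $%d per month\n' "$low" "$high"
# prints: Cloud Run path: $186 to $226 per month
```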
Troubleshooting
Cloud Run service can't reach Cloud SQL
Make sure the service has a Serverless VPC Access connector (--vpc-connector) and the Cloud SQL instance has a private IP. Public-IP Cloud SQL also works, but it adds latency and forces traffic over public network egress.
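Because the API is deployed with --vpc-egress=private-ranges-only, a quick check is whether the host in DATABASE_URL is actually a private address. The sketch below uses two hypothetical helpers, `is_rfc1918` and `check_sql_host`, and the sample addresses are made up; it only tests the three RFC 1918 ranges, so treat a "public" result as a hint rather than a verdict. The instance's addresses are listed by `gcloud sql instances describe agentcy-pg`.

```bash
# Return success if $1 looks like an RFC 1918 private IPv4 address.
is_rfc1918() {
  case "$1" in
    10.*|192.168.*) return 0 ;;                          # 10.0.0.0/8, 192.168.0.0/16
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;   # 172.16.0.0/12
    *) return 1 ;;
  esac
}

# Report whether a Cloud SQL host would be reached through the VPC connector.
check_sql_host() {
  if is_rfc1918 "$1"; then
    echo "private"
  else
    echo "public"
  fi
}

check_sql_host 10.20.0.3     # prints: private
check_sql_host 34.72.10.5    # prints: public
```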
pgvector not available
Confirm cloudsql.enable_pgvector=on is set on the instance flags, and that you ran CREATE EXTENSION vector; against the agentcy database (not postgres).
GKE pods stuck Pending on PVC
For bundled databases on GKE, the chart asks for ReadWriteOnce PVCs. Ensure a default StorageClass exists (pd-balanced is fine for production).
Next Steps
- Kubernetes — generic
- Architecture & Tech Stack — managed-service equivalents