Confidential — North American B2B SaaS
K8s Migration & Cost Optimisation for a B2B SaaS
- Cloud-cost reduction: double-digit %
- SLA achieved: 99.99%
- Migration duration: 8 weeks
- Status: zero customer-visible incidents
Project details
The Challenge
The client is a North American B2B SaaS in the supply-chain visibility space, serving mid-market and enterprise customers. They had grown to meaningful scale on a stack the founding engineering team had built — ECS Fargate, a single RDS Postgres, and a handful of Lambda functions glued together with EventBridge. The cloud bill had grown to a number that was alarming the board and consuming margin they needed to reinvest into product.
Worse, the architecture was no longer keeping pace. Fargate cold starts on infrequently-triggered services were giving customers occasional multi-second response times. The single RDS instance was vertically scaled near its ceiling. And the team had committed to enterprise prospects that they could deliver a four-nines SLA — a promise the current setup could not credibly support.
Our Approach
The brief was to migrate to a Kubernetes-based architecture, materially reduce cloud cost, and reach a credible 99.99% SLA within an eight-week engagement, all without a customer-visible incident or a feature-development freeze longer than four weeks.
We migrated to EKS with Karpenter as the node autoscaler. Choosing Karpenter over Cluster Autoscaler was deliberate: the workload mix (long-running Go services, batch ML jobs, scheduled report generation) benefited disproportionately from Karpenter's just-in-time, instance-type-aware provisioning. The cluster runs a mixed node pool, with a portion on-demand for stateful and system workloads and the majority on spot for the stateless application tier, and with consolidation enabled at a tight threshold.
Cost optimisations applied in order:
- Vertical Pod Autoscaler (VPA) in recommendation mode for two weeks before any migration of compute, revealing that both CPU and memory requests across the fleet were significantly over-provisioned.
- Graviton (arm64) node pool for the services where the dependency tree compiled cleanly, with a notable per-node cost reduction on those workloads.
- KEDA scale-to-zero for the batch and scheduled services that previously ran 24/7 on Fargate.
- Migration from RDS to Aurora Postgres with a read replica in each AZ, with the heaviest analytical queries moved off the primary onto a ClickHouse instance fed by logical replication.
- S3 Intelligent-Tiering plus an aggressive lifecycle policy on the team's accumulated log archive, with terabytes moved into Glacier Deep Archive at a fraction of the previous cost.
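To make the right-sizing step in the list above concrete, the following is a minimal sketch of how VPA recommendations might be compared against existing requests to estimate releasable CPU. The `Workload` type, the function names, the 15% headroom, and all figures are hypothetical and chosen for illustration; they are not taken from the engagement.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Workload:
    """One deployment's CPU figures, in millicores (hypothetical data model)."""
    name: str
    requested_millicores: int    # what the pod spec currently requests
    recommended_millicores: int  # target suggested by VPA in recommendation mode
    replicas: int


def overprovision_ratio(w: Workload) -> float:
    """How many times larger the request is than the recommendation."""
    return w.requested_millicores / w.recommended_millicores


def releasable_millicores(workloads: Iterable[Workload], headroom_pct: int = 15) -> int:
    """CPU freed if each request dropped to recommendation plus headroom.

    Workloads already at or below the padded recommendation contribute nothing.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    total = 0
    for w in workloads:
        target = w.recommended_millicores * (100 + headroom_pct) // 100
        total += max(0, w.requested_millicores - target) * w.replicas
    return total


if __name__ == "__main__":
    fleet = [
        Workload("api", 1000, 400, 3),      # illustrative numbers only
        Workload("reports", 500, 600, 2),   # under-provisioned: no savings
    ]
    for w in fleet:
        print(w.name, overprovision_ratio(w))
    print("releasable millicores:", releasable_millicores(fleet))
```

Keeping some headroom above the recommendation is a common precaution against throttling during traffic spikes; the right margin depends on the workload.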
For the SLA work, we built a proper SRE foundation: error budgets per service, defined SLOs reviewed weekly, golden-signal dashboards in Grafana, multi-AZ deployment of every stateful component, automated chaos testing using AWS Fault Injection Simulator, and a runbook-driven incident-response process replacing the previous "page someone in Slack" approach.
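As an illustration of what a per-service error budget means in practice, here is a small sketch of the arithmetic behind an availability target. The `ErrorBudget` class and the 30-day window are assumptions made for the example, not details of the client's SLO definitions.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class ErrorBudget:
    """Downtime allowance implied by an availability SLO (hypothetical helper)."""
    slo: float        # availability target as a fraction, e.g. 0.9999
    window_days: int  # rolling window, assumed here to be 30 days

    @property
    def allowed_downtime_minutes(self) -> float:
        """Total minutes of unavailability the SLO tolerates per window."""
        return (1.0 - self.slo) * self.window_days * MINUTES_PER_DAY

    def remaining_minutes(self, downtime_minutes: float) -> float:
        """Budget left after the downtime observed so far in the window."""
        return self.allowed_downtime_minutes - downtime_minutes

    def burn_rate(self, downtime_minutes: float, elapsed_days: float) -> float:
        """Observed consumption relative to an even spend across the window.

        A value above 1.0 means the budget will run out before the window ends
        if the current pace continues.
        """
        expected = self.allowed_downtime_minutes * (elapsed_days / self.window_days)
        return downtime_minutes / expected


if __name__ == "__main__":
    budget = ErrorBudget(slo=0.9999, window_days=30)
    print(round(budget.allowed_downtime_minutes, 2))
    print(round(budget.burn_rate(downtime_minutes=1.08, elapsed_days=7.5), 2))
```

A 99.99% target over a 30-day window leaves roughly 4.32 minutes of downtime, which is why automated failover and rehearsed runbooks matter more at this level than manual response.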
The migration was zero-downtime, using a service-by-service traffic shift behind an AWS App Mesh proxy. We moved more than two dozen services over the engagement, with buffer time at the end for SLA hardening and game days.
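A service-by-service traffic shift of this kind can be sketched as a stepped weight schedule with a health gate. The step percentages, the error-rate threshold, and the names `STEPS`, `weights_for`, and `next_step` are hypothetical; the actual schedule and rollback criteria used in the engagement are not stated here.

```python
from typing import Dict, Sequence

# Assumed percentages of traffic sent to the new EKS deployment at each stage.
STEPS: Sequence[int] = (1, 5, 25, 50, 100)


def weights_for(new_pct: int) -> Dict[str, int]:
    """Split 100% of traffic between the legacy and EKS targets."""
    if not 0 <= new_pct <= 100:
        raise ValueError("new_pct must be between 0 and 100")
    return {"legacy": 100 - new_pct, "eks": new_pct}


def next_step(current_pct: int, observed_error_rate: float,
              max_error_rate: float, steps: Sequence[int] = STEPS) -> int:
    """Return the next EKS traffic percentage for one service.

    If the observed error rate exceeds the threshold, roll back to 0%.
    Otherwise advance to the next larger step, staying at 100% once reached.
    """
    if observed_error_rate > max_error_rate:
        return 0
    for step in steps:
        if step > current_pct:
            return step
    return 100


if __name__ == "__main__":
    pct = 0
    for error_rate in (0.0, 0.0002, 0.0001, 0.0, 0.0):
        pct = next_step(pct, error_rate, max_error_rate=0.001)
        print(weights_for(pct))
```

Rolling all the way back to 0% on a breach is the conservative choice; a gentler policy would step back one stage instead.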
The Outcome
Monthly AWS spend dropped by a double-digit percentage, comfortably exceeding the original target. Total cluster size measured in vCPU-hours is meaningfully smaller than the equivalent Fargate compute footprint, driven by VPA right-sizing and Karpenter's bin-packing consolidation. Cold-start latency is effectively zero now that services run on persistently-warm pods, and the heaviest customer-facing endpoints have moved from p95 latencies measured in seconds to ones measured in low hundreds of milliseconds.
The SLA target has been met for many consecutive months. The enterprise prospects that prompted the work signed, and the client has used the new architecture as a selling point in further enterprise pursuits since. The platform-engineering team owns and operates the cluster confidently, with our SRE pod on retainer for the genuinely hard incidents.