Work · Logistics · 16 weeks · 2023
Consolidating 28 microservices into 6 bounded contexts
A freight-matching platform's architecture had outgrown its domain model. Twenty-eight services with overlapping responsibilities produced call graphs with cycles, and routine feature work required coordinated deploys across five or six teams.
The situation
Leadership had correctly diagnosed the problem as a domain-modeling failure rather than a Kubernetes failure, which is why the team reached out to us rather than to a platform consultancy. The request was specific: identify the real bounded contexts, propose a service topology aligned to them, and implement the consolidation for the two most painful subdomains.
The pain was concentrated in shipment and pricing. A simple feature — "add a new surcharge tier for hazmat loads" — required changes in six repositories, coordinated across four teams, and a deploy sequence that had been wrong twice in the prior quarter, each time producing a brief pricing outage.
What we did
The first four weeks were domain work: interviews with fourteen engineers and four product managers, mapping of actual call patterns via OpenTelemetry trace data from a representative two-week window, and identification of aggregate roots through an event-storming session that filled a conference room wall. The output was a six-context model — quote, shipment, pricing, carrier, invoicing, and events — that we pressure-tested against a year of feature requests to check that the seams were durable. Two features didn't fit; we adjusted the model until they did.
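The call-pattern mapping step can be sketched roughly as follows: flatten exported spans to (span, parent, service) records, then join each span to its parent to count caller→callee edges between services. This is a minimal illustration, not the client's tooling, and the field names are simplified stand-ins rather than the OTLP schema.

```go
package main

import "fmt"

// Span is a simplified view of an exported OpenTelemetry span.
// Field names here are illustrative, not the OTLP wire format.
type Span struct {
	SpanID   string
	ParentID string // empty for root spans
	Service  string
}

// callEdges counts caller→callee edges between services by joining
// each span to its parent within the same batch of trace data.
func callEdges(spans []Span) map[[2]string]int {
	byID := make(map[string]Span, len(spans))
	for _, s := range spans {
		byID[s.SpanID] = s
	}
	edges := make(map[[2]string]int)
	for _, s := range spans {
		parent, ok := byID[s.ParentID]
		if !ok || parent.Service == s.Service {
			continue // root span, or an in-process call
		}
		edges[[2]string{parent.Service, s.Service}]++
	}
	return edges
}

func main() {
	spans := []Span{
		{SpanID: "a", Service: "quote"},
		{SpanID: "b", ParentID: "a", Service: "pricing"},
		{SpanID: "c", ParentID: "b", Service: "quote"}, // back-edge: the kind of cycle the mapping surfaced
	}
	edges := callEdges(spans)
	fmt.Println("quote→pricing edges:", edges[[2]string{"quote", "pricing"}])
	fmt.Println("pricing→quote edges:", edges[[2]string{"pricing", "quote"}])
}
```

Aggregating these edge counts over the two-week trace window is what turned "the call graph has cycles" from an anecdote into a measurable artifact the event-storming session could work against.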
Implementation focused on the shipment and pricing contexts, which together accounted for roughly 60% of cross-service calls. Each was consolidated into a single Go service with an internal event bus: Temporal workflows for long-running processes (a shipment can legitimately live for 72 hours), and in-process messaging for synchronous work. The other four contexts were left as boundary proposals for the client's teams to execute on their own timeline.
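The in-process messaging piece can be sketched as a minimal synchronous event bus: handlers run inline on publish, so ordering stays deterministic within a request, which is what makes it suitable for the synchronous work that stayed outside Temporal. The topic name and payload below are hypothetical; the real service's bus and types are not shown in this write-up.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is a minimal synchronous in-process event bus. Publish runs
// every subscribed handler inline, on the caller's goroutine.
type Bus struct {
	mu       sync.RWMutex
	handlers map[string][]func(payload any)
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(any))}
}

// Subscribe registers a handler for a topic.
func (b *Bus) Subscribe(topic string, h func(any)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers[topic] = append(b.handlers[topic], h)
}

// Publish invokes all handlers for the topic, in subscription order.
func (b *Bus) Publish(topic string, payload any) {
	b.mu.RLock()
	hs := b.handlers[topic]
	b.mu.RUnlock()
	for _, h := range hs {
		h(payload)
	}
}

func main() {
	bus := NewBus()
	bus.Subscribe("shipment.priced", func(p any) {
		fmt.Println("invoicing handler saw:", p)
	})
	bus.Publish("shipment.priced", "SHP-123") // hypothetical topic and ID
}
```

The design choice worth noting: what used to be a network hop between two microservices becomes a function call through the bus, so a failed handler is an error return in the same process rather than a partial failure across a deploy boundary.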
The consolidation followed the strangler-fig pattern rather than a big-bang cutover. For eleven weeks the new shipment service ran alongside the old microservices, subscribing to the same message topics and producing the same outputs, while a reconciliation job compared results on every completed shipment. Only after a two-week run with zero reconciliation discrepancies were the old services' subscriptions turned off.
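The shape of that reconciliation check can be sketched like this: compare old-path and new-path outputs per shipment ID and flag any disagreement or one-sided result. The compared fields and types are assumptions for illustration, not the client's actual schema.

```go
package main

import "fmt"

// ShipmentResult is an illustrative subset of the fields a
// reconciliation job might compare; not the real schema.
type ShipmentResult struct {
	ShipmentID string
	TotalCents int64
	Status     string
}

// reconcile returns the shipment IDs whose old-path and new-path
// outputs disagree, plus IDs that appear on only one side.
func reconcile(oldOut, newOut map[string]ShipmentResult) []string {
	var diffs []string
	for id, o := range oldOut {
		n, ok := newOut[id]
		if !ok || o != n {
			diffs = append(diffs, id)
		}
	}
	for id := range newOut {
		if _, ok := oldOut[id]; !ok {
			diffs = append(diffs, id)
		}
	}
	return diffs
}

func main() {
	oldOut := map[string]ShipmentResult{
		"SHP-1": {"SHP-1", 42500, "delivered"},
		"SHP-2": {"SHP-2", 18000, "delivered"},
	}
	newOut := map[string]ShipmentResult{
		"SHP-1": {"SHP-1", 42500, "delivered"},
		"SHP-2": {"SHP-2", 18050, "delivered"}, // a pricing drift the job would flag
	}
	fmt.Println("discrepancies:", reconcile(oldOut, newOut))
}
```

Running a check of this shape on every completed shipment is what made "two weeks of zero discrepancies" a meaningful cutover criterion rather than a hopeful one.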
Outcome
- Cross-service call volume on the critical path fell by 71%
- Median feature-ship time on pricing fell from 18 days to 4
- Kubernetes node count for the two consolidated contexts cut by a third
- Zero reconciliation discrepancies observed in the final two weeks of parallel run
- Client's platform team adopted the boundary model for the remaining four contexts and completed the carrier and invoicing consolidations on their own within the following nine months
Stack
Go 1.21 · gRPC · CockroachDB · Temporal · OpenTelemetry · NATS