Inara AI — collapsing 40 microservices into one right-sized modular monolith.
85–90% AWS cost reduction. Engineering headcount needed to run it cut by 75%. Deployment time: hours → minutes. HIPAA Phase 7 plan ready.
$3.5–5.5K/mo → $340–545/mo AWS spend
40 microservices → single modular FastAPI monolith
4–8 → 1–2 engineers needed to run the platform
The situation (why they called)
A 40-microservice platform — Java, Go, and Node — built for a "millions of users" design point but actually serving roughly 5,000 real users. The AWS bill hovered between $3,500 and $5,500 per month. Deployments took half a day. The test pyramid was impossible to keep green. Each of several managed services added its own integration tax: Auth0 for identity, Elasticsearch for search, a Redis cluster for caching, Kafka for events, Neo4j for relationship traversal, Ghost CMS for content, and both MySQL and PostgreSQL for persistence. The HIPAA story had real gaps the team knew about and dreaded auditing.
The diagnosis
Three observations changed the conversation:
- The 5K user base was 200× below the architecture's design point. Most of the cost was overhead per service, not load.
- 7 of the 40 services together carried more than 80% of the actual traffic; the rest mostly talked to each other.
- The HIPAA gap list was real: no field-level PHI encryption, no row-level security, audit logs scattered across services with no unified retrieval.
The decision
The "in-process equivalents" decision was the most contentious. Here's the swap:
| Was | Became | Why it works at this scale |
|---|---|---|
| Auth0 (identity) | JWT + bcrypt + pyotp | 5K users; we don't need an OIDC provider for that. |
| Elasticsearch cluster | PostgreSQL tsvector + pg_trgm | Both Postgres extensions are mature; "good enough" search at this corpus size. |
| Redis cluster | In-process TTLCache | Single-process app; no need for a distributed cache. |
| Kafka | asyncio.Queue + APScheduler | Event volume measured in hundreds/sec, not millions/sec. |
| Neo4j (graph) | PostgreSQL recursive CTEs | Care-network graph is <100K nodes; CTEs perform fine. |
| Ghost CMS | S3 + CloudFront | CMS pages are static; no editorial team needs Ghost's UI. |
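Two of those swaps, sketched. First, the Redis and Kafka replacements together: a process-local TTLCache plus an asyncio.Queue event bus, with APScheduler covering the cron-style jobs that used to live in Kafka consumers. A minimal sketch, not the client's code; the cache sizes, event shape, and job are illustrative.

```python
import asyncio
from cachetools import TTLCache
from apscheduler.schedulers.asyncio import AsyncIOScheduler

# Process-local cache: safe because the monolith runs as one process.
cache: TTLCache = TTLCache(maxsize=10_000, ttl=300)  # 5-min TTL, illustrative

# In-process event bus: hundreds of events/sec sits comfortably
# inside a single asyncio.Queue.
events: asyncio.Queue = asyncio.Queue()

async def consume_events() -> None:
    while True:
        event = await events.get()
        # ...dispatch to in-process handlers here...
        events.task_done()

scheduler = AsyncIOScheduler()

async def nightly_digest() -> None:
    await events.put({"type": "digest.requested"})

# Replaces a cron-style Kafka consumer; at startup the app also runs
# scheduler.start() and asyncio.create_task(consume_events()).
scheduler.add_job(nightly_digest, "cron", hour=2)
```

Second, the Elasticsearch replacement: ranked full-text search on a tsvector column, with a pg_trgm trigram fallback for misspellings. The `documents` table, its `tsv` column, and the asyncpg driver are assumptions for the sketch.

```python
import asyncpg

async def search(pool: asyncpg.Pool, q: str, limit: int = 20) -> list:
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT id, title,
                   ts_rank(tsv, websearch_to_tsquery('english', $1)) AS rank
              FROM documents
             WHERE tsv @@ websearch_to_tsquery('english', $1)
             ORDER BY rank DESC
             LIMIT $2
            """,
            q, limit,
        )
        if rows:
            return rows
        # Trigram fallback catches typos the full-text parser misses
        # (requires CREATE EXTENSION pg_trgm).
        return await conn.fetch(
            """
            SELECT id, title, similarity(title, $1) AS rank
              FROM documents
             WHERE title % $1
             ORDER BY rank DESC
             LIMIT $2
            """,
            q, limit,
        )
```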
The delivery
Six weeks of AI-accelerated, spec-based ("vibe-coding") development. The old stack ran in parallel for the final week.
- Week 1–2: discovery, modular monolith design, router map. Every old endpoint mapped to a new module + path (router sketch after this list).
- Week 3–4: domain rewrite. In-process replacements for managed services landed. Test coverage on the critical paths.
- Week 5: data migration scripts. RLS scaffolding for HIPAA Phase 7. Audit log unified into a single append-only table (DDL sketch after this list).
- Week 6: production cutover behind a feature flag. One week of parallel run. Old stack decommissioned.
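Two artifacts from those weeks, sketched under assumptions. The Week 1–2 router map reduces to FastAPI router includes, one module per retired service cluster; the module and route names here are hypothetical:

```python
from fastapi import APIRouter, FastAPI

# Each retired service cluster becomes one module exposing an APIRouter.
auth_router = APIRouter()

@auth_router.post("/login")
async def login() -> dict:
    return {"token": "..."}  # stub; the real handler does JWT + bcrypt + pyotp

app = FastAPI()
# The router map lives here: every old endpoint → new module + path.
app.include_router(auth_router, prefix="/api/v1/auth", tags=["auth"])
```

And the Week 5 scaffolding boils down to two pieces of DDL. Table, column, and role names are illustrative, not the client's schema; the app sets `app.tenant_id` per request (e.g. via `set_config(..., true)` inside the transaction) so the policy can see it:

```python
RLS_DDL = """
ALTER TABLE patient_records ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON patient_records
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
"""

AUDIT_DDL = """
CREATE TABLE IF NOT EXISTS audit_log (
    id       bigserial   PRIMARY KEY,
    actor_id uuid        NOT NULL,
    action   text        NOT NULL,
    entity   text        NOT NULL,
    detail   jsonb       NOT NULL DEFAULT '{}'::jsonb,
    at       timestamptz NOT NULL DEFAULT now()
);
-- Append-only: the app role may INSERT and SELECT, never rewrite history.
REVOKE UPDATE, DELETE ON audit_log FROM app_role;
"""
```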
The outcome
Engineering team needed to run it: 4–8 → 1–2.
Deployment time: half a day → ~15 minutes.
HIPAA Phase 7 plan ready: PostgreSQL RLS, KMS field-level encryption, SAML/SCIM federation, external pen-test scheduled.
Platform now sellable as B2B: 45 feature flags, 7-role / 58-permission RBAC, per-tenant theming, white-label, Arabic RTL i18n.
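The 58 permissions are enforced the boring way: one FastAPI dependency, checked per route. A hedged sketch; the `User` shape, the permission strings, and the stubbed token lookup are illustrative, not the shipped code:

```python
from dataclasses import dataclass
from fastapi import Depends, FastAPI, HTTPException

app = FastAPI()

@dataclass
class User:
    id: str
    permissions: frozenset[str]

async def get_current_user() -> User:
    # Stub: the real version decodes the JWT from the Authorization header.
    return User(id="demo", permissions=frozenset({"care_plans:read"}))

def require(permission: str):
    """Dependency factory: 403 unless the caller holds the permission."""
    async def checker(user: User = Depends(get_current_user)) -> User:
        if permission not in user.permissions:
            raise HTTPException(status_code=403, detail="forbidden")
        return user
    return checker

@app.get("/care-plans", dependencies=[Depends(require("care_plans:read"))])
async def list_care_plans() -> list[dict]:
    return []
```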
What I'd do differently
The rewrite proved fast because we accepted "good enough" on test parity and ran old and new in parallel for a week as the real validation. If the team had been less trusting, we'd have needed a 3-week parity-test phase first, adding half again to the six-week timeline. Worth knowing for the next engagement: parallel-run trust is itself a project asset, not just a technical safeguard.
Recognise any of this in your platform?