
What an Architecture Audit Actually Looks Like (And When You Need One)

Companies usually ask for an architecture audit at one of three moments: before funding, before scaling, or after a near-miss in production. This is a walkthrough of what we actually do during a four-week audit engagement — the deliverables, the artefacts, and the conversations no slide deck captures.

Codecanis Admin

9 min read
[Image: week-three findings session with the leadership team of a B2B SaaS we audited.]

Companies usually call us for an architecture audit at one of three moments: a few weeks before a Series B due diligence, a quarter before a 10× traffic event they're not sure they'll survive, or the Monday after a production incident that almost ended someone's career. In every case, the founder or CTO has the same instinct — they want a third party to look at the system, tell them what's actually under the hood, and put the risks on the table in writing.

What they often don't know is what a real audit consists of. The market is full of people who will deliver a 40-slide PowerPoint full of generic best-practice bullet points and call it an audit. That is not what we do. This post is a walkthrough of what a four-week architecture audit engagement looks like at Codecanis — week by week, the artefacts we produce, and the conversations no slide deck captures.

When to Call for an Audit

The three triggers we see most often:

  • Pre-funding diligence: An investor or acquirer is going to ask hard technical questions. The CTO wants to be ahead of those questions, not surprised by them.
  • Pre-scale: A product is about to handle 5–20× more traffic, users, or transaction volume. The team suspects parts of the system won't hold, but they don't know which.
  • Post-incident: Something failed in a way that revealed structural fragility — a multi-hour outage, a data inconsistency, a security incident. Leadership wants a systemic view rather than another blameless post-mortem.

Less common but equally valid: a new CTO inherits an unfamiliar codebase, or two newly merged engineering teams need a baseline assessment of which platform to converge on.

Week 1: Kick-Off, Access, and Listening

The first week is mostly listening. We deliberately resist diving into code on day one — without context, a code review produces opinions, not insights.

Kick-off Workshop

Day one is a half-day workshop with the engineering leadership team. We map out:

  • The product surface area (what does the system actually do?).
  • The known pain points — the places leadership already suspects are weak.
  • The 12-month roadmap — because an architecture that's fit for today's product may not survive what's coming.
  • The constraints we should respect (regulatory, contractual, vendor lock-in).

Stakeholder Interviews

Over days two through five we run 45-minute structured interviews with 8–15 people: senior engineers, the on-call rotation, the head of product, the head of security, the head of customer support. Customer support is often the most informative — they hear about failures the engineering team has stopped seeing.

We ask the same five questions in every interview:

  • What wakes you up at night?
  • What would you fix first if you had two engineers for a quarter?
  • What part of the system do you avoid touching?
  • What surprised you most when you joined?
  • What do you wish leadership understood about the codebase?

Access and Artefacts

By the end of week one we have read access to: every Git repository, the infrastructure (read-only IAM in AWS/GCP/Azure), monitoring dashboards (Datadog, New Relic, Grafana), incident history (the last 12 months of post-mortems), the deployment pipeline (CI/CD configs), and the existing architecture documentation — however incomplete.

Week 2: Code, Infrastructure, and Benchmarks

Week two is where two of our engineers spend full days inside the system. We split the work: one engineer focuses on application code and the development workflow, the other on infrastructure, data, and runtime behaviour.

Code Deep Dive

We don't read every file. We sample purposefully:

  • The 10 highest-churn files in the last 12 months (from Git history). High churn correlates strongly with bug density; see the sketch after this list.
  • The boundary modules — wherever the system talks to a payment processor, an external API, or a third-party service.
  • The data access layer — ORM usage, raw SQL, N+1 patterns, missing indexes.
  • The authentication and authorisation paths — top-to-bottom.
  • The test suite, with particular attention to what is not tested.
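
For the churn ranking, Git itself is the only tooling needed. A minimal sketch, assuming a local clone and Python 3 (the 12-month window and top-10 cutoff mirror the sampling rule above):

```python
# churn.py: rank files by commit frequency over the last 12 months.
# A sketch, not our internal tooling; run it from the root of a clone.
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--since=12 months ago", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

# --name-only emits one file path per line, with blanks between commits.
churn = Counter(line for line in log.splitlines() if line.strip())

for path, commits in churn.most_common(10):
    print(f"{commits:4d}  {path}")
```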

We measure cyclomatic complexity, dependency depth, and test coverage as quantitative anchors — but the qualitative read of how long it would take a new engineer to understand a module matters more than any single number.
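
For the quantitative anchors, off-the-shelf tooling is enough. A sketch using the radon library, one option among several (the module path is hypothetical, and the threshold of 10 is a common convention rather than a hard rule):

```python
# complexity_scan.py: flag high-complexity functions in a Python module.
# Illustrative only; radon (pip install radon) is one tool of several.
from pathlib import Path
from radon.complexity import cc_visit, cc_rank

source = Path("app/payments.py").read_text()  # hypothetical module path

for block in sorted(cc_visit(source), key=lambda b: -b.complexity):
    if block.complexity >= 10:  # common "needs attention" threshold
        print(f"{cc_rank(block.complexity)}  {block.complexity:3d}  {block.name}")
```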

Infrastructure Walkthrough

Parallel to the code review, we walk through the live infrastructure with whoever owns it. We check:

  • Network topology and trust boundaries.
  • Database configurations (replicas, backups, point-in-time recovery, encryption at rest); see the sketch after this list.
  • Secret management (are credentials in environment files? Vault? AWS Secrets Manager?).
  • Observability — what's instrumented, what's not, and what would be needed to debug a production incident at 3am.
  • Disaster recovery posture: RPO/RTO commitments versus actual capability.
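
To make one of these checks concrete: on AWS, point-in-time recovery posture for RDS can be read straight from the API. A minimal sketch using boto3 and the read-only credentials from week one (a backup retention period of 0 means point-in-time recovery is off, the kind of gap that lands in the Critical row of the matrix in week three):

```python
# pitr_check.py: list RDS instances without point-in-time recovery.
# A sketch against the boto3 API, runnable with read-only credentials.
import boto3

rds = boto3.client("rds")

for db in rds.describe_db_instances()["DBInstances"]:
    retention_days = db["BackupRetentionPeriod"]  # 0 disables PITR entirely
    status = "NO PITR" if retention_days == 0 else f"ok, {retention_days}d retention"
    print(f"{db['DBInstanceIdentifier']}: {status}")
```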

Performance Benchmarks

We run load tests against a staging environment if one exists, or against carefully isolated production endpoints if it doesn't. The goal isn't to find the breaking point — it's to find the point at which p99 latency starts climbing in a way that suggests an underlying constraint (database CPU, connection pool saturation, queue depth).
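
The shape of that measurement is simple enough to sketch. A stripped-down illustration in standard-library Python (the endpoint and request counts are placeholders; real runs use dedicated load tooling with ramped concurrency):

```python
# p99_probe.py: sample request latencies and report p50/p99.
# An illustrative sketch; audits use proper load-testing tools.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "https://staging.example.com/api/health"  # placeholder endpoint
REQUESTS, CONCURRENCY = 500, 20

def timed_get(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = sorted(pool.map(timed_get, range(REQUESTS)))

cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
print(f"p50={cuts[49] * 1000:.0f}ms  p99={cuts[98] * 1000:.0f}ms")
```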

Week 3: Synthesis and the Risk Matrix

By start of week three we have hundreds of pages of notes. Week three is synthesis — turning observations into a structured assessment that leadership can act on.

The central artefact is the risk severity matrix. Each finding gets two scores: likelihood of triggering a real problem in the next 12 months, and impact if it does. We use this matrix:

Severity      | Definition                                                       | Example Finding                                                         | Recommended Response
Critical      | High likelihood, high impact                                     | Production database has no point-in-time recovery; nightly backup only | Fix within 2 weeks
High          | High likelihood, medium impact OR medium likelihood, high impact | Authentication service is a single instance with no failover           | Fix within 1 quarter
Medium        | Medium likelihood, medium impact                                 | Test coverage on payment module is 24%                                  | Plan for current half
Low           | Low likelihood OR low impact                                     | Some services use deprecated logging library                            | Track in backlog
Informational | Not a risk, but a recommendation                                 | Consider adopting OpenTelemetry for unified tracing                     | Discuss in roadmap planning
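
The combination step itself is mechanical; the judgment is in assigning the scores, not combining them. A sketch of the mapping, with the two scores as coarse low/medium/high labels (the encoding is ours, and informational items bypass scoring entirely):

```python
# severity.py: map (likelihood, impact) scores onto the matrix above.
RANK = {"low": 0, "medium": 1, "high": 2}

def classify(likelihood: str, impact: str) -> str:
    l, i = RANK[likelihood], RANK[impact]
    if l == 2 and i == 2:
        return "critical"  # fix within 2 weeks
    if min(l, i) == 0:
        return "low"       # track in backlog
    if l == 1 and i == 1:
        return "medium"    # plan for current half
    return "high"          # fix within 1 quarter

assert classify("high", "high") == "critical"
assert classify("medium", "high") == "high"
assert classify("low", "high") == "low"  # any "low" score caps severity
```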

Mid-week we sit down with the engineering leadership team. It is a working session, not a presentation: we walk through the draft matrix, challenge our own assumptions, and let the team push back. Findings often get reclassified, sometimes upward, sometimes downward, when we hear context we'd missed.

Week 4: Report, Walkthrough, and Roadmap

Week four is the delivery week. The audit produces three documents:

  1. Executive Summary (3–4 pages): Written for the CEO, board, or investor. Plain language, no jargon. The system's current state in one paragraph; the three things that matter most; the recommended response.
  2. Technical Report (40–80 pages): Written for the CTO and engineering leadership. Every finding, the evidence behind it, the risk classification, and a specific recommended fix with rough effort estimate.
  3. Prioritised Remediation Roadmap: A sequenced plan of what to fix in what order, with dependencies marked. Typically a six-month plan, with the first three months specified in detail and the next three in outline.

The final day is a two-hour leadership walkthrough. We present the findings, take questions, and — importantly — make ourselves available for the follow-up questions that always come 48 hours later when leadership has had time to digest.

What an Audit Costs (And What It's Worth)

A four-week audit by two senior engineers and a principal reviewer typically runs €38,000–€55,000. That's a real number for a small team, but it should be benchmarked against the alternatives: a failed funding round, the months of remediation that follow a major outage, or a rewrite that misses every deadline because nobody understood the existing system's actual constraints.

The most common ROI story we hear back from clients is that the audit pays for itself within a quarter: it prevented a single major incident, accelerated a funding round by shortening due diligence, or focused engineering effort on the changes that actually moved the needle rather than the ones that felt urgent.

Key Takeaways

  • An architecture audit is a structured four-week engagement, not a slide deck.
  • Week 1 is listening — stakeholder interviews are often more informative than the code itself.
  • Week 2 splits code and infrastructure review across two engineers working in parallel.
  • Week 3 synthesises findings into a risk severity matrix the leadership team can act on.
  • Week 4 delivers an executive summary, full technical report, and prioritised remediation roadmap.
  • The right time to commission one is before funding, before scaling, or after a near-miss — not during the crisis.

Want to work together?

If this article made you think about your architecture, your roadmap, or a problem you haven't solved yet — let's talk.