A 45-minute timeboxed playbook for attacking any system design question — from "design Twitter" to "design Bitly" — without panicking, wandering, or over-engineering.
Picture this. Sarah sits down for her L5 system design interview. The interviewer takes a sip of coffee and says, "Design Twitter." Sarah nods, opens the whiteboard tool, draws a box labeled "Web Server", then a box labeled "Database", connects them with a line — and freezes. Twenty minutes later she has eight disconnected boxes, no requirements, no API, no numbers, and the interviewer is asking, "So… what about the feed?" Sarah panics, mumbles "Redis", and the loop is over before she's said anything substantive.
The problem isn't that Sarah doesn't know the components. She knows what Redis does, she knows what Kafka does, she's read about consistent hashing twice. The problem is she has no playbook for the 45 minutes — no order in which to think, no checklist for what must be on the board before time runs out. HLD interviews aren't a memory test; they're a structured-thinking test, and structure is what a framework gives you.
This page lays out the playbook used by senior engineers at FAANG, distilled from the methodology hellointerview.com teaches, and shows you how to apply it minute-by-minute. By the end you should be able to walk into any HLD loop, hear the prompt, and know exactly what you're going to say in the next 45 minutes.
Before learning how to deliver, understand what's actually being measured. Interviewers at most senior engineering ladders are scoring four dimensions on every HLD loop. Knowing them lets you spend your minutes on what matters.
Can you take an ambiguous prompt — "design Twitter" — and break it into a prioritized set of concrete problems? Do you know which 3 features are the spine of the system and which 50 are noise? This is where you earn points for asking "who's the user, what's the top action, what scale are we at?" instead of jumping straight to boxes.
What it looks like in practice: the candidate restates the problem in their own words, lists 5 candidate features, and explicitly says "for this interview I'll focus on these 3 — does that match what you want to see?"
Can you map core distributed-systems concepts (sharding, caching, replication, queues) onto the specific components your system needs? "We use Kafka because we have asynchronous fan-out from one writer to many consumers" is solution design. "We use Kafka because Kafka is good" is not.
What it looks like: every box on the diagram is justified by a requirement from Step 1. The candidate can answer "what would break without this?" for every component.
Do you know the current tooling and patterns? "I'd use a sharded Postgres with logical replication" is current. "I'd use Hadoop MapReduce for the feed" was current in 2010 — today it signals you haven't kept up. Specifically: knowing when DynamoDB beats Cassandra, when Kafka beats Kinesis, when CRDTs beat last-writer-wins, when gRPC beats REST.
What it looks like: the candidate names specific products, specific data structures inside those products, and explains the trade-off versus the obvious alternative.
Can you explain your reasoning out loud, respond to interviewer pushback without getting defensive, and revise the design when challenged with a new constraint? The interviewer is also scoring "would I want to be on a design review with this person?"
What it looks like: the candidate narrates as they draw, pauses to ask "is this the depth you want, or should I move on?", and when the interviewer says "what about hot keys?" they incorporate the concern instead of defending the original design.
Here's the entire playbook on one page. Five distinct phases, ordered by what a senior engineer would actually do at a real design review. Each phase has a hard timebox — the goal is not to finish early but to spend exactly the right amount of time on each step so you reach Deep Dives with the interviewer still engaged.
1. Requirements. Functional + non-functional. Top 3 features only. Quantify NFRs.
2. Core Entities. Bulleted list of data nouns. User, Tweet, Follow. Fast.
3. API Surface. 3-5 endpoints. REST defaults. Auth-derived user_id.
4. High-Level Design. Walk one API at a time. Add components as needed. Build sequentially.
5. Deep Dives. Address NFRs and bottlenecks. Hot keys, sharding, caching, queues. Let the interviewer probe.

Plus a closing buffer: Q&A. Interviewer's choice, reverse questions, recap.
The single most undervalued step. Candidates who skip it end up designing a system the interviewer didn't ask for, and the interviewer's gentle "interesting, but what about X?" lands like a wrecking ball 25 minutes in. Five minutes spent here saves twenty minutes later.
List the top 3 user-facing capabilities. Resist the urge to brainstorm a list of 20. The interviewer doesn't have time to design 20 features in 45 minutes — they want to see depth on the few that matter. Phrase each as a user action, not a system capability.
Three actions, each one drives a clear architectural slice. Post → write path. Follow → graph storage. Feed → fan-out problem.
List 20 features instead and you'll spend 30 minutes just enumerating endpoints with no time left for architecture. The interviewer will start cutting features by force.
NFRs are where most candidates fumble. The temptation is to list adjectives — "scalable, available, performant" — none of which mean anything specific. The fix: quantify everything, and tie each to a CAP-style trade-off.
The standard NFR checklist to walk through out loud: CAP trade-off, latency, throughput, availability, durability, consistency, security, compliance, cost. You don't need all of them on every system, but mentioning each shows you've considered it.
Back-of-envelope math is a tool, not a checkbox. Do it only when the number influences an architectural choice: does the data fit on one box, or do we need sharding? Is QPS within one DB's limits, or do we need read replicas + cache? Is bandwidth at CDN-scale, or can a single LB handle it?
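To make that concrete, here is a hedged back-of-envelope pass for a Twitter-scale system. Every input below is an assumed round number for illustration, not a real metric; the point is that each output drives a specific architectural decision.

```python
# Back-of-envelope: does the tweet store fit on one box, and do reads need a cache?
# All inputs are illustrative assumptions, not real Twitter numbers.

DAU = 200_000_000             # daily active users (assumed)
tweets_per_user_day = 0.5     # most users lurk (assumed)
feed_reads_per_user_day = 50  # reads dominate writes (assumed)
tweet_bytes = 300             # text + metadata per row (assumed)
seconds_per_day = 86_400

write_qps = DAU * tweets_per_user_day / seconds_per_day
read_qps = DAU * feed_reads_per_user_day / seconds_per_day
storage_per_year_tb = DAU * tweets_per_user_day * tweet_bytes * 365 / 1e12

print(f"write QPS ~ {write_qps:,.0f}")                   # ~1.2k: one well-tuned DB can absorb this
print(f"read QPS  ~ {read_qps:,.0f}")                    # ~116k: needs read replicas and/or a cache
print(f"storage   ~ {storage_per_year_tb:.1f} TB/year")  # ~11 TB: sharding territory
```

Each line of output answers one design question: the write rate says a single primary survives, the read rate justifies the cache, and the storage figure justifies sharding. If none of the numbers would change a decision, skip the math.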
Listing too many features forces the interviewer to cut for you. Pick the 3 hardest, most tightly coupled ones yourself; the interviewer can always ask for more.
Without quantified NFRs you can't justify any architecture choice — every "we need a cache" answer is hand-waved.
The anti-pattern: estimating bytes when you're not going to use the number. It burns 3 minutes that could've gone to deep dives.
The shortest step in the framework, and the easiest to get right. List the data nouns the system manipulates. That's it. No fields yet, no relationships, no schema — just names. This step exists to establish vocabulary with the interviewer so when you say "Tweet" later, you both mean the same thing.
Why bother? Three reasons:
- When you draw the database in Step 4, you'll point at "the Tweet table" — the interviewer already knows what that means.
- If you can't list the entities, you don't understand the problem. Two minutes here exposes confusion early.
- "Post" might mean a tweet, a blog post, or an HTTP verb. Naming the entity locks in the meaning.
Now you commit to the contract between the client and your system. APIs make the system concrete in a way that abstract architecture diagrams never can — they force you to answer "who calls what, with what payload, and what comes back?" Every box you draw in Step 4 will be in service of one of these endpoints.
Unless the system is real-time streaming (live location, chat, video), default to REST. Plural resources, standard HTTP verbs, JSON payloads. Three to five endpoints — one per top functional requirement plus a couple of supporting reads.
Twitter — top 3 endpoints driving the top 3 functional requirements:

```
// Post a tweet — write path
POST /v1/tweets
Headers: { Authorization: "Bearer <jwt>" }
Body:    { "text": "hello world", "media_ids": ["m_42"] }
→ 201 { "tweet_id": "t_8910", "created_at": "2026-05-07T14:02:06Z" }

// Follow another user
POST /v1/follows
Headers: { Authorization: "Bearer <jwt>" }
Body:    { "followee_id": "u_777" }
→ 204 No Content

// Get the home feed — read path, the heavy lift
GET /v1/feed?cursor=<opaque>&limit=20
Headers: { Authorization: "Bearer <jwt>" }
→ 200 { "tweets": [ { "tweet_id": "t_…", "author": "…", "text": "…", "created_at": "…" }, … ],
        "next_cursor": "<opaque>" }
```
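One detail from the feed endpoint worth narrating is the opaque cursor. A minimal sketch of how one might be built; the ts/id fields are an assumed pagination position, not a spec:

```python
import base64
import json

def encode_cursor(last_ts: float, last_tweet_id: str) -> str:
    """Pack the pagination position into an opaque token the client echoes back.
    Opaque means the server can change the encoding without breaking clients."""
    raw = json.dumps({"ts": last_ts, "id": last_tweet_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> dict:
    """Recover the position on the next GET /v1/feed request."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))
```

Cursor pagination beats `?page=3` here because tweets arrive constantly: an offset shifts under the reader, while a (timestamp, id) position stays stable.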
An endpoint written as POST /v1/tweets with { "user_id": "u_42", "text": "..." } in the body just designed a vulnerability where Mallory can post tweets as Sarah. The user_id is implicit — derived server-side from the JWT/session — and it's worth saying this out loud as you write the endpoints. Interviewers love this catch.

When REST isn't the right default:

- WebSockets: live chat, live location (Uber), real-time notifications. Polling at 1Hz over REST would crush the server.
- gRPC: internal service-to-service calls where the JSON overhead and HTTP/1.1 RTT matter. Less common for the public API surface.
- Queue, not endpoint: not really an "API" — but worth saying "the email-send is dropped on a Kafka topic, not exposed as an endpoint."
- Data flow, not endpoints: if you're designing an analytics pipeline, search indexer, or event-stream processor, the "API" is really a data flow. Draw producer → topic → consumer instead of REST endpoints. Same effect: it locks down the contract.
If the interviewer wants supporting endpoints like GET /users/:id/followers, they'll ask — don't volunteer until they do.

This is the visible centerpiece of the interview — the diagram the interviewer will photograph at the end. But the diagram itself isn't the deliverable: the narration as you build it is. Done well, this section walks through one API endpoint at a time, adding only the components that endpoint needs, until every box on the board has earned its place.
Don't pull a full distributed-systems architecture from memory and start drawing it. The interviewer can't follow your reasoning if you skip the building-up. Instead, take API #1 (the highest-volume or most-defining one), draw the simplest possible system that satisfies it, then move to API #2 and add only what's missing.
Notice how the first pass is almost embarrassingly simple — and that's the point. The interviewer sees you starting with the smallest thing that could possibly work, then layering complexity only where a specific requirement demands it. By the third pass you've explained why the cache exists ("feed reads dominate at 100:1") and why the fan-out worker exists ("we precompute feeds so reads don't pay the join cost").
When data flows through your system, say what gets written where. "On POST /tweets, we (1) insert into the Tweets table, (2) emit a tweet_created event to Kafka, (3) the fan-out worker picks it up and pushes the tweet_id into each follower's feed cache." That sentence is worth ten boxes — it shows the interviewer you understand the runtime, not just the static topology.
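That sentence can be sketched as straight-line code. The handler shape, the db/kafka interfaces, and the topic name here are invented for illustration — a sketch of the flow, not a real implementation:

```python
import json
import time

def handle_post_tweet(user_id: str, text: str, db, kafka, new_id) -> dict:
    """Write path for POST /tweets as narrated above: durable insert first,
    then an async fan-out event so the caller never waits on followers."""
    tweet = {
        "tweet_id": new_id(),
        "user_id": user_id,       # from the JWT, never from the request body
        "text": text,
        "created_at": time.time(),
    }
    # (1) durable write — the Tweets table is the source of truth
    db.insert("tweets", tweet)
    # (2) emit the event; the fan-out worker rebuilds follower feeds async
    kafka.produce("tweet_created", key=user_id, value=json.dumps(tweet))
    # (3) return 201 immediately — fan-out latency is off the request path
    return {"status": 201, "tweet_id": tweet["tweet_id"]}
```

The ordering is the point of the sketch: the durable write happens before the event is emitted, so a crash between (1) and (2) loses a fan-out (recoverable) rather than a tweet (not).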
You don't need to enumerate every column in every table. Mention fields when they drive a design choice: "the Tweet table is keyed by tweet_id and indexed on (user_id, created_at) so we can efficiently fetch a user's recent tweets". The fact that there's a language column doesn't help the interviewer — skip it.
"For POST /tweets, the request hits the load balancer, gets routed to a stateless write app server, which inserts into the Tweets DB sharded by user_id. Then it emits a fan-out event to Kafka so feeds get rebuilt async — that way the user gets their 201 in under 100ms even if their follower list is 10M people."
"OK so we have a load balancer, and behind that we have app servers, and the app servers talk to a database, and we'll need a cache, and there's also a queue, and a search service over here, and..." [draws 12 disconnected boxes with no explanation of why each exists]
This is where senior candidates separate from mid-level candidates. By minute 25 you have a working high-level design — congratulations, that's the table-stakes deliverable. The remaining 15 minutes are about iterating the design to satisfy the non-functional requirements: latency, scale, fault tolerance, hot keys, edge cases. The interviewer is also probing for the depth of your knowledge — they have specific things they want to test, so leave room for them to drive.
In Step 4 you added components to support new endpoints. In Step 5 you add components (or modify existing ones) to fix weaknesses in the existing design. Walk through your NFRs from Step 1 and ask: "does the current design hit this? if not, what's the bottleneck?"
The candidate has just drawn a fan-out-on-write architecture: when Sarah tweets, the worker pushes her tweet into all 200 of her followers' feeds. Then the interviewer asks: "What happens when Taylor Swift tweets?"
The interviewer has a list of probes they want to test on every candidate. Maybe it's "ask about hot keys", "ask about consistency in the cache", "ask about how the schema handles deletions". If you spend 15 minutes monologuing your way through three deep dives they didn't pick, you fail their unstated checklist. The move: do one deep dive of your own choosing, then explicitly hand the steering wheel: "I could go deeper on the cache, or talk about partitioning, or address how we handle deletes — what's most useful?"
Every interviewer has a mental list of behaviors that signal "this candidate isn't ready." None of them are about technical knowledge — they're all process and communication failures. Avoiding them is half the loop.
Candidate hears "design Twitter", immediately draws Web Server → DB → Cache. The interviewer hasn't even said what the user can do yet. Without requirements, every architectural choice is an assumption — and you'll be defending phantom decisions for the rest of the loop.
Fix: first words out of your mouth are "let me make sure I understand the problem — what are the top user actions you'd like me to focus on?"
"Scalable, available, performant." These are adjectives, not requirements. They don't constrain any decision because every system can claim them. The interviewer can't push back ("how scalable?") and so can't grade your reasoning.
Fix: attach a number or a CAP-side to each. "Available — 99.99% on reads, 99.9% on writes. Reads can serve stale data, writes must be durable."
"We'll just use Redis." Redis what? A string? A hash? A sorted set? With what eviction policy? Sharded how? "Use Redis" is a meaningless phrase — it's like saying "we'll use a computer." Every storage choice has a data structure and an access pattern; name them.
Fix: "Cache the feed in Redis as a sorted set keyed by user_id, score = tweet timestamp, capped at 800 entries per user, evicted via LRU at the cache-node level."
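A miniature model of that fix, with an in-memory structure standing in for the Redis sorted set. The 800-entry cap comes from the text above; everything else is illustrative:

```python
import bisect

FEED_CAP = 800  # per-user cap from the design above

class FeedCache:
    """In-memory stand-in for a Redis sorted set keyed by user_id,
    score = tweet timestamp. Oldest entries are evicted past FEED_CAP
    (the ZADD / ZREMRANGEBYRANK / ZREVRANGE pattern, modeled locally)."""

    def __init__(self):
        self.feeds = {}  # user_id -> sorted list of (ts, tweet_id)

    def push(self, user_id: str, tweet_id: str, ts: float) -> None:
        feed = self.feeds.setdefault(user_id, [])
        bisect.insort(feed, (ts, tweet_id))  # like ZADD: keep sorted by score
        if len(feed) > FEED_CAP:
            del feed[0]                      # like ZREMRANGEBYRANK 0 0: drop oldest

    def recent(self, user_id: str, limit: int = 20) -> list:
        feed = self.feeds.get(user_id, [])
        return [tid for _, tid in reversed(feed[-limit:])]  # like ZREVRANGE
```

The cap is the load-bearing detail: without it, a cache of unbounded per-user feeds grows linearly with tweet volume and defeats the memory budget the fix was sized for.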
"NoSQL because it scales" is a 2014 answer that interviewers now actively flag. Modern Postgres scales to terabytes; modern DynamoDB has transactions. The choice depends on access pattern (key-value vs relational queries), consistency needs, and operational maturity — not on a one-line slogan.
Fix: "Key-value access pattern at billion-row scale with no joins → DynamoDB. If we needed multi-row transactions and complex reporting, I'd revisit Postgres."
Candidate draws Kafka because "you always need a queue." But there's no async work in the design — every endpoint is request/response. The Kafka box now needs explanation, takes board space, and signals you cargo-cult components without justifying them.
Fix: for every box, be ready to answer "what would break without this?" If the answer is "nothing", erase it.
Interviewer says "what about hot keys?" and the candidate, mid-monologue, says "yeah, I'll get to that, but first let me explain the cache topology…" — and then never gets to it. Probes are gifts. They tell you exactly what the interviewer wants to hear.
Fix: when probed, stop, address the probe directly, then return to your thread. "Good question — let me handle hot keys now and come back to the topology."
Of all the things that separate a junior-leveled answer from a senior-leveled one, the single biggest is specificity. Junior candidates name a tool; senior candidates name the tool, the data structure inside it, the access pattern, the partitioning, the failure mode, and the fallback. Same five words, ten times the signal.
"We'll cache the tweets." This is a wish, not a design. Where? In what data structure? Keyed by what? Evicted how? Sharded across how many nodes? Read-through or write-through?
"I'll store tweet_id → Tweet as a Redis HASH on a 12-node cluster sharded by tweet_id via consistent hashing. 256GB per node, LRU eviction, replicated 2× for fault tolerance. Read-through from the app layer with a 60-second negative cache for missing keys to avoid stampede on deleted tweets."
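The read-through and negative-cache behavior from that answer, modeled with an in-memory dict. Sharding, replication, and LRU are operational details left out of this sketch:

```python
import time

NEG_TTL = 60.0  # negative-cache TTL from the design above

class ReadThroughCache:
    """Read-through tweet cache with a 60s negative cache for missing keys,
    so a stampede of reads for a deleted tweet hits the DB once per minute,
    not once per request."""
    _MISS = object()  # sentinel marking a cached "not found"

    def __init__(self, db_fetch):
        self.db_fetch = db_fetch  # fallback to the DB on a miss
        self.store = {}           # key -> (value, expires_at); 0 = no expiry

    def get(self, tweet_id: str):
        hit = self.store.get(tweet_id)
        if hit is not None:
            value, expires = hit
            if expires == 0 or time.time() < expires:
                return None if value is self._MISS else value
        value = self.db_fetch(tweet_id)  # cache miss: go to the DB
        if value is None:
            # negative cache: remember the absence for NEG_TTL seconds
            self.store[tweet_id] = (self._MISS, time.time() + NEG_TTL)
            return None
        self.store[tweet_id] = (value, 0)  # real Redis would LRU-evict this
        return value
```

The sentinel matters: caching a literal `None` would be indistinguishable from "not cached", so absence gets its own marker with its own TTL.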
"Notifications go through a queue." Which queue? What's the throughput? What partitioning gives you ordering guarantees? What happens when a consumer crashes mid-message?
"Kafka topic notifications.email, 32 partitions keyed by user_id so per-user notifications stay ordered. 3-replica fault tolerance with min.insync.replicas=2 for durability. 7-day retention so we can replay if a downstream consumer regresses. Consumer group with at-least-once semantics; the email-sender is idempotent on (user_id, notification_id)."
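The per-user ordering in that answer comes from keyed partitioning. A toy version of the mapping — Kafka's real default partitioner uses murmur2, and md5 here is just a stand-in to show the property that one key always lands on one partition:

```python
import hashlib

NUM_PARTITIONS = 32  # from the design above

def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable key -> partition mapping: all of one user's notifications land
    on a single partition, and Kafka guarantees order within a partition."""
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

This is also why 32 partitions caps useful consumer parallelism at 32: a partition is consumed by at most one member of a consumer group at a time.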
"We'll shard the database." By what key? Range or hash? How many shards? What happens when you add a shard? Cross-shard queries?
"Shard the Tweets table by user_id using consistent hashing on a 16-virtual-node ring across 8 physical shards, replicated 3× across AZs. user_id as the shard key co-locates all of one user's tweets, which makes the timeline-by-user query a single-shard read. Cross-shard 'global timeline' queries scatter-gather, which is acceptable because the home feed isn't built that way — it's precomputed from the fan-out path."
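A toy version of that ring, assuming md5 as the hash function. The exact hash matters less than the property being demonstrated: adding a shard remaps only a fraction of keys instead of nearly all of them:

```python
import bisect
import hashlib

def _h(key: str) -> int:
    """64-bit hash position on the ring (md5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class HashRing:
    """Toy consistent-hash ring: each physical shard owns `vnodes` points;
    a key maps to the first point clockwise from its hash. Adding a shard
    moves only the keys between its new points and their predecessors."""

    def __init__(self, shards, vnodes: int = 16):
        self.ring = sorted((_h(f"{s}#{i}"), s) for s in shards for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def shard_for(self, key: str) -> str:
        idx = bisect.bisect(self.points, _h(key)) % len(self.ring)
        return self.ring[idx][1]
```

Naive modulo sharding (`hash(key) % 8`) remaps roughly 8/9 of all keys when a ninth shard arrives; the ring remaps roughly 1/9 — which is the answer to the "what happens when you add a shard?" probe.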
Below is a minute-by-minute script of an idealized candidate applying the framework to "Design Bitly." Read it as the tempo and tone you should aim for — concise, structured, narrating the framework out loud as you go.
"POST /v1/urls with the long URL in the body, returns the short URL. GET /:hash returns a 302 redirect. user_id derived from auth token, never from the body."

Use this table as a study guide. Each existing HLD page on this site emphasizes different framework steps and showcases different patterns. When you're practicing, pick a page based on what step you want to drill.
| HLD Page | Framework Step Emphasized | Patterns Showcased | Best For Practicing |
|---|---|---|---|
| URL Shortener | Step 4 (HLD) + Step 5 (KGS deep dive) | Cache 80/20, KGS pre-generation, consistent hashing, CDN edge cache | Read-heavy systems & key-generation problems |
| Dropbox | Step 4 (control vs data plane split) | Block-level dedup, metadata vs content split, sync protocol | Storage systems & the "two-plane" mental model |
| Distributed UUID | Step 4 + Step 5 (collision deep dive) | Snowflake IDs, time-bit packing, clock skew handling | ID generation & coordination-free design |
| LeetCode | Step 4 (sandbox isolation) + Step 5 (queueing) | Container sandboxing, judge queue, real-time results via WebSocket | Compute-heavy systems & isolation patterns |
| Median of Billions | Step 1 (problem decomposition) | Approximate algorithms, t-digest, sketch data structures | Algorithm-flavored HLDs & capacity reasoning |
Read the URL Shortener HLD end-to-end. Then close it and re-derive the architecture from just the requirements. Time yourself — aim for 45 minutes.
Drill Dropbox (control/data plane split) and Distributed UUID (coordination-free design). Both teach mental models that transfer across many systems.
LeetCode and Median of Billions force you outside the standard CRUD pattern. Use them to practice Step 5 — handling unusual workloads under realistic constraints.