From "50,000 fans clicking the same seat at 09:00:00.001" to a sharded, ACID-backed, fairness-queued booking system — the architecture that earns every box
This deep-dive follows the standard HLD interview framework. As you read, map each section to Requirements → Entities → APIs → High-Level Design → Deep Dives, and notice which of the 8 common patterns and key technologies are at play.
It's 08:59 on a Friday morning. The new Marvel film opens at midnight. Sarah, sitting in Bangalore, has her finger hovering over the "buy" button on BookMyShow. So does Raj, two cubicles down. So do 50,000 other fans across the country, all targeting the same theater — PVR Forum, Screen 1 — which has exactly 200 seats. At 09:00:00 sharp the booking window opens. Within the next 0.4 seconds, every one of those 50,000 fans hits "select seat" — and many of them tap the same seat (the legendary J-12, dead center).
The system has to do four things, all at once, all correctly: browse (show movies, theaters, showtimes), select seats (let users pick from a live seat-map), pay (collect money via Stripe/Razorpay), and issue tickets (email a confirmation with a QR code). And it must do all of this with zero double-bookings — because if two people show up at PVR Forum holding tickets for J-12, the company gets sued and the brand dies. That's the system we're designing.
Before drawing a single box, pin down what the system must do. In an interview, asking these questions out loud signals you're building from first principles, not pattern-matching a memorized solution.
Four hard constraints shape this whole system. Skip any one of them and the design falls apart in production.
If Sarah wants 4 seats together for her family, the system must book all 4 or none. Booking 3 of 4 and then failing on the fourth leaves her with three useless tickets and a missing family member. This is a textbook ACID transaction — the kind a relational DB does effortlessly and a NoSQL store fights you on.
When the show is full and someone abandons their hold, the freed seat shouldn't go to whoever happens to refresh first — it should go to whoever has been waiting longest. Without explicit fairness, the experience devolves into a refresh-button arms race that rewards bots and punishes patient users.
Cap orders at 10 seats. Higher caps invite scalpers who buy 200 seats and resell them at 5× face value. The cap, combined with rate-limiting by user/payment-card, makes large-scale scalping operationally painful.
Average traffic might be 1K req/s; release-day traffic for a Marvel/Avengers premiere can be 50K req/s for the first 60 seconds. The system must scale horizontally on demand, and the booking path must not melt under burst load.
Numbers drive every architectural choice, so do the math out loud, even if it's rough. The system is read-heavy on browse (millions of people checking showtimes) but write-coordinated on book (a smaller number of high-stakes transactions).
Assume 3 billion page views per month across browse paths (city → movie → cinema → show → seat-map). Of those, roughly 10 million tickets sold per month — about a 300:1 browse-to-book ratio.
| Path | Rate | Derivation |
|---|---|---|
| Browse, average | ~1.2K req/s | 3B / (30 × 86400) |
| Booking, average | ~4 req/s | 10M / (30 × 86400) |
| Browse, peak | ~50K req/s | 40× spike on release |
| Booking, peak | ~200 req/s | seat-contention burst |
Per day across the catalog: 500 cities × 10 cinemas/city × 2,000 seats/cinema × 2 shows/day × 100 bytes/seat-row ≈ 2 GB/day. Over 5 years, including bookings, payments, and audit trails: ~3.6 TB total. Provisioning so the data sits at roughly 70% of capacity: ~5 TB.
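The arithmetic is simple enough to sanity-check in a few lines. Here is a quick sketch of the same estimates in Python (values rounded the way the table below rounds them):

```python
# Back-of-the-envelope numbers behind the estimates above.
SECONDS_PER_MONTH = 30 * 86_400

browse_avg  = 3_000_000_000 / SECONDS_PER_MONTH       # ~1.2K req/s
booking_avg = 10_000_000 / SECONDS_PER_MONTH          # ~4 req/s
browse_peak = 40 * browse_avg                         # ~50K req/s on release day

seat_rows_per_day = 500 * 10 * 2_000 * 2              # cities × cinemas × seats × shows
storage_per_day   = seat_rows_per_day * 100           # ~2 GB at 100 bytes/row
storage_5yr_tb    = storage_per_day * 365 * 5 / 1e12  # ~3.6 TB
provisioned_tb    = storage_5yr_tb / 0.7              # ~5 TB at ~70% utilization

print(f"{browse_avg:,.0f} rps avg browse, {booking_avg:.1f} rps avg booking")
print(f"{storage_5yr_tb:.1f} TB over 5 years -> provision ~{provisioned_tb:.0f} TB")
```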
| Metric | Value | Why it matters |
|---|---|---|
| Browse req/s (peak) | 50K/s | Drives cache size and read-replica fan-out |
| Booking req/s (peak) | 200/s | Drives DB write tier and isolation strategy |
| 5-yr storage | 3.6 TB | Fits a single MySQL cluster with read replicas |
| Tickets sold / month | 10M | Drives notification, payment, audit volume |
| Seats per booking | ≤ 10 | Anti-scalping cap; cap shapes lock granularity |
Two endpoints carry the interesting load: search (find what you want to watch) and reserve (claim seats while paying). A third endpoint completes the booking after payment success. Defining the contract early locks down the architecture before the first box is drawn.
REST API surface

```
// Search — read path, high QPS
GET /api/v1/search
{
  "api_key": "abc123...",
  "keyword": "Marvel",            // optional movie name fragment
  "city": "Bangalore",            // filter by city
  "lat_long": "12.97,77.59",      // optional, for nearby cinemas
  "radius_km": 10,                // search radius from lat_long
  "datetime": "2026-05-08T18:00",
  "postal_code": "560001",        // alternative to lat_long
  "sort": "showtime"              // showtime | rating | distance
}
→ 200 OK { "results": [{ movie, cinema, show_id, showtime, available_seats }, ...] }

// Reserve — write path, the contention hot spot
POST /api/v1/reserve
{
  "api_key": "abc123...",
  "session_id": "sess-9d4f...",   // sticky session for checkout
  "movie_id": "mov-1234",
  "show_id": "show-7891",
  "seats": ["J-12", "J-13"]       // 1..10 seats
}
→ 200 OK       { "reservation_id": "res-...", "expires_at": "...+5min", "amount": 480 }
→ 409 Conflict { "error": "seats_taken", "alternatives": ["J-14", "J-15"] }
→ 429 Queued   { "error": "show_full", "queue_position": 47, "etr_seconds": 180 }

// Confirm — called after payment succeeds
POST /api/v1/confirm
{
  "reservation_id": "res-...",
  "payment_token": "tok_..."
}
→ 201 Created { "booking_id": "bk-...", "tickets": [{ seat, qr_code }, ...] }
```
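To make the contract concrete, here is a minimal client-side sketch of the reserve call and its three documented outcomes. This is Python with the requests library; the host name and the way errors are surfaced are assumptions, not part of the spec above.

```python
import requests

BASE = "https://api.example-tickets.com/api/v1"   # hypothetical host

def reserve_seats(session_id: str, show_id: str, seats: list[str]) -> dict:
    """Call POST /reserve and normalize its three documented outcomes."""
    resp = requests.post(f"{BASE}/reserve", json={
        "api_key": "abc123...",
        "session_id": session_id,
        "movie_id": "mov-1234",
        "show_id": show_id,
        "seats": seats,            # 1..10 seats, per the anti-scalping cap
    }, timeout=5)

    if resp.status_code == 200:    # seats are HELD for ~5 minutes
        return {"state": "held", **resp.json()}
    if resp.status_code == 409:    # someone else won the race; offer alternatives
        return {"state": "conflict", "alternatives": resp.json().get("alternatives", [])}
    if resp.status_code == 429:    # show full; user is placed in the fairness queue
        return {"state": "queued", "position": resp.json().get("queue_position")}
    resp.raise_for_status()
```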
The data model is wide — about 10 entities — but every interesting interaction touches the same handful: Show, Show_Seat, Booking, Payment. Three observations: (1) we have strong relational ties (a Show belongs to a Cinema_Hall, a Show_Seat belongs to a Show and a Cinema_Seat), (2) we need multi-row atomic transactions for multi-seat orders, and (3) the UNIQUE constraint on a Show_Seat row is what physically prevents double-booking. All three points push us toward a relational store like MySQL/PostgreSQL.
The SHOW_SEAT row is the heart of the system. Its status column is a small state machine — FREE, HELD, or BOOKED — and a UNIQUE constraint on (show_id, cinema_seat_id) means the database itself rejects any attempt to create two rows for the same seat in the same show. Combined with row-level locking, this is what physically prevents double-booking even under massive contention.
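A minimal sketch of that table, using SQLite as a stand-in engine to show the constraint doing its job. Column names beyond status, show_id, cinema_seat_id, booking_id, and expires_at are illustrative; production would be MySQL/InnoDB.

```python
import sqlite3  # stand-in engine; the article's choice is MySQL/InnoDB

DDL = """
CREATE TABLE show_seat (
    show_seat_id    INTEGER PRIMARY KEY,
    show_id         TEXT NOT NULL,
    cinema_seat_id  TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'FREE',   -- FREE | HELD | BOOKED
    booking_id      TEXT,
    expires_at      TIMESTAMP,                      -- set while HELD
    UNIQUE (show_id, cinema_seat_id)                -- the physical double-booking guard
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# A second row for the same seat in the same show violates the UNIQUE constraint:
conn.execute("INSERT INTO show_seat (show_id, cinema_seat_id) VALUES ('show-7891', 'J-12')")
try:
    conn.execute("INSERT INTO show_seat (show_id, cinema_seat_id) VALUES ('show-7891', 'J-12')")
except sqlite3.IntegrityError as err:
    print("rejected:", err)   # UNIQUE constraint failed
```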
The core booking requirement, "hold every requested show_seat row or fail," is a textbook multi-row transaction with strong isolation. NoSQL stores either give you per-row atomicity (not enough) or eventual consistency (catastrophic for ticket booking). MySQL or Postgres with SERIALIZABLE isolation handles this in five lines of SQL. Our 3.6 TB also fits comfortably in a sharded MySQL cluster, so we're not paying for NoSQL's horizontal-scale superpower.

This is the section that wins or loses the interview. We'll build the architecture in three passes: the simplest thing that could plausibly work, why it falls apart at scale, and the production shape where every box justifies itself.
Sketch the simplest possible system: a few stateless app servers behind a load balancer talking to one MySQL. To book a seat, the app does INSERT INTO booking .... Done.
Three failures emerge the moment a real movie release hits this design:
Sarah and Raj both tap seat J-12 at 09:00:00.001. App server A handles Sarah, app server B handles Raj. Both run INSERT INTO booking at the same instant. Without row-level locking and a UNIQUE constraint, both inserts succeed — and now two physical humans hold a ticket for the same chair. The brand is dead.
A user clicks "select seats" then walks away to make tea. If the seat sits in HELD state for 30 minutes with no automatic expiry, that's 30 minutes of every other fan seeing it as taken. With 200 seats and 50K interested fans, the show looks "sold out" within seconds even though most "holds" never become bookings.
A single MySQL handles ~5K simple writes/sec. A release-day spike of 50K req/s, browse and booking traffic alike hammering the same box, pegs the DB CPU at 100%, p99 latency goes from 5ms to 5 seconds, and connection pools start dropping requests. The whole site stops responding, not just the popular movie.
Here's the central insight that reshapes the whole design: a seat reservation is two things at once — (a) a persistent claim in the database (so we never double-book and so it survives a crash), and (b) a 5-minute timer living in memory (so we know when to expire it and let someone else have a turn). One alone isn't enough.
Think of it like a restaurant that takes a name on the waitlist and sets a 30-minute timer on the host stand. The name on paper is durability — even if the host gets fired, the next host can pick up. The timer is liveness — without it, no-shows squat on tables forever. Booking systems need both. So we split the system into three planes, and hand the timer and the queue to two cooperating services:
- Browse plane: ~50K req/s peak. Read-only fan-out: list cities, movies, cinemas, shows, seat-maps. Tolerates stale data (a 2-second-old seat-map is fine). Cacheable to the moon. No locks, no transactions.
- Booking plane: ~200 req/s peak. The contended write path. Transactional, ACID, row-locked. Must be fast (sub-100ms) to absorb burst contention. Holds the 5-minute timers for in-progress reservations.
- Async plane: background work. The fairness queue, expiry timers, payment callbacks, notifications. Wakes the longest-waiting user when a seat frees up. Decoupled so a stuck SMS provider can't slow checkout.
That three-plane split lets us scale browse on cheap stateless replicas, keep the booking path narrow and ACID-strict, and bury the messy "who gets the freed seat?" question inside an ActiveReservationsService (timers) and a WaitingUsersService (the deli-counter ticket queue) running in the async plane.
Now the full picture. Every node is numbered — find its matching card below to see what it does and crucially what would break without it.
Use the numbers in the diagram above to find the matching card. Each one answers what is this, why is it here, and what would break without it.
The browser tab or mobile app a user is staring at. It walks the user through the funnel: pick a city, pick a movie, pick a showtime, watch the seat-map render, tap seats, hit "pay", enter card details, see the QR code. From the client's view the entire system is one HTTPS endpoint — but every interesting concern (latency, fairness, idempotency) is shaped by what the client does next.
Solves: nothing on its own — but every architectural decision flows backward from "what does the user see and how does it feel?" Sticky sessions, idempotency keys, and reservation timers are all about not breaking the client's mental model of "I clicked a seat, it's mine."
The front door. Sits in front of all stateless services, distributes incoming HTTPS, terminates TLS, and yanks unhealthy backends out via 5-second health checks. Crucially we use sticky sessions during checkout — once a user has a reservation in flight, the LB pins them to the same Booking Service node so the in-memory session state (selected seats, payment intent) doesn't have to be re-fetched on every click.
Solves: single-point-of-failure on app servers, plus the "lost cart" problem if requests bounce between nodes mid-checkout. Without the LB, a single pod crash takes down the site. Without sticky sessions, a user who refreshes mid-checkout might land on a fresh node that has no idea they're holding 4 seats.
Stateless service that answers all the browse queries: "what cities?", "what's playing in Bangalore tonight?", "what are the showtimes for Marvel at PVR Forum?". It hits the cache first, falls through to the catalog DB on miss. Read-only, so it scales horizontally — add pods until the LB stops complaining.
Solves: isolating the read-heavy browse workload from the contended booking path. Without a dedicated search service, the same pods serving 50K browse req/s would also be running ACID seat-locks, and the locks would starve under read pressure.
The denormalized read-store for the browse plane. Holds movies, cinemas, halls, showtimes, and a snapshot of seat-availability counts per show (not the live row-level state). Cassandra is right here because the data is read-heavy, eventually-consistent, and replicates naturally across regions — a fan in Mumbai shouldn't pay a 200ms round-trip to a primary in Delhi just to see what's playing tonight.
Solves: the 50K-req/s browse spike. Without a denormalized read store, every "list movies in city" would join across movie, show, cinema, cinema_hall on the booking DB — exactly the joins the booking DB is too busy doing transactions to serve.
Stateless service that handles POST /reserve and POST /confirm. The reserve flow per request: validate input → start a SQL SERIALIZABLE transaction → SELECT ... FOR UPDATE on the requested show_seat rows → if all are FREE, mark them HELD with a 5-minute expiry, register the timer with ActiveReservationsService, COMMIT → return reservation ID. Total budget: under 100ms.
Solves: the contended write path. Without a dedicated booking service, locking semantics would be sprinkled across whatever node happens to handle the request, and you'd never get the discipline needed to prevent double-bookings under 50K req/s.
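A sketch of that flow in application code, assuming a PEP 249 (DB-API) connection with autocommit off and MySQL-style %s placeholders. The timers client and exact column names are illustrative, not the article's literal implementation.

```python
import uuid
from datetime import datetime, timedelta

HOLD_SECONDS = 300  # the 5-minute reservation window

def reserve(conn, timers, show_id: str, seats: list[str]) -> dict:
    """Sketch of the /reserve hot path: lock rows, flip FREE -> HELD, register the expiry timer."""
    if not 1 <= len(seats) <= 10:                        # anti-scalping cap from the requirements
        return {"error": "1..10 seats per order"}

    reservation_id = f"res-{uuid.uuid4()}"
    expires_at = datetime.utcnow() + timedelta(seconds=HOLD_SECONDS)
    placeholders = ",".join(["%s"] * len(seats))
    cur = conn.cursor()
    try:
        cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
        cur.execute(                                     # exclusive row locks; a racing reserver blocks here
            f"SELECT cinema_seat_id, status FROM show_seat "
            f"WHERE show_id = %s AND cinema_seat_id IN ({placeholders}) FOR UPDATE",
            [show_id, *seats])
        rows = cur.fetchall()
        if len(rows) != len(seats) or any(status != "FREE" for _, status in rows):
            conn.rollback()
            return {"error": "seats_taken"}              # caller turns this into 409 + alternatives

        cur.execute(
            f"UPDATE show_seat SET status = 'HELD', booking_id = %s, expires_at = %s "
            f"WHERE show_id = %s AND cinema_seat_id IN ({placeholders}) AND status = 'FREE'",
            [reservation_id, expires_at, show_id, *seats])
        if cur.rowcount != len(seats):                   # belt-and-braces guard
            conn.rollback()
            return {"error": "seats_taken"}

        timers.register(reservation_id, expires_at.timestamp())   # 5-minute expiry timer
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return {"reservation_id": reservation_id, "expires_at": expires_at.isoformat()}
```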
The source of truth for everything that costs money — bookings, payments, the show_seat rows whose status is HELD or BOOKED. MySQL with InnoDB row-locks and SERIALIZABLE isolation. The UNIQUE constraint on (show_id, cinema_seat_id) in the show_seat table is what physically guarantees no double-booking — even if every other line of code had a bug, the DB would refuse the second insert.
Solves: correctness under contention. Without a relational ACID store, you'd be re-implementing SERIALIZABLE in application code on top of an eventually-consistent NoSQL — a battle no team has ever won at this scale.
An in-memory map of {show_id → list of (reservation_id, expires_at)} for every reservation that's currently HELD but not yet paid. When a reservation is created, the booking service registers its 5-minute timer here. When the timer rings, this service atomically transitions the seat back to FREE in MySQL and notifies WaitingUsersService "a seat just opened on show X". Backed by the DB — on crash, recover by reading all HELD rows.
Solves: liveness. Without a dedicated timer service, expiring abandoned reservations means polling the DB for "rows where expires_at < now" — a giant scan that gets slower as the table grows. The in-memory timer wheel is O(1) per expiry.
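The article describes a timer wheel; a simpler cousin that makes the idea concrete is a min-heap popped by a background loop. This is a sketch with illustrative class and callback names; a real timer wheel gets O(1) per expiry where the heap pays O(log n).

```python
import heapq
import threading
import time

class ActiveReservations:
    """In-memory expiry timers for HELD reservations, one instance per show_id shard.
    Derived state only: the source of truth stays in the show_seat table."""

    def __init__(self, on_expire):
        self._heap = []                    # (expires_at_epoch, reservation_id)
        self._cancelled = set()
        self._lock = threading.Lock()
        self._on_expire = on_expire        # callback: flip HELD -> FREE in MySQL, notify waiters

    def register(self, reservation_id, expires_at_epoch):
        with self._lock:
            heapq.heappush(self._heap, (expires_at_epoch, reservation_id))

    def cancel(self, reservation_id):      # payment succeeded, or the user released the seats
        with self._lock:
            self._cancelled.add(reservation_id)

    def run(self):                         # single background loop per shard
        while True:
            now = time.time()
            with self._lock:
                while self._heap and self._heap[0][0] <= now:
                    _, rid = heapq.heappop(self._heap)
                    if rid in self._cancelled:
                        self._cancelled.discard(rid)
                        continue
                    self._on_expire(rid)   # guarded UPDATE + "seat freed" event
            time.sleep(0.5)
```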
Per-show FIFO queue of users waiting for seats to free up after a sellout. When the show is full and a fan tries to reserve, they're appended to the queue with a notification handle. When ActiveReservationsService announces "seat freed on show X", this service pops the head of the queue, sends them a "your seats are available — complete checkout in 5 min" push, and starts another 5-minute timer. If they don't claim, they expire out and we try the next person. This is the deli-counter analogy made literal.
Solves: fairness during sellouts. Without an explicit queue, freed seats go to whoever happens to refresh their browser at the right millisecond — which favors bots and frustrated power-refreshers, not the patient fan who's been waiting 20 minutes.
Wraps Stripe / Razorpay / native UPI integrations. Booking Service calls it with the reservation ID, the amount, and a payment token from the client. Returns success/failure asynchronously via webhook. Holds an idempotency key so retries don't double-charge a card. On success it calls back into Booking Service to flip the reservation from HELD to BOOKED.
Solves: isolating the slow, flaky external dependency from the fast in-process reservation logic. Without a separate service, Stripe's 3-second p99 latency would balloon every reservation request to 3 seconds — and Stripe's occasional outages would directly take down booking.
Async fan-out to email, SMS, and push. Sends the "you reserved 4 seats — pay in the next 5 minutes" warning, the "your seats are available" wake-up to waiting users, and the final "here's your QR code" confirmation. Subscribes to events from Booking Service via Kafka — never inline on the request path, so a flaky SMS provider can't slow checkout.
Solves: communication without coupling. Without an async notification service, every booking would block on "send SMS" — and SMS gateways have multi-second tail latencies.
An in-memory key-value store holding (a) show metadata that almost never changes (movie title, cinema name, showtime), and (b) seat-map snapshots — a compact bitmap per show showing which seats are FREE/HELD/BOOKED, refreshed every 1-2 seconds. Read flow: search service hits Redis first, falls through to Cassandra on miss.
Solves: the 50K browse req/s spike. Without a cache, the seat-map render alone (every fan opens 5 seat-maps before settling on one) would push the catalog DB past its read ceiling. With it, 95%+ of seat-map renders never reach the DB.
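A sketch of the read-through seat-map snapshot using redis-py. One byte per seat keeps a 2,000-seat hall under 2 KB; the loader callable passed in for cache misses is illustrative, and a real deployment could pack the three states more tightly if it mattered.

```python
import redis

STATUS = {"FREE": 0, "HELD": 1, "BOOKED": 2}
r = redis.Redis()

def refresh_seat_map(show_id: str, rows) -> None:
    """Rebuild the cached snapshot for one show (run every 1-2 s from the bookings DB).
    `rows` is [(seat_index, status), ...] with dense 0-based seat indexes."""
    snapshot = bytearray(len(rows))
    for seat_index, status in rows:
        snapshot[seat_index] = STATUS[status]
    r.set(f"seatmap:{show_id}", bytes(snapshot), ex=10)   # short TTL so a dead refresher can't serve stale forever

def get_seat_map(show_id: str, load_rows_on_miss) -> list:
    cached = r.get(f"seatmap:{show_id}")
    if cached is not None:                 # 95%+ of seat-map renders stop here
        return list(cached)
    rows = load_rows_on_miss(show_id)      # fall through to the catalog store on a miss
    refresh_seat_map(show_id, rows)
    return [STATUS[status] for _, status in sorted(rows)]
```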
Two real flows, mapped to the numbered components above. The first shows concurrency resolution; the second shows expiration and fairness.
Flow 1: the race for J-12.

1. Sarah and Raj both POST /reserve with seat J-12. Both hit the Load Balancer ②, which pins each user to a Booking Service ⑤ pod (sticky session).
2. Each pod opens a transaction at SERIALIZABLE isolation: BEGIN; SELECT * FROM show_seat WHERE show_id=X AND seat='J-12' FOR UPDATE; Whichever transaction acquires the row lock first proceeds; the other blocks on the lock.
3. The winner sees the row as FREE, updates it to HELD, registers the 5-minute timer with ActiveReservationsService ⑦, and COMMITs. Lock released.
4. The loser's SELECT now returns the row as HELD, so it ROLLBACKs and returns 409 Conflict with alternatives J-11, J-13, J-14.

Flow 2: expiry and fairness. If the winner abandons checkout, the 5-minute timer fires and ActiveReservationsService runs UPDATE show_seat SET status='FREE', booking_id=NULL WHERE reservation_id=..., an atomic transition back to FREE, then tells WaitingUsersService to wake the longest-waiting fan.

This is the section interviewers probe hardest. Get the isolation level wrong and you ship double-bookings; get it too strict and the site goes unresponsive on release morning. The right answer for ticket booking is SERIALIZABLE isolation with row-level SELECT ... FOR UPDATE, and the trade-offs need to be on the tip of your tongue.
| Level | Dirty read | Non-repeatable read | Phantom read | Throughput |
|---|---|---|---|---|
| READ UNCOMMITTED | ❌ allowed | ❌ allowed | ❌ allowed | highest |
| READ COMMITTED | ✅ blocked | ❌ allowed | ❌ allowed | high |
| REPEATABLE READ | ✅ blocked | ✅ blocked | ❌ allowed | medium |
| SERIALIZABLE | ✅ blocked | ✅ blocked | ✅ blocked | lowest |
For booking, we need all three anomalies blocked. A "phantom read" — the seat appears FREE in our SELECT but a concurrent transaction inserted a competing booking before our INSERT lands — is the literal definition of a double-booking. So we go to the top: SERIALIZABLE.
```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;

SELECT seat_id, status
FROM show_seat
WHERE show_id = 'show-7891' AND seat IN ('J-12', 'J-13')
FOR UPDATE;                                   -- row-level X-lock

-- guard: bail if any seat is not FREE

UPDATE show_seat
SET status = 'HELD',
    booking_id = 'res-...',
    expires_at = NOW() + INTERVAL 300 SECOND  -- 5-minute hold
WHERE show_id = 'show-7891' AND seat IN ('J-12', 'J-13') AND status = 'FREE';

COMMIT;
```
Three things make this watertight: (a) SERIALIZABLE blocks phantom inserts on the same seat-row, (b) FOR UPDATE takes an exclusive row-lock so the second transaction physically waits, and (c) the AND status = 'FREE' guard in the UPDATE means even if you got the order wrong, the UPDATE updates 0 rows and we know to bail.
A single seat moves through a small, strict state machine. Every transition is owned by exactly one component, and every transition is atomic in the DB.
- FREE → HELD. Owned by Booking Service. Triggered by POST /reserve. Must be an atomic SELECT ... FOR UPDATE + UPDATE at SERIALIZABLE isolation. On success, also registers a 5-minute timer in ActiveReservationsService.
- HELD → BOOKED. Owned by Booking Service, triggered by a payment-success webhook from Payment Service. Single SQL UPDATE flipping status and clearing expires_at. The reservation timer in ActiveReservationsService is cancelled.
- HELD → FREE (expiry). Owned by ActiveReservationsService. Triggered by the in-memory 5-minute timer firing. The service runs UPDATE show_seat SET status='FREE', booking_id=NULL WHERE reservation_id=..., then publishes "seat freed" to WaitingUsersService (sketched below).
- HELD → FREE (user cancel). The user clicks "give up these seats" before the timer fires. Same UPDATE as the expiry path, just triggered by a user action. The timer is cancelled in ActiveReservationsService to prevent a duplicate UPDATE.
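Each of these transitions boils down to one guarded UPDATE plus a rowcount check. A sketch of the expiry path, DB-API style; using booking_id as the link back to the reservation, and the notify_waiters callable, are assumptions for illustration.

```python
def expire_reservation(conn, reservation_id: str, notify_waiters=None) -> bool:
    """HELD -> FREE on timer expiry (or user cancel). The status guard makes the race
    between the two paths harmless: whichever UPDATE runs second matches 0 rows."""
    cur = conn.cursor()
    cur.execute(
        "UPDATE show_seat SET status = 'FREE', booking_id = NULL, expires_at = NULL "
        "WHERE booking_id = %s AND status = 'HELD'",
        [reservation_id])
    conn.commit()
    freed = cur.rowcount > 0
    if freed and notify_waiters:
        notify_waiters(reservation_id)     # "seat freed" event toward WaitingUsersService
    return freed
```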
If the cancel and the expiry do race, both UPDATEs carry the WHERE status='HELD' guard, so the second one updates 0 rows and silently succeeds.

When a show goes "full", the next 100 fans who try to reserve don't see "sorry, sold out" — they see "you're number 47 in the queue, estimated wait 3 minutes". The queue exists because, on average, 5-10% of HELD reservations expire without payment — meaning even sold-out shows have a steady drip of seats coming back, and someone has to be next in line.
One queue per show, keyed by show_id. Each entry: {user_id, seats_wanted, enqueued_at, notification_handle}. Stored in memory only — losing the queue on a crash is acceptable because waiters re-poll the seat-map and re-queue if needed.
When a waiter is woken, they get a 5-minute window to actually click "reserve". If they don't, they expire out (e.g., they fell asleep) and we wake the next in queue. This prevents the queue from stalling on no-shows.
The queue lives on the same node as that show's ActiveReservationsService shard. Both are sharded by show_id so all events for one show — expiries, dequeues, notifications — happen on one machine. No cross-node coordination, no race conditions.
If a queued user wants 4 seats but only 2 freed up, we don't dequeue them — we keep them in queue and wake the next user wanting 2 or fewer. Optional: a separate "split" queue for users who'd accept fewer seats, but this is a UX choice with trade-offs.
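A sketch of the per-show queue with the partial-match rule from the last point. Names are illustrative, and the 5-minute claim window after a wake-up would be another ActiveReservations-style timer rather than anything shown here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Waiter:
    user_id: str
    seats_wanted: int
    enqueued_at: float
    notify: Callable[[str], None]        # push/SMS/email handle

class WaitingUsers:
    """Per-show FIFO of fans waiting for seats after a sellout; one instance per show_id shard."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, waiter: Waiter) -> int:
        self._queue.append(waiter)
        return len(self._queue)          # queue position shown to the user

    def on_seats_freed(self, seats_free: int) -> Optional[Waiter]:
        """Wake the longest-waiting fan whose request fits; larger requests keep their place."""
        for i, waiter in enumerate(self._queue):
            if waiter.seats_wanted <= seats_free:
                del self._queue[i]
                waiter.notify("Your seats are available; complete checkout in the next 5 minutes.")
                return waiter
        return None                      # nobody fits yet; the freed seats stay FREE
```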
Two stateful services run in memory — ActiveReservationsService and WaitingUsersService. What happens when one of those nodes crashes mid-show?
The in-memory timers are derived state — the source of truth lives in the show_seat table as rows where status='HELD' AND expires_at IS NOT NULL. On startup, the service reads all such rows and rebuilds the timer wheel. Held reservations whose timers fired during the crash are immediately expired on recovery.
Failure mode: a 30-second crash means held reservations expire 30 seconds late — annoying for waiters but never wrong. No double-bookings, no lost money.
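Continuing the sketches above, recovery is a single scan of this shard's HELD rows. UNIX_TIMESTAMP is MySQL-specific, and expire_reservation and the timers registry are the illustrative pieces from earlier.

```python
import time

def recover_timers(conn, timers, shard_show_ids) -> None:
    """Rebuild this shard's in-memory timers from the HELD rows in the bookings DB.
    Holds whose timers fired during the outage are expired immediately: late, but never wrong."""
    placeholders = ",".join(["%s"] * len(shard_show_ids))
    cur = conn.cursor()
    cur.execute(
        f"SELECT booking_id, UNIX_TIMESTAMP(expires_at) FROM show_seat "
        f"WHERE status = 'HELD' AND show_id IN ({placeholders})",
        list(shard_show_ids))
    now = time.time()
    for reservation_id, expires_at in cur.fetchall():
        if expires_at <= now:
            expire_reservation(conn, reservation_id)   # from the earlier sketch
        else:
            timers.register(reservation_id, expires_at)
```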
The waiting queue is in-memory only with no DB backing. A crash loses every waiter's queue position. We accept this because (a) the cost of persisting every queue mutation is high, (b) the consequence is "fans see a 'queue position lost, please re-queue' banner", and (c) it's reasonable to ask a fan who's been waiting 10 minutes to refresh.
Mitigation: run a primary/standby pair with hot-failover, so an actual crash is rare. Even if the standby's queue is empty, the failover takes seconds and re-queueing is fast.
The Bookings DB is a primary with two synchronous replicas in different availability zones. Writes commit only after at least one replica acknowledges. Automated failover (e.g., via Orchestrator or Aurora's built-in failover) promotes a replica in 10-30 seconds on primary failure.
RPO=0, RTO≈30s — no data loss, brief unavailability.
Every /confirm call carries an idempotency key. If the network drops mid-call and the client retries, the Payment Service recognizes the same key and returns the already-charged response — never double-charging the card. The Booking Service does the same when receiving the webhook callback.
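A sketch of the idempotency-key pattern on the /confirm path. It uses redis-py as the key store; charge_fn stands in for the PSP client, which receives the same key so it can deduplicate on its side. A production version would also guard the concurrent-retry race, for example with SET NX.

```python
import json

def confirm_with_idempotency(redis_client, charge_fn, reservation_id: str,
                             payment_token: str, amount: int) -> dict:
    """The same idempotency key always maps to the same outcome, so a client retry
    after a dropped response can never charge the card twice."""
    idem_key = f"confirm:{reservation_id}"     # one logical charge per reservation

    cached = redis_client.get(idem_key)
    if cached:                                 # retry of an already-processed call
        return json.loads(cached)

    result = charge_fn(payment_token, amount, idempotency_key=idem_key)  # PSP dedupes on the same key
    redis_client.set(idem_key, json.dumps(result), ex=24 * 3600)
    return result
```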
3.6TB fits on one big MySQL box, but a single box can't survive its own failure and can't absorb a release-day spike on its own. We shard the Bookings DB. Choosing the shard key is the most consequential decision in this section.
Candidate 1: shard by movie_id. Tempting because a query like "show me all bookings for Marvel" stays on one shard. But when Avengers releases, every booking in the country flows to one shard. That shard's CPU pegs at 100% while the others sit at 5%. Hot-shard hell.

Candidate 2: shard by show_id (with consistent hashing). A specific show is "Avengers · 21:00 · PVR Forum". There are millions of shows across the catalog and any given show has at most ~500 seats, so write volume per show is bounded. Hashing show_id distributes load uniformly even on release day.

show_id is the right key, for three reinforcing reasons. First, the hottest possible thing is a single sold-out show: say 500 seats × 50 attempts/seat = 25K writes total, roughly two minutes of work for a single shard at 200 req/s. Nothing melts.
ActiveReservationsService and WaitingUsersService are also sharded by show_id — same key, same node. All operations for one show land on one server set, so timer expiry and queue dequeue happen without cross-node coordination.
Use consistent hashing not hash % N. Adding a shard relocates only 1/N of shows instead of all of them — a few hours of rebalance instead of a multi-day migration with the cluster degraded throughout.
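A minimal sketch of that routing decision: an MD5 ring with virtual nodes, with shard names purely illustrative.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps show_id -> shard with ~1/N of shows moving when a shard is added, versus
    nearly everything moving under hash(show_id) % N. Virtual nodes smooth the spread."""

    def __init__(self, shards, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{shard}#{v}"), shard)
            for shard in shards for v in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, show_id: str) -> str:
        idx = bisect.bisect(self._keys, self._hash(show_id)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["bookings-db-1", "bookings-db-2", "bookings-db-3"])
print(ring.shard_for("show-7891"))   # the same show always lands on the same shard
```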
Two infrastructure pieces with disproportionate impact on whether release day feels fast or feels broken.
Movie title, cinema name, hall name, showtime, total seats, base price. Effectively immutable for the life of the show. TTL: hours. Hit rate: 99%+. Lives in Redis as JSON blobs keyed by show:<id>.
A compact bitmap per show showing FREE/HELD/BOOKED for each seat. Refreshed every 1-2 seconds from the bookings DB. Slightly stale is fine — when a fan clicks a stale-FREE seat, the SERIALIZABLE transaction catches the conflict and returns 409. The cache absorbs 50K req/s of seat-map renders.
The active state of a seat (am I currently HELD by Sarah?) lives only in the bookings DB and ActiveReservationsService. Caching it would invite stale-read double-bookings — not worth it.
The LB does normal round-robin for browse traffic. But the moment a user starts /reserve, we set a sticky cookie pinning that user to the same Booking Service pod for the rest of their checkout (up to 5 minutes). This isn't strictly required for correctness — the DB is the source of truth — but it lets the Booking Service keep small in-memory state (selected seats, payment intent, idempotency key) instead of refetching on every click.
Two fans click the same seat at the same millisecond. What stops a double-booking? Three independent safeguards: (1) the SERIALIZABLE isolation level, which blocks dirty, non-repeatable, and phantom reads; (2) SELECT ... FOR UPDATE on the seat row, an exclusive row lock, so the second transaction physically waits; and (3) the UNIQUE constraint on (show_id, cinema_seat_id) in show_seat, so even if every other safeguard had a bug, the DB would refuse the second insert. With all three, the second user gets a clean 409 Conflict and is offered alternative seats.

A user reserves seats and then their network drops. What happens? Every hold carries expires_at = NOW() + 300s. After 5 minutes with no /confirm, the timer fires, the seat atomically transitions HELD → FREE, and WaitingUsersService dequeues the next waiter. The user who lost their network sees an "expired" message on their next page-load. No double-charge (payment was never started), no stuck seat.

A show is sold out, 10 reservations are HELD, and fans are waiting. Who gets a freed seat? Waiters are queued FIFO by enqueued_at timestamp. When any of the 10 HELD reservations expires, ActiveReservationsService notifies WaitingUsersService, which pops the head of the queue (the longest-waiting user) and gives them a 5-minute window to claim. If they don't, we wake the next in line. First-come-first-served, enforced by a queue, not by who happens to refresh fastest.