Backend Fundamentals · Quick Reference

DNS · HTTP · Docker
API Gateway · Circuit Breaker

The backend interview essentials — how DNS resolves & caches, every HTTP verb, Dockerfile anatomy, VM vs container deployment, gateway vs LB, and the circuit breaker pattern.


How DNS Works — From Typing a URL to TCP Handshake

DNS (Domain Name System) is the internet's phonebook: it translates human-friendly names (www.google.com) into routable IP addresses (142.250.192.46). It is a hierarchical, globally distributed key-value lookup.

Browser & OS Cache Check

Before hitting the network, the browser checks its own DNS cache, then the OS resolver cache (nscd / systemd-resolved), then the local /etc/hosts file. If any of these have a fresh answer, DNS resolution ends here.

Recursive Resolver (ISP or 8.8.8.8)

On a cache miss, the query goes to a recursive resolver — usually your ISP, or a public one like Google (8.8.8.8) or Cloudflare (1.1.1.1). This resolver does the heavy lifting on behalf of the client.

Root Nameserver (.)

If the resolver has no cached answer, it asks one of the 13 root nameserver addresses (each anycast across hundreds of physical servers worldwide). The root doesn't know the IP, but it knows who runs each TLD. It replies: "Ask the .com TLD server at this address."

TLD Nameserver (.com / .io / .in)

The resolver now asks the TLD server for .com. It replies with the authoritative nameservers for google.com (e.g., ns1.google.com, ns2.google.com — Google runs its own authoritative NS).

Authoritative Nameserver

Finally the resolver asks the authoritative server for the specific record. It returns the final answer — an A record (IPv4), AAAA (IPv6), CNAME (alias), or MX (mail).

www.google.com.     300    IN  A      142.250.192.46
mail.google.com.    3600   IN  CNAME  googlemail.l.google.com.

Response & Cache Population

The resolver caches the answer per the record's TTL and returns it to the OS → browser. The browser can now open a TCP connection to the IP and begin the TLS handshake + HTTP request.

Typical cold lookup: 20–120 ms. Warm (cached) lookup: sub-millisecond.
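The full delegation chain (root → TLD → authoritative) can be sketched as a toy lookup in Python. All zone data and server names below are invented for illustration; a real resolver speaks the DNS wire protocol over UDP/TCP:

```python
# Toy model of iterative DNS resolution: each layer only knows who to
# ask next, until the authoritative server returns the final record.
# Zone contents here are hypothetical.
ROOT = {"com.": "tld-server-for-com"}                      # root knows TLD servers
TLD = {"example.com.": "ns1.example.com."}                 # TLD knows authoritative NS
AUTH = {"www.example.com.": ("A", "93.184.216.34", 300)}   # final records

def resolve(name: str) -> tuple[str, str, int]:
    labels = name.rstrip(".").split(".")
    tld_zone = labels[-1] + "."                  # "com."
    zone = ".".join(labels[-2:]) + "."           # "example.com."
    tld_server = ROOT[tld_zone]      # 1. root refers us to the .com TLD server
    auth_server = TLD[zone]          # 2. TLD refers us to the authoritative NS
    return AUTH[name]                # 3. authoritative server answers

rtype, ip, ttl = resolve("www.example.com.")
print(rtype, ip, ttl)  # A 93.184.216.34 300
```

A recursive resolver does exactly this walk on a cache miss, then caches the answer for `ttl` seconds.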

Interview Tip

DNS uses UDP port 53 by default (fast, connectionless). It falls back to TCP for truncated responses (historically anything over 512 bytes without EDNS0), zone transfers, and large DNSSEC answers. DoH/DoT wrap it in HTTPS/TLS for privacy.

Kinds of HTTP Requests — The 9 Methods

HTTP defines 9 standard request methods (8 in the core HTTP/1.1 specification, plus PATCH from RFC 5789). Each has well-defined semantics: safe (no side effects), idempotent (same result on retry), and whether a body is allowed.

Method | Purpose | Safe | Idempotent | Body
GET | Retrieve a resource. Most common read operation. | Yes | Yes | No
POST | Create a new resource / submit data for processing. | No | No | Yes
PUT | Replace a resource entirely at a known URI. | No | Yes | Yes
PATCH | Partial update to an existing resource. | No | No* | Yes
DELETE | Remove a resource. | No | Yes | Optional
HEAD | Like GET but returns headers only — used for metadata / existence checks. | Yes | Yes | No
OPTIONS | Describe allowed methods / used in CORS preflight. | Yes | Yes | No
TRACE | Loopback — echoes the request for debugging. Usually disabled. | Yes | Yes | No
CONNECT | Establish a TCP tunnel — used for HTTPS through a proxy. | No | No | No

* PATCH Idempotency

PATCH can be idempotent if it sets absolute values ({"status":"PAID"}) but is not when it sends deltas ({"balance":"+100"}). API design choice, not a protocol rule.
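A sketch of that difference, using a hypothetical resource with status and balance fields:

```python
# Absolute-value PATCH is idempotent: retrying sets the same final state.
# Delta PATCH is not: retries compound. Resource and fields are hypothetical.
resource = {"status": "PENDING", "balance": 500}

def patch_absolute(res: dict, payload: dict) -> dict:
    res.update(payload)          # sets a final value; safe to retry
    return res

def patch_delta(res: dict, field: str, delta: int) -> dict:
    res[field] += delta          # applies a change; each retry adds again
    return res

patch_absolute(resource, {"status": "PAID"})
patch_absolute(resource, {"status": "PAID"})   # retried: same result
print(resource["status"])                      # PAID

patch_delta(resource, "balance", 100)
patch_delta(resource, "balance", 100)          # retried: balance drifts
print(resource["balance"])                     # 700, not 600
```

This is why retry middleware must treat PATCH as unsafe unless the API documents absolute-value semantics.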

How DNS Caching Works — 5 Layers Deep

Every DNS record ships with a TTL (Time To Live) in seconds. Every layer between the client and authoritative server caches the answer until TTL expires. This is why changing an A record takes time to propagate.

🌐 Layer 1 — Browser Cache

Chromium caches DNS for ~60 seconds in memory. Inspect at chrome://net-internals/#dns. Cleared on browser restart.

💻 Layer 2 — OS Stub Resolver

Linux: systemd-resolved or nscd. macOS: mDNSResponder. Windows: the DNS Client service. Honours the TTL returned by the resolver.

📡 Layer 3 — Recursive Resolver

Your ISP / 8.8.8.8 / 1.1.1.1 caches every answer it fetches. This layer absorbs the majority of global DNS load — most queries never hit an authoritative server.

🏢 Layer 4 — Authoritative Server

The source of truth. It does not "cache" but it serves the TTL. Setting a record to TTL 60 before a migration makes downstream caches expire faster.

🧩 Positive vs Negative Caching

Positive cache stores successful answers up to the record's TTL. Negative cache (RFC 2308) stores NXDOMAIN / NODATA responses using the SOA minimum TTL — prevents repeatedly hammering nameservers for domains that don't exist.

$ dig www.google.com +noall +answer
www.google.com.  287  IN  A  142.250.192.46
                 └─── remaining TTL at this resolver (sec)

Cache Invalidation Reality

You can't force-invalidate the entire internet's DNS cache. The lever you have: lower the TTL before a change (e.g., drop from 3600 → 60 a day ahead). After propagation, raise it back to reduce lookup load.
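The TTL mechanics above, including negative caching, can be sketched as a small in-memory cache. Names, addresses, and TTLs below are illustrative; real resolvers also cap TTLs and evict under memory pressure:

```python
import time

# Minimal DNS-style cache honouring per-record TTLs, with negative
# caching of NXDOMAIN answers in the spirit of RFC 2308.
class DnsCache:
    def __init__(self):
        self._store = {}  # name -> (value_or_None, expires_at)

    def put(self, name: str, value, ttl: float) -> None:
        # value=None models a cached NXDOMAIN (negative) answer
        self._store[name] = (value, time.monotonic() + ttl)

    def get(self, name: str):
        entry = self._store.get(name)
        if entry is None:
            return "MISS"                    # must ask upstream
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[name]            # TTL expired, re-resolve
            return "MISS"
        return value if value is not None else "NXDOMAIN"

cache = DnsCache()
cache.put("www.example.com", "93.184.216.34", ttl=300)
cache.put("no-such.example.com", None, ttl=60)     # negative cache
print(cache.get("www.example.com"))     # 93.184.216.34
print(cache.get("no-such.example.com")) # NXDOMAIN
```

Lowering `ttl` before a migration shortens the window in which every such cache, at every layer, keeps serving the old address.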

What is Docker & Dockerfile?

Docker packages an application with all its dependencies into a lightweight, portable, immutable image, and runs it as an isolated process called a container. A Dockerfile is the plain-text recipe that tells Docker how to build that image — each instruction produces a cached, read-only layer.

📦 Minimal Node.js Dockerfile

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]
  • Dockerfile = recipe → Image = compiled artifact → Container = running instance
  • Layers are cached — put stable instructions first, frequently-changing ones last
  • Ship only what you need: multi-stage builds + Alpine / distroless base images
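The multi-stage build mentioned above might look like this for the same hypothetical Node.js app. The stage name, `npm run build` script, and `dist/` output path are assumptions about the project layout:

```dockerfile
# Stage 1: build with dev dependencies (compilers, bundlers)
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: ship only the production artifact on a clean base
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/server.js"]
```

Only the final stage ends up in the shipped image; the build stage and its dev dependencies are discarded.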

Full Docker Deep Dive

Architecture (dockerd / containerd / runc), union filesystem, every Dockerfile instruction, CMD vs ENTRYPOINT, networking drivers, volumes, namespaces & cgroups, Docker Compose, security, and 12 interview Q&As — see the Docker interview deep dive →

VM Deployment vs Docker Deployment

Both isolate workloads, but at different levels of the stack. VMs virtualize hardware; containers virtualize the OS. That single difference changes startup time, density, and isolation guarantees.

Dimension | Virtual Machine | Docker Container
Isolation | Full hardware-level via hypervisor (KVM, Xen, Hyper-V) | Process-level via Linux namespaces + cgroups
OS | Each VM ships its own guest OS kernel | All containers share the host kernel
Boot time | 30 sec – minutes | < 1 second
Image size | GBs (full OS) | MBs (app + deps only)
Density per host | ~10s of VMs | ~100s – 1000s of containers
Portability | Hypervisor-specific (VMDK, VHD, OVA) | OCI images run on any Docker/containerd host
Security blast radius | Stronger — kernel exploit confined to one VM | Weaker — shared kernel is the trust boundary
Typical use | Legacy apps, strict multi-tenant isolation, Windows workloads on Linux hosts | Microservices, CI/CD, elastic scale-out, dev/prod parity

In Production — You Usually Run Both

Cloud VMs (EC2, GCE) host Kubernetes nodes, which run containers. VM gives the hard isolation boundary; containers give density and deployment velocity on top.

API Gateway vs Load Balancer

Both sit in front of backend services, but they operate at different layers and solve different problems. A load balancer is about traffic distribution; an API gateway is about API concerns.

Dimension | Load Balancer | API Gateway
OSI Layer | L4 (TCP/UDP) or L7 (HTTP) | Always L7 (HTTP / gRPC / WebSocket)
Primary job | Spread traffic across identical backend instances | Provide a single unified entry point for many microservices
Routing | By host / path / port (simple) | By path, headers, method, version, JWT claims, query params
Authentication | Not its job (TLS termination only) | Built-in — JWT, OAuth, API keys, mTLS
Rate limiting | Basic (connection limits) | Per-client, per-route, tiered plans
Transformation | None — passes bytes through | Request/response rewrite, protocol translation (REST↔gRPC), response aggregation
Health checks | Core feature — auto-remove unhealthy nodes | Yes (usually via the underlying LB)
Examples | AWS ALB/NLB, HAProxy, Nginx, F5 | AWS API Gateway, Kong, Apigee, Zuul, Tyk

🧭 Typical Flow in Production

client
  │
  ▼
DNS (Route53)
  │
  ▼
LOAD BALANCER   ← L4/L7, TLS termination, spreads traffic
  │
  ▼
API GATEWAY     ← auth, rate limit, route, transform
  │
  ├─▶ users-service
  ├─▶ orders-service
  └─▶ payments-service

Rule of Thumb

Use an LB to scale identical replicas of one service. Add an API gateway when clients need to reach many services behind a single domain, with shared auth/quota/routing logic.
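The gateway's routing job can be sketched as a path-prefix table. The prefixes and service names below are hypothetical; real gateways match on headers, methods, and JWT claims as well:

```python
# A gateway maps one public domain onto many internal services by path
# prefix; a load balancer would instead pick among identical replicas.
ROUTES = {
    "/users": "users-service",
    "/orders": "orders-service",
    "/payments": "payments-service",
}

def route(path: str) -> str:
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            return service
    return "404-no-route"

print(route("/orders/42"))   # orders-service
print(route("/metrics"))     # 404-no-route
```

Behind each matched service there is typically still a load balancer spreading traffic across that service's replicas.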

Circuit Breaker Pattern — Interview Questions

The circuit breaker protects a service from cascading failures when a downstream dependency is slow or broken. Like an electrical breaker: after too many "shocks" (errors), it trips open and short-circuits calls instead of letting them pile up.

The Three States

             failure threshold hit
CLOSED ───────────────────────▶ OPEN
  ▲                              │
  │ success in HALF-OPEN         │ cooldown timer
  │                              ▼
  └──────── HALF-OPEN ◀──────── (trial request)
  • CLOSED — all calls flow through; errors are counted.
  • OPEN — calls fail fast without hitting the dependency; return fallback / cached response.
  • HALF-OPEN — after cooldown, let a few probe requests through. If they succeed → CLOSED. If they fail → back to OPEN.
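The three-state machine above can be sketched as a minimal in-process breaker. The threshold, cooldown, and fallback here are illustrative; real libraries such as Resilience4j or opossum use sliding-window failure rates rather than a plain counter:

```python
import time

# Minimal circuit breaker sketch: CLOSED counts failures, OPEN fails
# fast until a cooldown elapses, HALF_OPEN lets one probe through.
class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, cooldown: float = 1.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, fn, fallback):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.cooldown:
                self.state = "HALF_OPEN"     # cooldown over: allow a probe
            else:
                return fallback()            # fail fast, don't touch the dep
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"          # trip (or re-trip after probe)
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        self.state = "CLOSED"                # success closes the breaker
        return result

def flaky():
    raise RuntimeError("dependency down")

cb = CircuitBreaker(failure_threshold=2, cooldown=30)
print(cb.call(flaky, fallback=lambda: "cached-default"))  # cached-default
print(cb.call(flaky, fallback=lambda: "cached-default"))  # cached-default
print(cb.state)                                           # OPEN
```

After the second failure the breaker is OPEN, so further calls return the fallback immediately without invoking `flaky` at all.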
Why use a circuit breaker at all — can't I just set a timeout?
Timeouts fail each request individually; under load the thread pool still fills up and callers still queue. A breaker fails fast so the upstream thread returns immediately, freeing resources and preventing the failure from propagating through the call graph.
What triggers the breaker to OPEN?
Typically a failure ratio over a sliding window — e.g., > 50% errors in the last 20 calls, or in the last 10 seconds. Resilience4j uses a count-based or time-based sliding window; Hystrix used a 10-second rolling stats bucket.
What counts as a "failure"?
Configurable — usually exceptions, timeouts, and explicit 5xx responses. 4xx from downstream is a client problem, not a dependency problem, and normally should not trip the breaker.
How do you pick the thresholds?
Start from the dependency's SLA. If it's supposed to be 99.9% available and < 200ms p95, set failure threshold at 50% over 20 calls with a 1s timeout. Tune from production telemetry — too tight and you trip on noise; too loose and the breaker never helps.
What should the fallback return?
Depends on the call. Options: (1) cached last-good response (reads), (2) a degraded placeholder (e.g., empty recommendations list), (3) queue the request for later (writes), or (4) a clean error to the client. Never return stale financial/medical data silently.
Circuit breaker vs retry vs bulkhead — how do they compose?
They layer. Retry handles transient blips (1–2 attempts with jitter). Bulkhead isolates thread pools / semaphores so one slow dep can't starve others. Circuit breaker wraps both and short-circuits when the dep is clearly down. Typical order in code: Bulkhead → CircuitBreaker → Retry → TimeLimiter → call.
How is it implemented in a distributed system?
Usually per-instance, in-process — each service instance tracks its own stats (Resilience4j, Polly, gobreaker). Sharing state across instances via Redis is rarely worth the added dependency and latency. Service meshes (Istio/Envoy) push this to the sidecar so app code stays clean.
What metrics do you emit?
State transitions (closed→open is a paging event), failure rate, slow-call rate, number of rejected calls, and calls in each state. Alert on repeated open→half-open→open cycles — that means the dep is flapping, not recovering.
Common pitfalls?
  • Tripping on 4xx — exclude them.
  • No jitter in retries before the breaker — creates thundering herds when OPEN→HALF-OPEN.
  • Sharing one breaker across multiple endpoints of the same host — masks isolated problems. Use per-operation breakers.
  • No fallback — a tripped breaker that throws still propagates errors; give callers a graceful path.
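The retry-with-jitter layer called out above (and in the composition answer) can be sketched like this; the base delay, cap, and attempt count are illustrative:

```python
import random

# Exponential backoff with "full jitter": each delay is drawn uniformly
# from [0, min(cap, base * 2**attempt)], so retries from many clients
# spread out instead of arriving in synchronized waves.
def backoff_with_jitter(attempt: int, base: float = 0.1, cap: float = 2.0) -> float:
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retry(fn, attempts: int = 3, sleep=lambda s: None):
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            sleep(backoff_with_jitter(attempt))  # jittered pause before retry
    raise last_exc  # all attempts failed: surface the last error
```

In the Bulkhead → CircuitBreaker → Retry composition, this sits inside the breaker, so retries stop entirely once the breaker is OPEN.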

Libraries to Know

Java: Resilience4j (current), Hystrix (in maintenance mode since 2018). .NET: Polly. Go: sony/gobreaker. Node.js: opossum. Service mesh: Istio outlier detection, Envoy circuit breaking.