Docker · Interview Deep Dive

Docker — Everything You'll Be Asked

Architecture, layers & union filesystem, every Dockerfile instruction, CMD vs ENTRYPOINT, networking, volumes, isolation internals, Compose, and 12 interview Q&As.


Docker is an open-source platform that packages an application with all its dependencies into a lightweight, portable, immutable image, and runs it as an isolated process called a container. It solves the classic "works on my machine" problem by giving every environment — dev, CI, staging, prod — the exact same runtime.

Docker Architecture

Docker follows a client–server model. The CLI talks to the daemon over a Unix socket or REST API; the daemon pulls images from a registry and uses the Linux kernel to run containers.

```
┌─────────────┐   REST / socket   ┌────────────────────────┐
│ docker CLI  │ ────────────────▶ │ dockerd (daemon)       │
└─────────────┘                   │ ├── images             │
                                  │ ├── containers         │
                                  │ ├── networks           │
                                  │ └── volumes            │
                                  └──────────┬─────────────┘
                                             │ gRPC
                                             ▼
                                  ┌─────────────────────┐
                                  │ containerd          │ ◀── OCI runtime
                                  │ └── runc            │     (actually
                                  └─────────────────────┘      spawns the
                                             │                 container)
                                             ▼
                          Linux kernel: namespaces + cgroups + UFS

┌────────────┐
│  Registry  │  Docker Hub · ECR · GCR · GHCR · Artifactory · Harbor
└────────────┘
```
  • Docker Client — the docker CLI that talks to the daemon.
  • Docker Daemon (dockerd) — long-running process that builds images, manages containers, networks, volumes.
  • containerd — high-level container runtime that handles image pull/push and lifecycle.
  • runc — low-level OCI runtime that actually clone()s the container process.
  • Registry — stores & distributes images (public: Docker Hub; private: ECR, GCR, Harbor).

Dockerfile → Image → Container

Three words that sound similar. Interviewers love the distinction.

| Term | What it is | Analogy |
| --- | --- | --- |
| Dockerfile | Plain-text recipe with build instructions. Text only, no binaries. | Class source code (.java) |
| Image | Immutable, versioned, layered artifact produced by docker build. Stored in a registry. | Compiled class / jar |
| Container | A running instance of an image — a live process with its own filesystem, network, PIDs. | Object instance |

Image Layers & Union Filesystem

Each Dockerfile instruction (FROM, RUN, COPY, …) produces a read-only layer. Layers are stacked by a union filesystem (overlay2 is the default driver) and presented as a single merged directory to the container. When you start a container, Docker adds a thin read-write layer on top — all runtime changes go there (copy-on-write).

```
container RW layer            ← changes at runtime (writable)
─────────────────────────────
layer 4: COPY dist/ .         ← app code (read-only)
layer 3: RUN npm ci           ← node_modules (read-only)
layer 2: COPY package.json    ← manifest (read-only)
layer 1: FROM node:20-alpine  ← base image (read-only)
```
  • Reuse — identical layers are shared across images on disk (big space win).
  • Caching — docker build reuses a layer if its instruction + inputs are unchanged. Order matters: stable steps first, frequently-changing last.
  • Immutability — you can't edit a layer; a new image = new layers + new hashes.
  • Storage drivers — overlay2 (default, fast), btrfs, zfs, legacy aufs.
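The copy-on-write behaviour described above can be sketched as a toy model: read-only layers are searched top-down, and writes always land in the container's writable layer. This is purely illustrative — real overlayfs merges directories, not Python dicts.

```python
# Toy model of overlay2 lookup order and copy-on-write.
class UnionFS:
    def __init__(self, *ro_layers):
        self.ro_layers = list(ro_layers)  # bottom → top image layers
        self.rw = {}                      # container's writable layer
        self.whiteouts = set()            # deletions are masks, not erasures

    def read(self, path):
        if path in self.whiteouts:
            raise FileNotFoundError(path)
        if path in self.rw:               # RW layer wins
            return self.rw[path]
        for layer in reversed(self.ro_layers):  # then topmost RO layer
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.rw[path] = data              # never mutates a read-only layer
        self.whiteouts.discard(path)

    def delete(self, path):
        self.rw.pop(path, None)
        self.whiteouts.add(path)          # whiteout hides lower copies

base = {"/etc/os-release": "alpine"}
app  = {"/app/server.js": "v1"}
fs = UnionFS(base, app)
fs.write("/app/server.js", "v2")          # CoW: change lives in the RW layer
print(fs.read("/app/server.js"))          # → v2
print(app["/app/server.js"])              # → v1 (image layer untouched)
```

Deleting a file works the same way: the lower layers still contain it, but a whiteout entry in the RW layer masks it — which is why deleting a file in a later Dockerfile layer does not shrink the image.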

All the Dockerfile Instructions

You only need a handful daily, but interviews can ask about any of them.

| Instruction | Purpose |
| --- | --- |
| FROM | Base image; first instruction. Multi-stage uses multiple FROMs. |
| ARG | Build-time variable. Only available during docker build. |
| ENV | Runtime env var. Persists in the image and in running containers. |
| WORKDIR | Sets cwd for subsequent instructions and for the running container. |
| COPY | Copy files from build context into the image. Preferred for local files. |
| ADD | Like COPY, but also extracts tar archives and fetches URLs. Avoid unless needed. |
| RUN | Executes a command at build time in a new layer. Used for installing deps. |
| CMD | Default command/args run when a container starts. Overridable by docker run <cmd>. |
| ENTRYPOINT | The "main" executable. Harder to override. CMD supplies its args. |
| EXPOSE | Documents the listening port. Doesn't publish — use -p for that. |
| VOLUME | Declares a mount point for persistent / shared data. |
| USER | Switches UID/GID for subsequent RUN and the container process. Security win. |
| HEALTHCHECK | Periodic command that marks the container healthy/unhealthy. |
| ONBUILD | Triggers a command when this image is used as a base. Rare. |
| STOPSIGNAL | Signal sent on docker stop (default SIGTERM). |
| LABEL | Key/value metadata. Maintainer info, build SHA, etc. |
| SHELL | Overrides the default shell (/bin/sh -c) used by RUN/CMD/ENTRYPOINT. |

CMD vs ENTRYPOINT — The Classic Question

Both define what the container runs. The difference is how they combine and how easy they are to override.

```dockerfile
# exec form (preferred — no extra /bin/sh process, correct signal handling)
ENTRYPOINT ["java", "-jar", "app.jar"]
CMD ["--spring.profiles.active=prod"]
```

```shell
# at runtime:
docker run myapp                  # → java -jar app.jar --spring.profiles.active=prod
docker run myapp --debug          # → java -jar app.jar --debug
docker run --entrypoint sh myapp  # → sh (ENTRYPOINT replaced, image CMD cleared)
```
  • ENTRYPOINT sets the executable — think "this container IS a java process".
  • CMD provides default arguments — easy to swap per docker run.
  • Exec form ["a","b"] runs directly via execve() — receives signals. Shell form a b runs under /bin/sh -c, which eats signals (zombie problem).
  • Use ENTRYPOINT + CMD for apps; use only CMD for general-purpose images where users will override the command.
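The combination rules above can be sketched as a small function — a toy model of how the daemon assembles a container's argv from the image's exec-form ENTRYPOINT/CMD and the `docker run` command line:

```python
# Toy model: how ENTRYPOINT, CMD, trailing `docker run` args, and
# --entrypoint combine into the argv of PID 1 (exec form only).
def container_argv(entrypoint, cmd, run_args=None, entrypoint_override=None):
    if entrypoint_override is not None:
        ep = [entrypoint_override]
        cmd = []                          # --entrypoint also clears the image's CMD
    else:
        ep = list(entrypoint or [])
    # any args after the image name replace CMD entirely
    args = list(run_args) if run_args else list(cmd or [])
    return ep + args

ENTRYPOINT = ["java", "-jar", "app.jar"]
CMD = ["--spring.profiles.active=prod"]

print(container_argv(ENTRYPOINT, CMD))
# → ['java', '-jar', 'app.jar', '--spring.profiles.active=prod']
print(container_argv(ENTRYPOINT, CMD, run_args=["--debug"]))
# → ['java', '-jar', 'app.jar', '--debug']
print(container_argv(ENTRYPOINT, CMD, entrypoint_override="sh"))
# → ['sh']
```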

COPY vs ADD

| Feature | COPY | ADD |
| --- | --- | --- |
| Copy local files | Yes | Yes |
| Auto-extract local tar | No | Yes (.tar, .tar.gz, .tar.xz) |
| Fetch remote URL | No | Yes (doesn't auto-extract URLs) |
| Recommendation | Default choice — predictable, explicit | Only when you need the extra behavior |

Multi-Stage Builds — Shipping Only the Runtime

A multi-stage Dockerfile uses multiple FROM sections. The final stage uses COPY --from=<stage> to pull in only the build artifacts it needs — the build tools stay behind in the builder stage. In a Java build, for example, the JDK and Maven stay behind and the final image ships just the JRE + jar.

```dockerfile
# ── stage 1: build ──
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                 # full install — the build needs devDependencies
COPY . .
RUN npm run build
RUN npm prune --omit=dev   # strip devDependencies before the runtime copy

# ── stage 2: runtime ──
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]
```

Why it matters: a Java multi-stage build typically drops image size from ~800 MB (full JDK + Maven + cache) to ~180 MB (JRE + jar) — faster pulls, faster cold starts, smaller attack surface.

Docker Networking — The 5 Drivers

| Driver | Behaviour | When to use |
| --- | --- | --- |
| bridge (default) | Private virtual network on the host. Containers get an IP; ports publish via -p. DNS by container name on user-defined bridges. | Single-host dev, most Compose setups |
| host | Container shares the host's network namespace — no isolation, no -p, zero NAT overhead. | Perf-critical workloads, low-latency servers |
| none | No networking at all. Container has lo only. | Batch jobs that don't need the network |
| overlay | Multi-host network across a Swarm cluster. VXLAN tunnels under the hood. | Swarm services / multi-node apps |
| macvlan | Assigns a MAC + IP on the physical LAN. Container appears as a real host on the network. | Legacy apps that need a real IP on the LAN |
```shell
# publish container port 3000 on host port 8080
docker run -p 8080:3000 myapp

# containers on the same user-defined bridge can reach each other by name
docker network create mynet
docker run -d --name db --network mynet postgres
docker run --network mynet myapp   # can reach postgres at host "db"
```

Data Persistence — Volumes vs Bind Mounts vs tmpfs

Container filesystems are ephemeral — when the container is removed, its RW layer goes with it. For persistent or shared data, mount storage from outside.

| Type | Managed by | Use case |
| --- | --- | --- |
| Named Volume | Docker (in /var/lib/docker/volumes/) | DB data, anything that needs to survive restarts. Portable, backupable. |
| Bind Mount | You — any host path | Dev workflow: mount source code into the container for live reload. |
| tmpfs | Host RAM, never on disk | Secrets / scratch files you never want persisted. |
```shell
# named volume
docker run -v pgdata:/var/lib/postgresql/data postgres

# bind mount (dev)
docker run -v $(pwd):/app node:20-alpine

# tmpfs
docker run --tmpfs /run:rw,size=64m myapp
```

How Container Isolation Actually Works

A container is just a Linux process (or tree of processes) with aggressive kernel-level isolation. Two kernel features do the work.

🧱 Namespaces — what the process can see

  • pid — its own PID 1, can't see host processes
  • net — its own network stack, interfaces, ports
  • mnt — its own mount table & rootfs
  • uts — its own hostname
  • ipc — its own semaphores / shared memory
  • user — its own UID/GID mappings
  • cgroup — hides host cgroup hierarchy
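On Linux, the namespaces a process belongs to are visible as symlinks under /proc/<pid>/ns — each link names the namespace type and an inode identifier; two processes in the same namespace share the same identifier. A small Linux-only sketch:

```python
# Inspect the namespaces of a process via /proc (Linux-only).
import os

def namespaces(pid="self"):
    """Map namespace name → identifier like 'pid:[4026531836]'."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

for name, ident in namespaces().items():
    print(f"{name:10s} {ident}")
# inside a container, the pid/net/mnt/... identifiers differ from the host's;
# comparing `namespaces(1)` on the host with a containerized PID shows the split
```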

📏 cgroups v2 — how much the process can use

CPU shares, memory limits, block-I/O weights, PIDs count. docker run --memory=512m --cpus=1.5 writes directly to cgroup files under /sys/fs/cgroup/.
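As a sketch of that flag-to-file translation (the file names are real cgroup v2 interface files; the function itself is a toy, not Docker's code):

```python
# Toy translation of `docker run --memory / --cpus` into the cgroup v2
# values dockerd writes under /sys/fs/cgroup/<container>/.
def cgroup_v2_values(memory_bytes=None, cpus=None, period_us=100_000):
    files = {}
    if memory_bytes is not None:
        files["memory.max"] = str(memory_bytes)        # hard memory limit
    if cpus is not None:
        # --cpus=1.5 → quota of 1.5 periods' worth of CPU time per period
        files["cpu.max"] = f"{int(cpus * period_us)} {period_us}"
    return files

print(cgroup_v2_values(memory_bytes=512 * 1024**2, cpus=1.5))
# → {'memory.max': '536870912', 'cpu.max': '150000 100000'}
```

Exceeding memory.max gets the container OOM-killed; exceeding the cpu.max quota just throttles it until the next period.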

🛡️ Capabilities · seccomp · AppArmor/SELinux

  • Capabilities — Docker drops most Linux capabilities by default (e.g., CAP_SYS_ADMIN). Add/drop with --cap-add / --cap-drop.
  • seccomp — syscall filter applied by default to block ~44 dangerous syscalls.
  • AppArmor / SELinux — MAC policies further restrict what the container can ask the kernel to do.

Docker Compose

Compose defines a multi-container app in a single YAML — services, networks, volumes, env. One command (docker compose up) brings up the whole stack. Ideal for local dev and simple single-host deployments.

```yaml
services:
  api:
    build: .
    ports: ["3000:3000"]
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/app
    depends_on: [db, redis]
  db:
    image: postgres:16-alpine
    volumes: [pgdata:/var/lib/postgresql/data]
    environment:
      POSTGRES_PASSWORD: pass
  redis:
    image: redis:7-alpine

volumes:
  pgdata:
```

Commands You'll Be Asked About

```shell
# Images
docker build -t myapp:1.0 .              # build from Dockerfile in .
docker images                            # list local images
docker pull nginx:1.25                   # fetch from registry
docker push myrepo/myapp:1.0             # publish
docker tag myapp:1.0 myrepo/myapp:1.0    # retag before push
docker rmi myapp:1.0                     # remove image
docker history myapp:1.0                 # inspect layers

# Containers
docker run -d -p 80:80 --name web nginx  # detached, port mapped
docker ps                                # running
docker ps -a                             # including stopped
docker logs -f web                       # tail stdout/stderr
docker exec -it web sh                   # shell inside running container
docker stop web && docker rm web         # stop + remove
docker inspect web                       # low-level JSON metadata
docker stats                             # live CPU/mem/IO

# Cleanup
docker system prune -a --volumes         # nuke unused images/containers/volumes
```

Best Practices

Image Best Practices

  • Pin versions — node:20.11.1-alpine, never latest
  • Copy package.json / pom.xml before source to maximize cache hits
  • Use multi-stage builds — ship only runtime artifacts
  • Prefer Alpine or distroless for small surface area
  • Combine RUNs with && to keep layer count down
  • Use .dockerignore — strip node_modules, .git, secrets
  • Add HEALTHCHECK — lets orchestrators know when to route traffic

Security Best Practices

  • Run as non-root (USER node) — never PID 1 as root
  • Read-only root filesystem (--read-only)
  • Drop capabilities (--cap-drop=ALL, add only what's needed)
  • Scan images (Trivy, Snyk, Docker Scout)
  • Never bake secrets into layers — use build secrets / runtime env / Vault
  • Sign images (Docker Content Trust / Cosign)
  • Keep base images patched; rebuild on CVE alerts

Interview Q&A

Is Docker a VM? Why is it so much lighter?
No. A VM virtualizes hardware via a hypervisor and ships its own guest OS kernel (GBs, 30s+ boot). Docker virtualizes the OS — containers share the host kernel and are just processes isolated by namespaces/cgroups. Result: MB-sized images, sub-second boot, 100s–1000s of containers per host.
What's the difference between docker run, docker start, and docker exec?
run creates and starts a new container from an image. start restarts an existing stopped container. exec runs an additional command in an already-running container (e.g. docker exec -it web sh to get a debug shell).
Why does changing a line near the top of my Dockerfile invalidate so much cache?
Layers are cached sequentially — once a layer changes, every layer after it rebuilds. Put stable instructions (FROM, apt install, COPY package.json + npm install) first, and frequently-changing ones (COPY . .) last.
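The sequential invalidation can be sketched as a hash chain — a toy model, not BuildKit's actual cache-key algorithm: each layer's key folds in the previous key plus its own instruction, so changing an early line changes every downstream key.

```python
# Toy model of sequential build-cache keys: key(n) = H(key(n-1) + instruction(n)).
import hashlib

def cache_keys(instructions):
    keys, prev = [], ""
    for inst in instructions:
        prev = hashlib.sha256((prev + inst).encode()).hexdigest()[:12]
        keys.append(prev)
    return keys

a = cache_keys(["FROM node:20-alpine", "COPY package.json .", "RUN npm ci", "COPY . ."])
b = cache_keys(["FROM node:20-alpine", "COPY package.json .", "RUN npm ci", "COPY src/ ."])
# only the last instruction differs → three cache hits, one rebuild
print([x == y for x, y in zip(a, b)])   # → [True, True, True, False]

c = cache_keys(["FROM node:21-alpine", "COPY package.json .", "RUN npm ci", "COPY . ."])
# the first instruction differs → every downstream layer rebuilds
print([x == y for x, y in zip(a, c)])   # → [False, False, False, False]
```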
My container gets SIGKILL'd instead of gracefully shutting down. Why?
Two common causes. (1) CMD is in shell form — PID 1 is /bin/sh -c, which doesn't forward SIGTERM to your app. Use exec form: CMD ["node","server.js"]. (2) Your app ignores SIGTERM and exceeds the stop grace period (default 10s). Handle SIGTERM in code, or raise with --stop-timeout.
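A minimal graceful-shutdown sketch of cause (2) — installing a SIGTERM handler so PID 1 can finish in-flight work instead of being SIGKILLed after the grace period (here the process signals itself to simulate docker stop):

```python
# Install a SIGTERM handler, then simulate `docker stop` by signalling self.
import os
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    global shutting_down
    shutting_down = True   # real apps: stop accepting work, flush, close conns

signal.signal(signal.SIGTERM, handle_sigterm)

os.kill(os.getpid(), signal.SIGTERM)   # simulate `docker stop <container>`
print("graceful:", shutting_down)      # → graceful: True
```

Remember this only works if the signal reaches your process at all — exec-form CMD makes your app PID 1; shell form leaves /bin/sh as PID 1, and sh never forwards the SIGTERM.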
How do containers on the same host talk to each other?
On a user-defined bridge network, Docker's embedded DNS lets them resolve each other by container name — http://db:5432 works. On the default bridge there's no DNS, only IPs. Across hosts, use an overlay network (Swarm) or Kubernetes Services.
Where does my data go when I docker rm a container?
Anything written to the container's writable layer is gone. That's why DBs, uploads, logs belong on volumes or bind mounts — those outlive the container. Named volumes live in /var/lib/docker/volumes/ and are removed only with docker volume rm (or docker rm -v).
Why is my image 1.2 GB when the app is only 50 MB?
Usually one of: no multi-stage build (build tools + caches shipped to prod), heavy base image (ubuntu vs alpine vs distroless), un-cleaned package manager caches (apt-get install … without rm -rf /var/lib/apt/lists/*), or files that should have been in .dockerignore.
Docker vs Kubernetes — are they competitors?
No. Docker builds and runs containers on a single host. Kubernetes orchestrates containers across a cluster — scheduling, scaling, service discovery, rolling updates, self-healing. K8s used to run Docker as its runtime; today it uses containerd directly. Docker builds the image; K8s decides where and how many to run.
Can you run Docker on macOS or Windows if containers are Linux-only?
Yes — Docker Desktop runs a lightweight Linux VM (macOS: Virtualization.framework / Hyperkit; Windows: WSL2 or Hyper-V) and the Docker daemon lives inside it. Your CLI talks to that daemon. So technically every Mac-run container is actually running in a hidden Linux VM.
What is the .dockerignore file?
Like .gitignore, but for the build context sent to the daemon. Keeps node_modules, .git, .env, local logs, and test artifacts out of the image — faster builds, smaller images, fewer leaked secrets.
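A typical Node-project .dockerignore might look like this (entries are illustrative — tailor to your stack):

```
node_modules
dist
coverage
.git
.env
*.log
Dockerfile
.dockerignore
```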
How does HEALTHCHECK differ from EXPOSE?
EXPOSE is purely documentation — it tells readers / orchestrators which port the app listens on. HEALTHCHECK actually runs a command periodically and flips the container's status to healthy/unhealthy. Load balancers & orchestrators use the latter to decide whether to route traffic.
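As a sketch, a HEALTHCHECK probing an assumed /health endpoint on port 3000 (both the path and the port are illustrative, and curl must exist in the image):

```dockerfile
# probe every 30s; 3 consecutive failures flip the container to "unhealthy"
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
```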
What happens when you run docker run nginx — step by step?
(1) CLI sends a POST /containers/create to the daemon. (2) Daemon looks for nginx:latest locally; if missing, pulls manifest + layers from Docker Hub. (3) Layers are assembled via overlay2 into a rootfs. (4) Daemon asks containerd → runc to clone() a process with new namespaces + cgroups applied. (5) Process starts as PID 1 inside the container; stdout/stderr stream back to the daemon → CLI.

One-Line Summary

Docker = build immutable images from a Dockerfile + run them as isolated Linux processes using namespaces, cgroups, and a union filesystem — giving you portable, reproducible, dense deployments at near-native performance.