Privilege Escalation via Docker / Container Escapes (Conceptual Overview)
Containers (Docker, LXC, Podman) provide operating-system-level virtualization. They share the host kernel but isolate processes, filesystems, and networks. Container escape occurs when a process breaks this isolation boundary and gains access to the host operating system.
What Is Container Escape?
Container escape is a privilege escalation technique where a process running inside a container breaks the isolation boundary and executes code directly on the host system with host privileges.
Since containers share the host kernel, a kernel vulnerability or misconfiguration can allow escape.
Container: Shares host kernel. Lightweight. Less isolation.
Virtual Machine: Separate kernel. Heavyweight. Strong isolation.
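Any user in the docker group can mount the host filesystem and execute commands as root without any exploit.
This is not a vulnerability; it is by design, because group members can talk to the Docker daemon, which runs as root. Granting that membership casually is a critical misconfiguration.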
How Container Escapes Happen (High-Level)
Misconfiguration-Based
- Privileged containers: all capabilities, host device access
- Host filesystem mounts: `-v /:/host` gives host access
- Docker group membership: full root without password
- `--pid=host`: process namespace sharing
- `--network=host`: network namespace sharing
Vulnerability-Based (Kernel and Runtime)
- CVE-2019-5736: runc container escape (overwrite of the host runc binary)
- CVE-2022-0492: cgroup v1 release_agent escape (kernel)
- CVE-2024-21626: runc leaked file descriptor allowing a working-directory escape
- Dirty Pipe (CVE-2022-0847): host kernel exploit
High-Risk Container Misconfigurations
| Misconfiguration | Description | Risk Level | Defensive Action |
|---|---|---|---|
| Docker group membership | Non-root user in docker group | CRITICAL | Remove user from docker group. Use sudo. |
| Privileged containers | --privileged flag used | CRITICAL | Never use privileged containers. Drop capabilities. |
| Host filesystem mounts | -v /:/host or --mount type=bind,source=/,target=/host | CRITICAL | Avoid host mounts. Use named volumes. |
| --pid=host | Shares host process namespace | HIGH | Avoid unless absolutely necessary |
| --network=host | Shares host network namespace | HIGH | Avoid; use bridge networks |
| Capabilities not dropped | CAP_SYS_ADMIN, CAP_DAC_OVERRIDE present | HIGH | --cap-drop=ALL, add only required |
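
The table above can be turned into a quick pre-flight check on a `docker run` argument list. The following is a minimal sketch, not an exhaustive linter: `audit_run_flags` is a hypothetical helper name, it only recognizes the `--flag=value` spellings shown in the table, and the severity labels are copied from the table rather than from any Docker tooling.

```shell
# Hypothetical helper: flag risky options in a docker run argument list.
# Severity labels mirror the table above; only "--flag=value" forms are matched.
audit_run_flags() {
    for arg in "$@"; do
        case "$arg" in
            --privileged|--privileged=true)
                echo "CRITICAL: privileged container ($arg)" ;;
            /:*|--volume=/:*|*source=/,*|*source=/|*src=/,*|*src=/)
                # host root as a bind source (-v /:/host or --mount ...,source=/,...)
                echo "CRITICAL: host root bind mount ($arg)" ;;
            --pid=host)
                echo "HIGH: host PID namespace ($arg)" ;;
            --network=host)
                echo "HIGH: host network namespace ($arg)" ;;
            --cap-add=SYS_ADMIN|--cap-add=CAP_SYS_ADMIN|--cap-add=ALL)
                echo "HIGH: broad capability added ($arg)" ;;
        esac
    done
}
```

For example, `audit_run_flags --privileged -v /:/host ubuntu` prints two CRITICAL lines, while a command that uses only named volumes and no host namespaces prints nothing. Space-separated forms such as `--network host` are not detected by this sketch.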
Real-World Examples (Defensive View)
Situation: A developer is added to the docker group for "convenience" to run containers without sudo.
Three years later: Developer changes teams but group membership is never removed. Account is compromised via phishing.
Impact: Attacker runs:
docker run -v /:/mnt -it ubuntu chroot /mnt bash
Result: Full host root access. Entire server compromised.
Defense: Quarterly group membership audits. Never add users to docker group.
Situation: An administrator runs a privileged container to debug a networking issue.
docker run --privileged -it ubuntu bash
Problem: The container is left running. A separate web application vulnerability allows RCE inside the container.
Impact: From the privileged container, the attacker:
- Mounts host disk devices
- Adds SSH key to host root
- Installs persistence mechanism
Defense: Never run privileged containers. Use --cap-add for specific needs.
Situation: Container runs with --read-only for security, but /tmp is mounted as writable.
Vulnerability: Kernel flaw allows mounting filesystems inside user namespaces.
Result: Attacker mounts new filesystem, bypasses read-only restriction.
Defense: Combine --read-only with --tmpfs /tmp:noexec,nosuid and kernel patches.
Detecting Container Escape Risks
Docker Audit Commands
- `groups | grep docker`: check docker group membership
- `docker ps --quiet | xargs docker inspect --format='{{.Name}} {{.HostConfig.Privileged}}'`: find privileged containers
- `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} Mounts: {{.HostConfig.Binds}}'`: find host mounts
- `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} CapDrop: {{.HostConfig.CapDrop}}'`: check dropped capabilities
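
The output of the privileged-container command above is one `name flag` pair per line, which is easy to filter. The sketch below assumes exactly that two-column format; `list_privileged` is a hypothetical name, and the function reads from standard input so it can be tried on sample text without a Docker daemon.

```shell
# Hypothetical helper: read "<name> <true|false>" lines on stdin and print
# the names of privileged containers, without the leading "/" that
# docker inspect puts in front of container names.
list_privileged() {
    awk '$2 == "true" { sub(/^\//, "", $1); print $1 }'
}

# Intended use (requires a running Docker daemon):
#   docker ps --quiet \
#     | xargs docker inspect --format='{{.Name}} {{.HostConfig.Privileged}}' \
#     | list_privileged
```

Any name it prints would be a CRITICAL finding under the table earlier in this section.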
Host Detection
- auditd rules for the mount and setns syscalls (setns is the syscall nsenter relies on)
- Falco/Tracee: runtime security monitoring
- Docker Bench Security: automated assessment
- Unusual process ancestry (container parent -> host child)
Preventing Container Escapes
Runtime Security
- Never add users to docker group: use sudo
- Avoid privileged containers: use `--cap-drop=ALL`
- Read-only root filesystem: `--read-only` flag
- No new privileges: `--security-opt=no-new-privileges:true`
- Drop all capabilities: add only required
Isolation & Configuration
- User namespaces: `userns-remap` daemon setting (Docker's `--userns` run flag only supports `host`, which opts a container out)
- Resource limits: `--memory`, `--cpus`
- AppArmor/SELinux: enable confinement
- Seccomp profiles: restrict syscalls
- Image scanning: Trivy, Clair, Docker Scout
Secure Docker Run Example
    docker run -d \
      --name web-secure \
      --read-only \
      --security-opt=no-new-privileges:true \
      --cap-drop=ALL \
      --cap-add=NET_BIND_SERVICE \
      --user 1000:1000 \
      --tmpfs /tmp:rw,noexec,nosuid,size=128m \
      --tmpfs /var/run:rw,noexec,nosuid \
      --security-opt=apparmor=docker-default \
      nginx:alpine
This container runs as non-root, read-only, with only necessary capabilities, and no new privileges.
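
Whether a given `docker run` command carries the baseline flags from the example above can be checked mechanically. This is a rough sketch under stated assumptions: `missing_hardening` is a hypothetical helper, the choice of three "baseline" flags is taken from the example rather than from any standard, and only the `--flag=value` spellings are recognized.

```shell
# Hypothetical helper: report baseline hardening flags absent from a
# docker run argument list. Exits 0 only when all three are present.
# The body runs in a subshell so its variables do not leak to the caller.
missing_hardening() (
    need_cap=1; need_ro=1; need_nnp=1
    for arg in "$@"; do
        case "$arg" in
            --cap-drop=ALL|--cap-drop=all)
                need_cap=0 ;;
            --read-only|--read-only=true)
                need_ro=0 ;;
            --security-opt=no-new-privileges|--security-opt=no-new-privileges:true|--security-opt=no-new-privileges=true)
                need_nnp=0 ;;
        esac
    done
    [ "$need_cap" -eq 0 ] || echo "missing: --cap-drop=ALL"
    [ "$need_ro" -eq 0 ]  || echo "missing: --read-only"
    [ "$need_nnp" -eq 0 ] || echo "missing: --security-opt=no-new-privileges:true"
    [ $((need_cap + need_ro + need_nnp)) -eq 0 ]
)
```

Called with the arguments of the example above it prints nothing and succeeds; called as `missing_hardening --read-only nginx:alpine` it lists the two absent flags and returns a non-zero status, which makes it usable as a gate in a deployment script.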
Key Takeaways
- Docker group = root: never add users to docker group
- Privileged containers = host access: never use --privileged
- Drop all capabilities: default containers have too many privileges
- Read-only + no-new-privileges: limits container modification and setuid-based escalation
- User namespaces: map container root to unprivileged host user
- Patch kernels and runtimes: vulnerability-based escapes target kernel and runc flaws
Docker/Container: Command Awareness (Defensive Auditing)
Commands used by security teams and system administrators to audit container environments. Shown for defensive hardening and verification only.
Docker Group Audit
- Check if current user is in docker group: `groups | grep docker`
  If output contains "docker", user has root-equivalent access.
- List all users in docker group: `getent group docker`
  Audit quarterly; remove any non-admin users.
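
For a recurring audit, the `getent group docker` line can be split into one member per line so it can be diffed against an approved list. The sketch below assumes the standard `name:password:gid:member,member` group format; `group_members` is a hypothetical helper name.

```shell
# Hypothetical helper: print one member per line from a group entry such as
# "docker:x:999:alice,bob" (field 4 is the comma-separated member list).
group_members() {
    printf '%s\n' "$1" | cut -d: -f4 | tr ',' '\n' | sed '/^$/d'
}

# Intended use on a host with a docker group:
#   group_members "$(getent group docker)"
```

The member field only lists supplementary members, so an account whose primary group is docker would not appear here and has to be checked separately.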
Container Runtime Audit
- List running containers: `docker ps`
- Find privileged containers: `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} Privileged: {{.HostConfig.Privileged}}'`
  Privileged: true = CRITICAL finding
- Find containers with host filesystem mounts: `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} Mounts: {{.HostConfig.Binds}}'`
  Mounts containing / or /etc = CRITICAL
- Check dropped capabilities: `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} CapDrop: {{.HostConfig.CapDrop}}'`
  Empty CapDrop list = no capabilities dropped, excessive privileges
- Check read-only root filesystem: `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} ReadOnly: {{.HostConfig.ReadonlyRootfs}}'`
  ReadOnly: false = writable container filesystem
- Check user namespace: `docker ps --quiet | xargs -I {} docker inspect {} --format='{{.Name}} UsernsMode: {{.HostConfig.UsernsMode}}'`
  UsernsMode: host = container opted out of user namespace remapping
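
The host-mount rule above ("/ or /etc = CRITICAL") can be applied to individual bind specs. This is a minimal sketch: `classify_bind` is a hypothetical helper, it expects one `source:target[:options]` string as printed in `HostConfig.Binds`, and the middle REVIEW tier for other absolute host paths is an assumption added for illustration, not something the rule above defines.

```shell
# Hypothetical helper: classify one bind spec by its host-side source path.
# Runs in a subshell so the temporary variable does not leak to the caller.
classify_bind() (
    src=${1%%:*}                      # text before the first ":" is the source
    case "$src" in
        /|/etc|/etc/*) echo "CRITICAL $1" ;;  # host root or /etc exposed
        /*)            echo "REVIEW $1" ;;    # other host path (assumed tier)
        *)             echo "OK $1" ;;        # named volume, not a host path
    esac
)
```

For instance, `classify_bind /:/host` prints `CRITICAL /:/host`, while `classify_bind appdata:/data` prints `OK appdata:/data`.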
Image Security Audit
- Scan image for vulnerabilities: `docker scout cves nginx:latest` (the older `docker scan` command has been retired)
  Or use Trivy: `trivy image nginx:latest`
- Check image user: `docker inspect --format='{{.Config.User}}' nginx:latest`
  Blank or "0" = root. Should be non-root user.
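
The "blank or 0 means root" rule can be wrapped in a small predicate for use in scripts. The sketch below is an illustration only: `runs_as_root` is a hypothetical name, and treating the literal user name `root` as root (in addition to blank and `0`) is an assumption beyond the rule stated above.

```shell
# Hypothetical helper: succeed (exit 0) if an image's Config.User value
# means the container starts as root.
runs_as_root() (
    user=${1%%:*}            # drop an optional ":group" suffix, e.g. "0:0"
    case "$user" in
        ""|0|root) exit 0 ;; # blank, UID 0, or the name "root"
        *)         exit 1 ;;
    esac
)

# Intended use (requires Docker and a locally available image):
#   runs_as_root "$(docker inspect --format='{{.Config.User}}' nginx:latest)" \
#     && echo "image runs as root"
```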
Remediation Commands (Defensive)
- Remove user from docker group: `sudo gpasswd -d username docker`
- Recreate container without privileged mode: `docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE ...`
- Enable user namespace remapping: edit /etc/docker/daemon.json so that it contains `{ "userns-remap": "default" }`, then restart the Docker daemon
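
Whether remapping is configured can be verified with a simple text check on the daemon configuration file. This is a rough sketch, assuming the setting is written on a single line as in the snippet above; `has_userns_remap` is a hypothetical helper and it pattern-matches with grep rather than parsing JSON.

```shell
# Hypothetical helper: succeed if the given daemon.json-style file sets a
# non-empty "userns-remap" value. Text match only, not a JSON parser.
has_userns_remap() {
    grep -Eq '"userns-remap"[[:space:]]*:[[:space:]]*"[^"]+"' "$1"
}

# Intended use:
#   has_userns_remap /etc/docker/daemon.json || echo "userns-remap not set"
```

A file that spreads the key and value across lines, or sets the option through a dockerd command-line flag instead, would not be detected by this check.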
Defender Takeaways
- Audit weekly: docker group, privileged containers, host mounts
- Harden: drop all capabilities, read-only rootfs, no-new-privileges
- Monitor: Falco/Tracee for runtime container escapes
- Scan: all images for vulnerabilities and root users