Capability Management and Privilege Reduction

Linux capabilities divide the privileges traditionally held by root into distinct units that can be granted or dropped individually. Containers should run with only the capabilities their workload requires. Docker's default capability set still includes several that most applications never use, such as CAP_NET_RAW, CAP_MKNOD, CAP_SYS_CHROOT, and CAP_SETFCAP. Dropping unneeded capabilities limits what an attacker can do after compromising a container.

Knowing which capabilities an application actually uses is the basis for a secure container configuration. A network service may need CAP_NET_BIND_SERVICE to bind to ports below 1024 but has no use for CAP_NET_ADMIN. A process that changes file ownership may need CAP_CHOWN but not CAP_DAC_OVERRIDE, which bypasses file permission checks altogether. Analyzing capability use systematically during development yields a minimal set that can be carried into production.
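As a rough illustration of per-workload capability sets, the hypothetical Pod manifest below gives each container only the single capability mentioned above. The Pod name, container names, and image references are placeholders rather than a real deployment. Note that Kubernetes capability names omit the CAP_ prefix.

```yaml
# Sketch only: names and images are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: capability-demo
spec:
  containers:
    - name: web                      # network service on a port below 1024
      image: example.com/web:1.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]              # start from an empty set
          add: ["NET_BIND_SERVICE"]  # no NET_ADMIN
    - name: ownership-fixup          # changes file ownership on a shared volume
      image: example.com/fixup:1.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
          add: ["CHOWN"]             # no DAC_OVERRIDE
```

If a container still passes its functional tests without the added capability, that capability can be removed as well. Whether an added capability takes effect can also depend on the user the process runs as, so it is worth verifying inside the running container.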

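# Example: Kubernetes PodSecurityPolicy with restricted capabilities
# Note: PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25,
# so this manifest applies only to older clusters.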
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  allowedCapabilities:
    # Only allow specific capabilities needed by applications
    - NET_BIND_SERVICE  # Bind to ports < 1024
    - SETUID           # Change user IDs
    - SETGID           # Change group IDs
    - KILL             # Send signals to processes
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true

---
# Example: Docker Compose with capability restrictions
version: '3.8'
services:
  web:
    image: nginx:alpine
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
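      - SETUID            # Needed only if nginx starts as root and switches users
      - SETGID            # (likely unnecessary with the non-root user set below)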
    security_opt:
      - no-new-privileges:true
      - apparmor:docker-nginx  # Custom profile; must already be loaded on the host
    read_only: true
    tmpfs:
      - /var/cache/nginx
      - /var/run
      - /tmp
    user: "101:101"  # nginx user
    ports:
      - "8080:80"

The principle of least privilege extends beyond capabilities to user identity. Running containers as a non-root user blocks many privilege escalation paths, since the process never holds UID 0 inside the container. User namespace remapping adds another layer of isolation by mapping container UIDs, including root, to unprivileged UIDs on the host. Combining dropped capabilities with non-root execution limits both what a compromised process can do inside the container and what it can reach on the host.
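A minimal sketch of this combination in Kubernetes terms might look like the following. The Pod name, image, and UID/GID 10001 are arbitrary placeholders, and the image is assumed to work when started as that user. The `hostUsers: false` field requests a user namespace for the Pod and only works on clusters and container runtimes that support the feature. With plain Docker, the comparable setting is the daemon-level `userns-remap` option rather than anything in the service definition.

```yaml
# Sketch only: name, image, and UID/GID are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-demo
spec:
  hostUsers: false          # run in a user namespace (requires cluster/runtime support)
  securityContext:
    runAsNonRoot: true      # kubelet refuses to start containers that would run as UID 0
    runAsUser: 10001        # arbitrary unprivileged UID
    runAsGroup: 10001
  containers:
    - name: app
      image: example.com/app:1.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]     # nothing added back
```

On clusters without user namespace support, the `hostUsers` line would need to be removed. The remaining settings still apply on their own.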