A new CVE (CVE-2016-9962) for the docker container runtime and runc was recently released. Fixed packages have been prepared and shipped for RHEL as well as Fedora and CentOS. This CVE reports that if you exec'd into a running container, the processes inside of the container could attack the process that just entered the container.
If this process had open file descriptors, the processes inside of the container could ptrace the new process, gain access to those file descriptors, and read/write them. They could even potentially get access to the host network or execute commands on the host.
Stopping 0-Days with SELinux
It could do that if you aren’t using SELinux in enforcing mode. If you are, though, SELinux is a great tool for protecting systems from 0-day vulnerabilities.
Note: SELinux can prevent a process from strace-ing another process if the types or MCS Labels are not the same, but when you exec into a container, docker/runc sets the labels to match the container label.
Mainly this is a host-based attack. This is where SELinux steps in to thwart the attack. SELinux is the only thing that protects the host file system from attacks from inside of the container. If the processes inside of the container get access to a host file and attempt to read or write its content, SELinux will check the access.
Example of SELinux protecting the file system.
Let’s look at an example. Imagine you exec'd a process into a malicious container, and that process had a file descriptor open for writing to a file in the user’s homedir. Let’s even imagine it is ~/.bashrc. With this vulnerability the container processes could write to this file, adding lines that would be executed the next time the admin logged in.
What happens then? You’re PWNED.
Let’s examine this from an SELinux point of view. Container processes run as the container_t type. Files in a user’s homedir are labeled either user_home_t or admin_home_t (/root).
# ls -Z ~/.bashrc
system_u:object_r:admin_home_t:s0 /root/.bashrc
# ls -Z ~dwalsh/.bashrc
unconfined_u:object_r:user_home_t:s0 /home/dwalsh/.bashrc
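The field SELinux type enforcement cares about is the third one in each of those labels. Here is a minimal sketch for pulling it out of a context string. The helper names selinux_type and selinux_level are my own for illustration and are not part of any SELinux tool.

```shell
# Print the type field (third colon-separated field) of an SELinux context.
# Hypothetical helper name, used only for this illustration.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Print everything after the type: the MLS/MCS level, e.g. "s0:c1,c2".
# Hypothetical helper name, used only for this illustration.
selinux_level() {
    printf '%s\n' "$1" | cut -d: -f4-
}

selinux_type 'system_u:object_r:admin_home_t:s0'        # admin_home_t
selinux_type 'unconfined_u:object_r:user_home_t:s0'     # user_home_t
selinux_level 'system_u:system_r:container_t:s0:c1,c2'  # s0:c1,c2
```

On a host with GNU coreutils, something like selinux_type "$(stat -c %C ~/.bashrc)" should feed a real file's context into the helper, since %C prints the SELinux security context.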
Here is a simulation of what would happen if a container process had open file descriptors to the .bashrc files in those home directories. Here, runcon changes the SELinux label of the process running echo. The bash shell opens .bashrc for append and leaks the open file descriptor to the echo command. But SELinux prevents the “hello” from actually being appended to the file.
# runcon system_u:system_r:container_t:s0:c1,c2 echo hello >> ~dwalsh/.bashrc
# runcon system_u:system_r:container_t:s0:c1,c2 echo hello >> ~/.bashrc
Here are the generated SELinux error messages (AVCs) from /var/log/audit/audit.log.
NOTE: These are dontaudited by default; semodule -DB turns off dontaudit rules.
type=AVC msg=audit(1484144451.791:20771): avc: denied { append } for pid=22100 comm="echo" path="/root/.bashrc" dev="sda3" ino=1576650 scontext=system_u:system_r:container_t:s0:c1,c2 tcontext=system_u:object_r:admin_home_t:s0 tclass=file permissive=0
type=AVC msg=audit(1484144534.340:21027): avc: denied { append } for pid=22479 comm="echo" path="/home/dwalsh/.bashrc" dev="dm-0" ino=262758 scontext=system_u:system_r:container_t:s0:c1,c2 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=0
SELinux blocked the container process from being able to write to the .bashrc file even if bash opened the descriptor to it and passed it to the container process.
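AVC records are long, so it can help to boil each one down to "who tried to do what to whom". The filter below is a rough sketch that assumes the record layout shown above (a single line with scontext=, tcontext= and tclass= fields). The name avc_summary is hypothetical and is not part of the audit tooling.

```shell
# Read AVC denial records on stdin and print one summary line per record:
#   <permission> <source type> -> <target type> (<object class>)
# avc_summary is a hypothetical helper written for this illustration.
avc_summary() {
    awk '
        /avc: +denied/ {
            # The permission list sits between the braces.
            perm = $0
            sub(/.*[{] */, "", perm)
            sub(/ *[}].*/, "", perm)
            for (i = 1; i <= NF; i++) {
                # The third colon-separated field of a context is the type.
                if ($i ~ /^scontext=/) { split($i, s, ":"); src = s[3] }
                if ($i ~ /^tcontext=/) { split($i, t, ":"); tgt = t[3] }
                if ($i ~ /^tclass=/)   { cls = substr($i, 8) }
            }
            printf "%s %s -> %s (%s)\n", perm, src, tgt, cls
        }'
}

# Example with one of the records from this post:
printf '%s\n' 'type=AVC msg=audit(1484144451.791:20771): avc: denied { append } for pid=22100 comm="echo" path="/root/.bashrc" dev="sda3" ino=1576650 scontext=system_u:system_r:container_t:s0:c1,c2 tcontext=system_u:object_r:admin_home_t:s0 tclass=file permissive=0' | avc_summary
```

To run it over the real log you would redirect /var/log/audit/audit.log into the function, which normally requires root.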
How about a more visually enticing demo? Check out this animation:
What about network connections?
Now let’s look at what would have happened if the process exec'd into the container had an open socket to the internet.
Most likely the user process that is joining the container would be labeled unconfined_t, the default type for all SELinux users. When you open a tcp_socket, SELinux assigns the label of the process to the socket by default, so the socket would end up being labeled unconfined_t. Now if the container processes attempt to write to the open socket, SELinux would check the access. Here I use the sesearch tool to query the SELinux policy to see whether this access would be allowed.
# sesearch -A -s container_t -t unconfined_t -c tcp_socket -p write
Are you setenforce 1?
As you can see, no allow rules are there, so SELinux would deny access. When we heard about this vulnerability we were glad to see that our customers were safe if running containers with setenforce 1.
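If you are not sure which mode a host is in, getenforce will tell you. As a small sketch of the same check for a script, the function below reads the selinuxfs enforce file directly. The name selinux_mode and its optional path argument are my own assumptions for illustration, and treating a missing file as "Disabled" is a simplification.

```shell
# Report the SELinux mode by reading the selinuxfs "enforce" file.
# selinux_mode is a hypothetical helper; the optional argument exists only
# so the function can be pointed at a different file for testing.
selinux_mode() {
    enforce_file="${1:-/sys/fs/selinux/enforce}"
    if [ ! -r "$enforce_file" ]; then
        # Assumption: no readable selinuxfs means SELinux is not active.
        echo "Disabled"
        return 1
    fi
    case "$(cat "$enforce_file")" in
        1) echo "Enforcing" ;;
        0) echo "Permissive"; return 1 ;;
        *) echo "Unknown"; return 1 ;;
    esac
}

# Succeeds only in enforcing mode, so it can gate a warning.
selinux_mode || echo "WARNING: SELinux is not enforcing on this host" >&2
```

The non-zero return for anything other than enforcing mode is a design choice so the function can be used directly in an if or || chain.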