IBM engineer Pratik Sampat has released a first prototype of a CPU namespace interface for the Linux kernel. The CPU namespace is designed to address inconsistencies in the current ways of viewing available CPU resources, as well as potential security issues that arise when a workload can learn how resources are accessed and placed on the system.
One of the driving forces behind this Linux CPU namespace proposal is the fragmented way processor resources are viewed and managed today: “The control and display interfaces are quite disjointed. Restrictions can be set through control interfaces such as cgroups, while many applications, legacy or otherwise, get their view of the system through sysfs / procfs and allocate resources such as the number of threads / processes and memory based on this information. This can lead to unexpected execution behaviors and have a significant impact on performance.”
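The mismatch described in the quote can be observed from user space. The following is a minimal Python sketch, not part of the RFC: it compares the CPU list that sysfs reports as online with the set of CPUs the current task is actually allowed to run on. The helper names `parse_cpu_list` and `cpu_views` are hypothetical and chosen for illustration only; the sysfs path and `os.sched_getaffinity` are standard on Linux.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-3,8" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_views():
    """Return (CPUs sysfs reports online, CPUs this task may be scheduled on)."""
    try:
        with open("/sys/devices/system/cpu/online") as f:
            online = parse_cpu_list(f.read())
    except OSError:
        # Fallback when sysfs is unavailable (non-Linux or restricted mount).
        online = list(range(os.cpu_count() or 1))
    if hasattr(os, "sched_getaffinity"):
        # Reflects affinity masks and cpuset restrictions for this task.
        allowed = sorted(os.sched_getaffinity(0))
    else:
        allowed = online
    return online, allowed


if __name__ == "__main__":
    online, allowed = cpu_views()
    print(f"sysfs online CPUs: {len(online)}, schedulable CPUs: {len(allowed)}")
```

Inside a container limited by a cpuset, the two counts can differ. An application that sizes its thread pool from the first number may start more workers than it has CPUs to run them on.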
Meanwhile, the RFC cover letter describes the existing alternatives as less than ideal: “Existing solutions to the problem include user space tools such as LXCFS, which can fake the sysfs information by mounting over the sysfs online file so that it is consistent with the limits set through cgroup cpuset. However, LXCFS is an external solution and must be explicitly configured for applications that require it. Another concern is that tools such as LXCFS do not handle all the other display mechanisms, such as procfs load statistics.”
The security implications described include “a case where an actor who is aware of the CPU node topology can schedule workloads and select CPUs such that the bus is flooded, causing a denial of service attack” or “a case where identifying the CPU system topology can help identify cores close to buses and peripherals such as GPUs to gain an undue latency advantage over the rest of the workloads.”
The IBM-led CPU namespace proposal therefore has the following design:
This prototype patch set introduces a new kernel namespace mechanism – the CPU namespace.
The CPU namespace isolates CPU information by virtualizing logical CPU IDs and creating a scrambled virtual CPU map of them. It attaches to the task_struct, and the CPU translations are designed as a flat hierarchy, which means that each virtual CPU in a namespace is mapped to a physical CPU when the namespace is created. The advantage of a flat hierarchy is that translations are O(1) and child namespaces do not need to walk up the tree to retrieve a translation.
This namespace then allows both the control and display interfaces to be aware of the CPU namespace context, so that a task inside a namespace sees, and can therefore control, only the CPU resources available to it through its virtual CPU map.
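The flat translation scheme can be illustrated with a small user-space model. This is only a sketch in Python under stated assumptions, not the kernel patch set: the class `CpuNamespace` and its methods are hypothetical names, and a seeded shuffle stands in for whatever scrambling the prototype actually uses. What it shows is the property the design relies on: a child namespace stores physical CPU IDs directly, so a lookup never has to consult its parent.

```python
import random


class CpuNamespace:
    """Toy model of a CPU namespace with a flat virtual-to-physical table."""

    def __init__(self, physical_cpus, seed=None):
        shuffled = list(physical_cpus)
        # Scramble the order so virtual IDs do not reveal physical topology.
        random.Random(seed).shuffle(shuffled)
        # Virtual IDs are 0..n-1; each maps straight to a physical CPU.
        self.v2p = dict(enumerate(shuffled))
        self.p2v = {p: v for v, p in self.v2p.items()}

    def child(self, virtual_cpus, seed=None):
        """Create a nested namespace limited to some of this namespace's CPUs.

        The child is built from physical IDs resolved at creation time, so
        its own lookups stay flat and never walk back up to the parent.
        """
        return CpuNamespace([self.v2p[v] for v in virtual_cpus], seed)

    def to_physical(self, vcpu):
        """Translate a virtual CPU ID with a single dictionary lookup."""
        return self.v2p[vcpu]

    def to_virtual(self, pcpu):
        """Translate a physical CPU ID back into this namespace's view."""
        return self.p2v[pcpu]

    def online(self):
        """The CPU IDs a task inside this namespace would see."""
        return sorted(self.v2p)


if __name__ == "__main__":
    root = CpuNamespace(range(8), seed=1)
    nested = root.child([0, 1, 2], seed=2)
    for vcpu in nested.online():
        print(f"virtual CPU {vcpu} -> physical CPU {nested.to_physical(vcpu)}")
```

A task in `nested` sees CPUs numbered from 0 regardless of which physical CPUs back them. The cost of this design choice is that every namespace holds a full table of its own, in exchange for constant-time translation.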
In a CPU namespace experiment with the Nginx web server, “With the CPU namespace, we see the correct number of PIDs spawned, corresponding to the defined cpuset limits. Memory usage drops by 92 to 95%, latency is reduced by 64%, and throughput, such as requests and transfer per second, is unchanged.”
There are still a number of known shortcomings in the current design, but the initial performance numbers are exciting. More details on this series of “RFC” patches for the Linux CPU namespace interface can be found through this mailing list thread. There are also more details of the effort via this web page.