Reducing CPU ready time will get your hungry VMs a seat at the table

Throwing more vCPUs at a slow VM will not improve performance when CPU ready time is high.

What is CPU ready time and how does it affect VM performance?

VMware defines CPU ready time as the "percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPUs." That sounds straightforward to VMware administrators, but try giving that explanation to management or an executive complaining about a slow VM.

The other day I had a developer run into my office complaining about a slow VM. He told me that he had four vCPUs and would need eight, or maybe even 16 to get the performance he needed. Looking at performance charts, I could see vCPU usage was averaging 10% over the last hour and his ready state was averaging 10,000 milliseconds. I, being a calm VMware guy, told the developer that instead of four, eight or 16 vCPUs, he'd actually need just two.

He responded by asking if I was crazy. I would have preferred to give him one vCPU, but I didn't think he'd go for that. We struck a deal: I would give him two vCPUs and resign if I was wrong, and he would buy me lunch if I was right. I had him shut down the server and I changed it from four vCPUs to two. An hour later, he came back and asked where I would like him to take me for lunch. vCPU usage was 60% on average and the ready state was 10 milliseconds. Not only could he finish parsing logs in about 10 minutes, but the six-hour backlog was processed in 30 minutes. How, against all common sense, do two vCPUs do more work than four? The answer is ready state.
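To line up the millisecond figures from the charts with VMware's percentage definition, here is a minimal sketch of the conversion. It assumes the numbers came from the vSphere real-time chart, which samples every 20 seconds; the story does not say which chart was used, so treat `interval_s` as an assumption. The function name `ready_percent` and the optional per-vCPU division are mine, for illustration only.

```python
def ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the sample interval.

    interval_s defaults to 20 seconds, the real-time chart interval (an
    assumption about where the numbers in the story came from).
    vcpus > 1 spreads a VM-wide total evenly across its vCPUs, which assumes
    the figure is a sum over all vCPUs rather than a single-vCPU reading.
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms / (interval_s * 1000.0) / vcpus * 100.0


# Before the change: 10,000 ms of ready time in each 20-second sample.
print(ready_percent(10_000))           # 50.0
print(ready_percent(10_000, vcpus=4))  # 12.5 if the total is spread over four vCPUs

# After the change: 10 ms per sample, a small fraction of one percent.
print(ready_percent(10, vcpus=2))
```

Under those assumptions, the VM spent roughly half of every sample waiting to be scheduled before the change and almost none of it afterward.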

As we went to lunch, I thought about how to explain how ready state works in a way anyone can understand.

Think of the server's CPUs as a family-style restaurant, where everyone sits together at a long table that seats 24 people. You show up with a group of eight people, without a reservation. As you're waiting, two people leave, and a group of two people who just came in are immediately offered their spots at the table. This scenario could go on for a while, with two leaving and two new guests taking their place as you continue to wait. Finally, after 90 minutes of waiting, your group is seated.

This is similar to how ready state works. Remember, just because it's a virtual CPU doesn't mean the basic principles of how a CPU works are thrown out the window. In order for eight vCPUs to work together, they need eight physical CPUs available at the same time -- just like in the restaurant, where you would need eight open seats. In this example, CPU ready time is the amount of time that your party was ready to sit, but had to wait for an opening.

So, if you have 24 CPUs (or seats at a table) and you have one vCPU (or one guest), you have 24 potential opportunities to sit, for 24:1 odds. If you have eight vCPUs (or a group of eight hungry diners), the table can hold only three groups of that size (24 divided by eight), so you have a lower chance of getting seats -- 3:1 odds. So, a single virtual CPU may have to work harder, but it will get more time on the physical CPUs -- eight times (or more) as much time, depending on the workloads that share those CPUs.
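The seat arithmetic above can be written out in a few lines. This is only a sketch of the restaurant analogy, not a model of how the hypervisor's scheduler actually places vCPUs, and the name `seating_options` is hypothetical.

```python
def seating_options(seats: int, party_size: int) -> int:
    """Count how many whole parties of this size fit at the table at once."""
    if seats <= 0 or party_size <= 0:
        raise ValueError("seats and party_size must be positive")
    return seats // party_size


# A 24-seat table (24 physical CPUs) and parties of one to eight (vCPUs per VM).
for vcpus in (1, 2, 4, 8):
    print(f"{vcpus} vCPU(s): {seating_options(24, vcpus)}:1")
```

For a single guest this gives 24:1 and for a party of eight it gives 3:1, the eightfold gap described above.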
