Differentiating C-state and P-state in server power management

C-state and P-state both offer options for managing system power, but they are very different from one another.

What is the difference between C-states and P-states in server power management? How do these states relate to hypervisor power management? Should I define my own power policies?

C-states and P-states both relate to system power management, but they describe very different things. C-states are processor idle states (except for C0, where the processor is running normally). States like C1, C1E, C3 and C6 indicate progressively deeper power conservation modes that a processor core can enter when it is idle. By the time it reaches C6, the core is almost completely powered off. Users generally don't notice the effects of C-states because the core is idle anyway. For example, dimming a room light can save a little power, and turning a room light off completely can save much more power. If you're powering down the light because nobody is in the room, nobody will notice. The same idea is true for processors. If the core has nothing to do, users and workloads won't notice if the core is slowed or almost completely off.
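As a rough illustration of how these idle states show up in practice, the sketch below reads the per-core idle-state information that Linux exposes through its cpuidle sysfs interface. It assumes a Linux host with a cpuidle driver loaded; the function name `read_idle_states` is made up for this example, and the set of states reported will differ by processor, firmware and kernel.

```python
from pathlib import Path


def read_idle_states(cpu_dir):
    """Return a list of dicts describing each idle state exposed for one CPU.

    cpu_dir is a path such as /sys/devices/system/cpu/cpu0 (Linux cpuidle).
    An empty list is returned when no cpuidle directory is present.
    """
    states = []
    cpuidle = Path(cpu_dir) / "cpuidle"
    if not cpuidle.is_dir():
        return states  # no cpuidle driver loaded, or not a Linux host

    # Sort numerically so that state10 is listed after state9.
    state_dirs = sorted(cpuidle.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    for state_dir in state_dirs:
        states.append({
            "name": (state_dir / "name").read_text().strip(),
            # Time needed to leave this state, in microseconds.
            "exit_latency_us": int((state_dir / "latency").read_text()),
            # Number of times the core has entered this state.
            "entries": int((state_dir / "usage").read_text()),
            # Total time spent in this state, in microseconds.
            "time_us": int((state_dir / "time").read_text()),
        })
    return states
```

Calling `read_idle_states("/sys/devices/system/cpu/cpu0")` on such a host should list the idle states for the first core, with deeper states typically showing longer exit latencies.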

By comparison, P-states represent operating conditions where the processor is saving power while still performing useful work. A common example of a P-state is found in the system's power profile, where a "low power" profile will lower the processor voltage and clock frequency. This higher-numbered P-state (P0 is the fastest) can save significant power, but it will reduce the workload's performance (though the result may still be perfectly acceptable, depending on how critical the affected workloads are). It's important to note that C-states and P-states are independent.
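To see why lowering both voltage and clock frequency saves so much power, it helps to run the numbers with the common first-order approximation that dynamic power scales with voltage squared times frequency. The sketch below is only that approximation: it ignores static (leakage) power, and the voltage and frequency values are hypothetical, chosen to make the arithmetic easy to follow.

```python
def relative_dynamic_power(voltage, freq_ghz, base_voltage, base_freq_ghz):
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    The capacitance term C cancels because both points describe the same chip.
    Leakage power is ignored, so treat the result as a rough guide only.
    """
    return (voltage / base_voltage) ** 2 * (freq_ghz / base_freq_ghz)


# Hypothetical operating points, not taken from any real processor.
p0 = {"voltage": 1.20, "freq_ghz": 3.0}  # highest-performance P-state
pn = {"voltage": 0.90, "freq_ghz": 1.5}  # a lower-power P-state

ratio = relative_dynamic_power(pn["voltage"], pn["freq_ghz"],
                               p0["voltage"], p0["freq_ghz"])
print(f"Dynamic power at the lower P-state: {ratio:.0%} of baseline")
```

With these made-up numbers, halving the frequency and dropping the voltage by a quarter cuts dynamic power to roughly 28% of the baseline, while throughput falls by about half. That asymmetry is the reason P-states trade performance for power so effectively.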

Generally, the three traditional power profiles -- high performance, balanced and low power -- are adequate for the vast majority of deployment scenarios. Many enterprise servers will be configured for high performance or balanced modes. It's rare to see enterprise servers configured for a low power profile in a production setting, though they may appear in nonproduction deployments like test and development.

It is certainly acceptable to establish custom power policies through a hypervisor like VMware ESXi. For example, suppose an ESXi parameter defines the number of times per second that the hypervisor evaluates the P-state for each processor core. Tuning such a parameter pays off only with a keen awareness of workload needs and behaviors. Otherwise, it's wasted effort that could just as easily compromise workload performance while saving little power. Most organizations simply use the default profiles.

Is it worth enabling the processor's "turbo mode" when a hypervisor controls server power?

As with C-states, it's a good idea to enable any enhanced P-state (such as a turbo mode) available in the system BIOS. Turbo Boost is a feature found on some Intel Xeon processors that works similarly to overclocking in that it allows CPU cores to run faster than their base frequency. However, it's important to evaluate the interaction of C-states with any "turbo" behavior. For example, single-threaded (or lightly threaded) workloads can often see a performance boost when a turbo mode is enabled on top of deep C-states like C3 or C6.

This works because deep C-states put the processor's unused cores into an inactive state, freeing power and thermal headroom so that the remaining active cores can run at a much higher clock speed and improve workload performance (at the cost of more power and heat in those cores). But when all the cores are working normally, turbo mode may not be able to significantly increase the clock speed, limiting its benefit.

Similarly, workloads that do rely on processor multithreading can be extremely sensitive to the kind of latency introduced when processors return from deep C-states, so a turbo mode may not benefit a multithreaded workload when deep C-states are enabled. In this case, it may be necessary to disable the deepest (or all) C-states in BIOS.
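One way to reason about which C-states to disable is to compare each state's wake-up (exit) latency against the latency a workload can tolerate. The sketch below is a minimal illustration of that comparison. The `IdleState` type, the `states_over_budget` helper and the latency figures are all hypothetical; real exit latencies depend on the processor and firmware.

```python
from dataclasses import dataclass


@dataclass
class IdleState:
    name: str
    exit_latency_us: int  # time to return to C0, in microseconds


def states_over_budget(states, budget_us):
    """Return the names of idle states whose exit latency exceeds the budget."""
    return [s.name for s in states if s.exit_latency_us > budget_us]


# Hypothetical exit latencies, for illustration only.
states = [
    IdleState("C1", 2),
    IdleState("C1E", 10),
    IdleState("C3", 80),
    IdleState("C6", 130),
]

# Candidates to disable for a workload that tolerates 50 microseconds.
print(states_over_budget(states, budget_us=50))  # ['C3', 'C6']
```

A result like this only suggests where to start; whether disabling those states actually helps still has to be confirmed by measurement.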

Any changes to power settings should include a careful benchmark initiative to gauge workload performance before and after any changes. This allows IT professionals to evaluate the impact of power settings and determine if the power savings benefits are worth any noticeable decline in workload performance. Also remember that one setting may not apply to every server, and mission-critical servers may be configured to run in high-performance mode, while secondary servers may be able to take advantage of power savings in a balanced -- or even low power -- mode.
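A before-and-after comparison of this kind can be reduced to a simple acceptance check. The following sketch assumes that throughput and power draw have been sampled several times under each setting. The function names, the 5% tolerance and the sample values are hypothetical and would need to reflect an organization's own service-level requirements.

```python
from statistics import mean


def percent_change(before, after):
    """Percent change between the means of two sample lists (positive = increase)."""
    b, a = mean(before), mean(after)
    return (a - b) / b * 100.0


def worth_keeping(perf_before, perf_after, watts_before, watts_after,
                  max_perf_loss_pct=5.0):
    """Accept a new power setting only if it saves power and any
    throughput loss stays within the allowed percentage."""
    perf_delta = percent_change(perf_before, perf_after)
    power_delta = percent_change(watts_before, watts_after)
    return power_delta < 0 and perf_delta >= -max_perf_loss_pct


# Hypothetical benchmark samples (requests per second and watts).
perf_before = [1000, 1010, 990]
perf_after = [970, 980, 960]
watts_before = [300, 310, 290]
watts_after = [255, 260, 250]

print(f"Throughput change: {percent_change(perf_before, perf_after):+.1f}%")
print(f"Power change:      {percent_change(watts_before, watts_after):+.1f}%")
print("Keep new setting:", worth_keeping(perf_before, perf_after,
                                         watts_before, watts_after))
```

In this made-up case, a 3% throughput loss buys a 15% power reduction, so the setting passes the 5% tolerance. A mission-critical server might use a tolerance of zero, which would reject the same change.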
