I am well into my company’s physical to virtualization (P2V) migration for most general purpose server systems, and it’s been pretty successful. But as our environment grew, we experienced a problem involving our virtual systems and our antivirus management system. In this blog post, I’ll explain the problem and tell you the solution so you can avoid a similar situation.
For most server systems, regardless of whether they are physical or virtual, maintaining a centrally managed antivirus package is a good strategy. This strategy includes regular definition updates, engine updates, policies for exclusions and scheduled full scans.
Let’s talk about the scheduled full scan. Historically, we regularly ran a full antivirus scan of the local file system on both the physical and virtual servers during off hours. This became a problem as the virtual environment became more populated.
We use the vkernel capacity analyzer and chargeback virtual appliance to monitor the performance of our virtual environment. What I noticed is that during the off time, we had an incredible spike in CPU utilization across all hosts and virtual machines. This spike was about 300% of our average CPU use for about two hours. We initially wanted to blame it on the full backup that happens close to this timeframe, but closer investigation led us elsewhere.
We had noticed that the CPU spike occurred on guest systems that are in isolated networks for stage-configure or isolation test roles. With the isolated systems, it was determined that the spike would not be caused by the full backup, as the the isolated systems were not able to communicate with the backup mechanism.
Avoiding the CPU spike
Once we determined that the scheduled full antivirus scan on the local file system was the culprit, we decided that a staggered set of full scans were required to avoid this massive spike. On physical systems with local processing, this is not a big issue, as they are generally idle. But applied to the virtual environment, this may cause unnecessary virtual machine migrations or performance alerts. So, in your migration strategies, be sure to consider any centrally scheduled activity like this and how it may affect your entire infrastructure — both physical and virtual.