Prevent a rogue server from bringing down your data center

Don't let unplanned changes take a toll on your virtualized environment. Stop rogue servers in their tracks by identifying them and then deciding whether to keep or remove them.

In a perfect world, our applications and servers would run smoothly, day in and day out. There would be no conflicts or resource issues, and harmony in the data center would be visible to all. Unfortunately, we don't live in that world. All of the documentation and controls we put on our virtualized environments may help, but despite our best intentions, we're still faced with the issue of the rogue server and the unplanned change it represents. This problem is often the result of IT employees working against tight deadlines. Without a streamlined process for change, more than one person has to get involved to get something done, and under deadline pressure it's tempting to skip the process altogether.

Though the reason for working around the process may be justified, the result is an unplanned change to the environment. A single change may not be enough to bring down an entire virtualized infrastructure, but a large number of changes or additions can. It can be frustrating for administrators who traditionally have ownership of key systems to learn that changes were made without their knowledge or permission, but remember, such changes are rarely made out of malice. As virtual admins, it's important for us to take a step back and remove our personal feelings from the equation; it may sound like therapy 101, but a clear head is critical to the overall process.

Digging through logs

The first step to correcting these changes is to investigate and figure out exactly what's going on. The source of the problem can be something as small as resource additions or changes, or as large as new VMs and sweeping configuration changes. The goal here is to pinpoint the change and start asking what, when and who. Do not reverse the change(s) without a proper investigation. Even if a system was put in place without authorization, removing it could compound the issue, because other people or systems may already depend on it. As an IT person, one of your first duties is to keep things online, so as much as you might want to, you can't just pull the plug.

Figuring out what changes were made can be a little tricky, but if there's one thing computers are great at doing, it's logging. Sometimes the change will be immediately apparent, but, more often than not, admins have to dig deep into logs to find out what the change was and when it was made. Logs can also tell you who made the change, provided you follow the rule of never using a generic login for key systems and infrastructure. Changes leave footprints that you can trace. When going through logs, pay particular attention to the day and time of each change, because the timestamp is what lets you correlate a logged change with the symptoms you're seeing.
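To make the what, when and who concrete, here is a minimal sketch of correlating log entries with the time a problem was first noticed. The log format, the field names (user, action, target) and the sample entries are all hypothetical; real hypervisor and management logs look different, so treat this as an illustration of the idea rather than a ready-made tool.

```python
from datetime import datetime, timedelta

# Hypothetical log lines for illustration only; real logs will differ.
SAMPLE_LOG = """\
2016-08-01T09:15:02 user=asmith action=vm.reconfigure target=db-01
2016-08-01T14:32:10 user=jdoe action=vm.create target=web-test-01
2016-08-01T14:40:55 user=jdoe action=vm.poweron target=web-test-01
2016-08-02T08:01:30 user=asmith action=vm.snapshot target=db-01
"""


def parse_line(line):
    """Split one log line into a timestamp plus its key=value fields."""
    stamp, *pairs = line.split()
    fields = dict(pair.split("=", 1) for pair in pairs)
    fields["time"] = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
    return fields


def changes_near(log_text, incident_time, window):
    """Return entries logged within `window` of incident_time, oldest first."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.strip()]
    hits = [e for e in entries if abs(e["time"] - incident_time) <= window]
    return sorted(hits, key=lambda e: e["time"])


def summarize(entries):
    """Reduce each entry to (when, who, what, target)."""
    return [(e["time"].isoformat(), e["user"], e["action"], e["target"])
            for e in entries]


if __name__ == "__main__":
    # Assume the problem was first noticed at 15:00 on August 1.
    noticed = datetime(2016, 8, 1, 15, 0)
    for row in summarize(changes_near(SAMPLE_LOG, noticed, timedelta(hours=1))):
        print(row)
```

With the sample data, only the two jdoe entries fall inside the one-hour window. Note that the "who" column is only meaningful if every admin logs in with a named account rather than a shared one.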

To remove or to keep

Once you've determined the what, when and who, you can decide on a course of action to address the issue. This can be a little tricky, depending on the politics of who owns the server. You can either remove or keep the rogue server, but both options have drawbacks. While removing the server seems easy, it's not.

Consider this scenario: One of the higher-ups at your organization tells you to remove the server. You go through the proper channels to verify this and then proceed with the removal. Two weeks later, you get a call saying that something important was left on the server, and your higher-ups need it immediately. Of course, by this point it's completely gone, leaving you and your organization in a bind.

If a server was created outside the normal channels, chances are not everyone has a clear picture of what's on it, so before removing anything, shut down the server and copy its data to a separate Serial ATA (SATA) disk. I'd also recommend waiting a few weeks between receiving the initial order and removing the server, just in case. Consider it a simple insurance policy.
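The insurance policy can be written down as a simple rule. The sketch below is only an illustration: the three-week default is an assumption standing in for "a few weeks", and the function names are made up for this example.

```python
from datetime import date, timedelta


def earliest_removal_date(order_received, hold_weeks=3):
    """Date on which the hold period ends (hold_weeks is an assumed default)."""
    return order_received + timedelta(weeks=hold_weeks)


def may_remove(order_received, today, archived, hold_weeks=3):
    """Allow removal only if the data was archived AND the hold has elapsed."""
    return archived and today >= earliest_removal_date(order_received, hold_weeks)


if __name__ == "__main__":
    order = date(2016, 8, 1)
    print(earliest_removal_date(order))              # 2016-08-22
    print(may_remove(order, date(2016, 8, 10), True))   # False: still in the hold period
    print(may_remove(order, date(2016, 8, 22), False))  # False: nothing archived yet
    print(may_remove(order, date(2016, 8, 22), True))   # True
```

Both conditions have to hold: an elapsed waiting period without an archived copy protects no one.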

As if removing a rogue server weren't difficult enough, keeping it is even more challenging. Even if the server was created from your base templates, you still have to do a health check. Are proper security, backups and monitoring all configured? What about proper resource management, naming and addressing? All of these things need to be verified as soon as possible. That can be a challenge if the system is already online and in use, but the checks can't wait.

The other key component is the approval/request form. When it comes to a rogue server, it's unlikely that the person responsible for its creation ever completed the appropriate paperwork. To resolve the issue, you'll need them to fill out the proper approval/request form, even if the server in question is removed. This form creates an audit trail that could come in handy should there ever be a security issue or outage due to the rogue server. Completing the paperwork after the fact can also be a wake-up call for infrastructure abusers and help them realize that they can't skirt the rules. After all, proper channels and rules exist for a reason: They reduce the headache of cleaning up problems later.
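The health check and paperwork questions above can be tracked as a checklist. The following is a rough sketch, not a standard tool: the naming pattern, the approved subnet and the record fields are hypothetical placeholders for whatever your own site standards say.

```python
import ipaddress
import re
from dataclasses import dataclass

# Hypothetical site standards; substitute your own conventions.
NAME_PATTERN = re.compile(r"^[a-z]+-(prod|test|dev)-\d{2}$")
APPROVED_SUBNET = ipaddress.ip_network("10.20.0.0/16")


@dataclass
class ServerRecord:
    name: str
    ip: str
    security_baseline_applied: bool
    backup_configured: bool
    monitoring_configured: bool
    resource_limits_set: bool
    request_form_filed: bool


def health_check(server):
    """Return the checks the server fails; an empty list means it passed."""
    findings = []
    if not server.security_baseline_applied:
        findings.append("security")
    if not server.backup_configured:
        findings.append("backups")
    if not server.monitoring_configured:
        findings.append("monitoring")
    if not server.resource_limits_set:
        findings.append("resource management")
    if not NAME_PATTERN.match(server.name):
        findings.append("naming")
    if ipaddress.ip_address(server.ip) not in APPROVED_SUBNET:
        findings.append("addressing")
    if not server.request_form_filed:
        findings.append("approval/request form")
    return findings


if __name__ == "__main__":
    rogue = ServerRecord("web-test-01", "192.168.5.20",
                         security_baseline_applied=True,
                         backup_configured=False,
                         monitoring_configured=False,
                         resource_limits_set=True,
                         request_form_filed=False)
    print(health_check(rogue))
    # ['backups', 'monitoring', 'addressing', 'approval/request form']
```

Returning the full list of failures, rather than stopping at the first one, gives the server's owner a single to-do list to work through.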

This was last published in August 2016

Join the conversation

2 comments

Do you think it's better to keep or remove rogue servers? Why?
I'm a little bit confused; part of this article appears to be about rogue servers and part of it appears to be about IT change in general. 

I have to say if I ordered a server to be shut down, and the IT guy kept it around for a few weeks, just in case, I'd be pretty POed.