In years past, the goal of troubleshooting was to find what's "broken" and then implement the proper fix. Perhaps this meant exchanging a failed network cable, replacing a crashed disk drive or identifying a bad dual in-line memory module in a server. Today, however, the idea of what is "broken" has shifted. As businesses come to regard IT as a service provider and broker, the goal isn't so much about fixing things as it is about figuring out whether applications and services are delivering the availability and performance that the business needs.
We don't worry about break-fix hardware issues that much anymore -- load-balanced clusters can usually keep applications available, and snapshots can restore a recent application state when things really go sideways. Instead, we want to know whether an application's latency is acceptable, whether the number of transactions per second is within tolerable limits and so on. And we rely on an array of powerful tools, like application performance monitoring (APM) software, to give us the metrics we need.
Containers fit into this realm of performance troubleshooting, but the time horizon has shrunk to almost negligible levels. Unlike physical machines that might run for years or VMs that go for months, orchestration and automation can spin up a container, run it and then release it again in just a few seconds -- maybe less.
This affects several aspects of container performance troubleshooting. First, beyond APM -- is application X running right? -- IT administrators will need to know how to follow container resource usage patterns over time: how many containers are running at what times of day, where the containers are being deployed and how that activity translates to CPU, memory, storage and network traffic. IT professionals will need tools that can translate APM metrics into more granular resource tracking and reporting so people can tell when it's time to upgrade or repair hardware. It's this potential fluidity in resource demand that makes containers and demand-based scalability so attractive for public cloud deployments.
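The kind of usage-over-time tracking described above can be sketched in a few lines. This is a minimal illustration, not a real monitoring tool: the sample records and field names (`container`, `cpu_pct`, `mem_mb`) are assumptions standing in for whatever a stats collector such as `docker stats` or a metrics agent would actually emit.

```python
from collections import defaultdict
from statistics import mean

def summarize_usage(samples):
    """Roll up periodic per-container resource readings into a report
    of average CPU and peak memory, so trends are visible even after
    the short-lived containers themselves are gone."""
    by_container = defaultdict(list)
    for s in samples:
        by_container[s["container"]].append(s)

    report = {}
    for cid, readings in by_container.items():
        report[cid] = {
            "avg_cpu_pct": mean(r["cpu_pct"] for r in readings),
            "peak_mem_mb": max(r["mem_mb"] for r in readings),
            "samples": len(readings),
        }
    return report

# Hypothetical readings gathered at fixed intervals.
samples = [
    {"container": "web-1", "cpu_pct": 12.0, "mem_mb": 110},
    {"container": "web-1", "cpu_pct": 48.0, "mem_mb": 140},
    {"container": "job-7", "cpu_pct": 90.0, "mem_mb": 512},
]
report = summarize_usage(samples)
```

In practice, a report like this would be aggregated by hour or day to reveal when container activity peaks and which hosts bear the load.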
Second, IT professionals will probably not be able to observe a container's effect on an application workload directly. A container that spins up and releases after three seconds will likely leave no perceptible trace in the application itself, but there can certainly be errors and alerts to sort out through some kind of management dashboard. Container performance troubleshooting will rely heavily on logs to record container activities and on log analytics to correlate those activities with system logs, APM results and other log sources.
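That log correlation step can be illustrated with a simple timestamp match: given container lifecycle records and application alerts, find which short-lived container was running when each alert fired. The record shapes and the tolerance window are illustrative assumptions, not the output of any particular logging product.

```python
def correlate(lifecycles, alerts, window=5.0):
    """Attribute each alert to any container whose lifetime
    (start..stop, in epoch seconds) contains the alert timestamp,
    padded by a small tolerance window to absorb clock skew."""
    matches = []
    for alert in alerts:
        for ev in lifecycles:
            if ev["start"] - window <= alert["ts"] <= ev["stop"] + window:
                matches.append((alert["msg"], ev["container"]))
    return matches

# Hypothetical data: one three-second container, two alerts.
lifecycles = [{"container": "etl-42", "start": 100.0, "stop": 103.0}]
alerts = [
    {"ts": 102.5, "msg": "HTTP 500 from upstream"},
    {"ts": 250.0, "msg": "disk pressure"},
]
matches = correlate(lifecycles, alerts)
```

Here only the first alert falls inside the container's lifetime, so the second remains unattributed -- exactly the kind of triage a log analytics tool automates at scale.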
Third, container performance troubleshooting will need to track and report the intricate interdependencies that can arise between containers -- especially in complex container architectures, such as microservices-based applications. IT professionals will need to see how a change in one container cluster affects upstream and downstream container clusters to offer cause-and-effect insights into application behavior. For example, watching the number of API calls between containers can help gauge utilization traffic, while watching the number of failed API calls can drive container scaling to help ensure continued performance.
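The failed-API-call example above can be reduced to a small scaling policy: when the failure rate between containers crosses a threshold, suggest adding a replica. The threshold, replica cap and function shape are assumptions for illustration; a real orchestrator such as Kubernetes applies far richer logic.

```python
def scale_decision(total_calls, failed_calls, replicas,
                   fail_threshold=0.05, max_replicas=10):
    """Suggest a replica count from inter-container API metrics:
    scale out by one when the failure rate exceeds the threshold,
    otherwise hold steady. Never exceed max_replicas."""
    rate = failed_calls / total_calls if total_calls else 0.0
    if rate > fail_threshold and replicas < max_replicas:
        return replicas + 1
    return replicas

# An 8% failure rate triggers a scale-out; a 1% rate does not.
busy = scale_decision(total_calls=1000, failed_calls=80, replicas=3)
calm = scale_decision(total_calls=1000, failed_calls=10, replicas=3)
```

The same call counters, tracked per container pair, also expose the upstream/downstream dependencies the paragraph describes.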
Ultimately, the real challenge of container performance troubleshooting will be keeping pace with the speed and scalability that are commonplace in ephemeral data center environments.