Evaluate Weigh the pros and cons of technologies, products and projects you are considering.

Virtualization makes big data analytics possible for SMBs

Big data isn't just for large companies with fat wallets. A virtual Hadoop deployment lets budget-conscious SMBs benefit from big data analytics.

Large companies have long used big data analytics to comb through vast amounts of data in a short amount of time. Companies with deep pockets and ever-expanding amounts of data have built large server clusters dedicated to mining data. Hadoop clusters can help small and medium-sized businesses lacking big budgets benefit from big data.

Have you ever wondered how search engines guess what you want to type in the search field before you finish typing it, or offer suggestions of related queries? Maybe you've noticed that Facebook or LinkedIn recommend people you may know based on your connections? These are two examples of Hadoop clusters processing large amounts of data in fractions of a second.

Created by the Apache Foundation, Hadoop is an open source product built to run on commodity hardware. The most expensive part of building a Hadoop cluster involves the compute and storage resources. Server virtualization can help reduce that cost, bringing big data analytics to budget-constrained organizations.

More on big data analytics

Examining the state of PaaS in the year of big data

VMware buys Cetas to bring big data analytics to the masses

Advisory Board roundtable: Big data and its impact on data centers

Virtualization vendors, such as VMware, have spun up initiatives specifically focused on better utilizing virtual workloads for Hadoop environments. VMware's project Serengeti is one such product. Prior to launching Serengeti, VMware published some impressive benchmark results that showed scenarios where a virtual Hadoop environment outperformed a physical one.

It's easy to find information on how to virtualize Hadoop with any of the popular hypervisors. In fact, Apache has even published its own articles about the pros and cons of virtualizing Hadoop environments. One big advantage is that virtualizing Hadoop makes big data analytics more accessible for SMBs because it reduces the compute and storage hardware needed.

By using Hadoop-based data analytics, SMBs can add value to their services. For example, a company could offer a service that matches errors in log files to knowledge base articles from multiple vendors to quickly identify and resolve system errors. As another example, a marketing department using Hadoop-based data analytics could monitor references to a company's name in trade magazines, online blogs, social media and other outlets and then correlate those instances to events or organizational changes.

Hadoop has proven it's a valuable and powerful tool in data analytics. Large organizations have seen its benefits for years. Add in the equalizing factor of virtualization and data analytics tools like Hadoop are a possibility for smaller organizations as well.

The growth of data within organizations is staggering, and the growth of data available online is beyond comprehension. The challenge in the next five years will be harnessing data to find the value in it.

Next Steps

possible cons encountered with Hadoop data analytics

Dig Deeper on Server hardware and virtualization

Join the conversation


Send me notifications when other members comment.

Please create a username to comment.

Can virtualizing big data analytics be a game-changer for SMBs?
Like a lot of changes in IT, there are ways of taking big data and turning it into useful information, that we haven't even conceived of quite yet. I know that larger organizations are, and have been, leveraging big data for years and years now. But the ever-so-slightly-different viewpoint of SMBs, and the fact that their quickness and agility could possibly take advantage of that type of info is going to be interesting to watch.
The major factor for SMBs is cost and consolidating existing infrastructure.