VMware pounced on cloud data analytics startup Cetas Software this week, potentially giving vSphere shops a way to analyze large amounts of data within their virtual or cloud environments.
The virtualization software company announced plans on April 24 to acquire the startup, whose software runs on VMware vSphere or Amazon EC2. The software is used for online application, IT and operational analytics as well as enterprise Hadoop analytics. This acquisition gives VMware a data analytics platform and an entry point into a growing market.
“Companies like VMware are beginning to realize developing the [underlying] technology alone is not enough,” said Matthew Cunningham, CIO for CareCore National, a healthcare benefits management firm with data centers in South Carolina and Colorado. “They will need to focus on tools and techniques to deliver the analytics to the decision makers as a core offering.”
VMware’s interest in big data
Big data began with cloud players like Facebook and Google, but has been making its way into mainstream commercial and enterprise accounts. It’s not any one product, but a set of processes and technologies that can crunch through substantial data sets quickly to make complex, often real-time decisions.
As this concept draws increased interest from the enterprise, a skills shortage looms.
The idea behind Cetas is to bridge this gap by making big data analytics easier to consume through software that can ingest, parse and analyze data from various repositories.
“Most organizations do not have the capability to quickly transform their infrastructure or operating frameworks to take advantage of the growing technology capabilities in this area,” said CareCore’s Cunningham.
CareCore has already overhauled its data center infrastructure to support big data analytics in pursuit of better medical decision making, but that kind of undertaking isn’t possible for everybody. That’s where VMware will step in, Cunningham said.
VMware hasn’t provided many details regarding its plans for Cetas beyond a blog post from the office of its CTO. But a blog post by an executive from parent company EMC Corp. offers speculative detail about what might arise from the acquisition, such as potential tie-ins with vFabric Data Director, Cloud Foundry and the vCenter Operations product.
Cetas is yet another application acquisition that VMware has made in recent years. It previously acquired Web-based email provider Zimbra as well as online PowerPoint alternative SlideRocket and backup service provider Mozy.
One VMware user compares this buy to VMware’s 2009 acquisition of SpringSource. That move allowed better visibility into Java virtual machines from the hypervisor, providing better performance for applications developed on the Spring framework, said Bill Hill, infrastructure IT lead for a Portland, Ore.-based logistics company. “Similar memory integrations with Cetas and Hadoop may have similar results.”
These acquisitions may seem far afield for some in enterprise IT, but experts say Cetas is worth paying attention to.
“There’s major pressure inside many organizations to leverage big data and do advanced analytics,” said David Vellante, Wikibon.org founder and analyst. “There’s frustration with traditional models of data warehousing.”
Diving deeper into Cetas
It’s still unclear when products from Cetas might see the light of day under VMware. Cetas is still in its early stages, having been founded in 2010, and it has just 20 employees. Company officials say the software has “tens of customers” in the Fortune 1000, none of whom were named publicly.
Cetas CEO Muddu Sudhakar calls the company’s product a “next-generation Splunk,” in that it can analyze business-application data rather than machine data.
So far, Sudhakar said, the product has appealed to e-commerce companies and large enterprises looking to do predictive analytics on large datasets to figure out what customers are buying, for example, or which products to recommend to buyers.
“As you collect the data, I’ll tell you what’s happening in your enterprise,” he said.
Cetas’s Instant Intelligence software can ingest multiple types of data from multiple sources, in several ways. It can be pointed at a network file share, running application stream or an existing cloud data repository on Amazon S3. Users can also drag and drop files into a folder, and the software can extract data processed using Hadoop clusters. From there, data dimensions are automatically parsed by Cetas’s software, and filters for the data are automatically created.
For instance, data from a .csv file containing an example gaming data set is automatically extracted and filtered according to dimensions -- such as users, IP address, level number, login and logout time -- as seen in a Cetas demo video. Dimensions can be dragged into graphs, which are automatically plotted; multiple dimensions can be plotted against one another and graphs can be manipulated to drill deeper into data.
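To make the idea of automatic dimension extraction concrete, here is a minimal Python sketch. It is not Cetas's code or API, which the company has not published; it only illustrates the general technique of reading column headers from a .csv file, collecting the distinct values of each column as candidate filter values, and then filtering rows on one dimension. The sample rows, the column names and the helper functions extract_dimensions and filter_rows are all hypothetical, loosely modeled on the gaming data set described above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; column names loosely mirror the dimensions
# mentioned in the article (user, IP address, level, login/logout time).
SAMPLE_CSV = """user,ip_address,level,login_time,logout_time
alice,10.0.0.1,3,2012-04-24T09:00,2012-04-24T09:30
bob,10.0.0.2,5,2012-04-24T09:05,2012-04-24T10:00
alice,10.0.0.1,4,2012-04-24T11:00,2012-04-24T11:20
"""


def extract_dimensions(csv_text):
    """Map each column name to the sorted list of distinct values seen in it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    dimensions = defaultdict(set)
    for row in reader:
        for name, value in row.items():
            dimensions[name].add(value)
    return {name: sorted(values) for name, values in dimensions.items()}


def filter_rows(csv_text, **criteria):
    """Return the rows whose columns equal every given criterion."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if all(row[column] == wanted for column, wanted in criteria.items())
    ]


# Discover the dimensions, then filter on one of them.
dims = extract_dimensions(SAMPLE_CSV)
alice_rows = filter_rows(SAMPLE_CSV, user="alice")

print(list(dims))          # the column names found in the file
print(dims["user"])        # the distinct values available as filters
print(len(alice_rows))     # how many sessions belong to one user
```

A real analytics product would also infer data types, handle far larger files and plot the results; this sketch covers only the parse-and-filter step.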
The software runs on a virtualized back end, such as VMware’s vSphere or Amazon’s EC2, scaling up or down into more or fewer instances as needed. There is a cloud edition for data sets based in the cloud as well as a downloadable enterprise edition for on-premises data sets behind the firewall.
Beth Pariseau is a senior news writer for SearchServerVirtualization.com and SearchDataCenter.com. Write to her at firstname.lastname@example.org.