The Boulder BI Brain Trust


Greenplum redefines DW in the Enterprise Data Cloud

Greenplum (GP) has launched an initiative on the Enterprise Data Cloud (EDC) that "virtualizes the analytic infrastructure over a pool of resources and giving users the power, through self-service, to instantiate their own database instances, without affecting other users." Presenting the strategy behind this initiative were Paul Salazar, VP of Corporate Marketing, and Ben Werther, Director of Product Management.

Paul summarized their direction as pursuing EDC, wherever that might lead! Currently GP has 100 employees, $28M in funding, and 65 customers, such as NYSE, Fox Interactive Media, iCrossing, Bakrie Telecom, and Reliance Communications. Since Oracle now owns Sun and Sun is an investor in GP, the issue of Oracle's control of GP was raised. Ben's answer was that the investment is minor and carries no Board seat, so Oracle is not involved in confidential matters.

Ben gave an overview of scalable DBMSs. Greenplum's positioning is that it is unique in using commodity components for all of the hardware.

GP supports MapReduce, the programming model for scalable parallel processing introduced by Google. A recent study by Stonebraker and DeWitt showed that MapReduce did not provide performance advantages over current SQL database engines. Ben confirmed this but added that MapReduce is a different paradigm that caters to analytic programmers. Performing similar functionality in SQL is difficult and time-consuming, even though equivalent performance is possible.
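For readers who have not seen the paradigm, here is a minimal in-memory sketch of the map/reduce pattern in Python. It is not Greenplum's MapReduce interface; the `map_reduce` helper, the `clicks` records, and the mapper and reducer names are all hypothetical and exist only to show the shape of the programming model. This particular count is also easy to write in SQL (the equivalent query is in a comment), so the example illustrates the style of programming Ben described rather than a case where SQL struggles.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(records, mapper, reducer):
    """Apply mapper to each record, group emitted pairs by key, reduce each group."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapper(r) for r in records):
        groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def page_views_mapper(record):
    # Emit one (page, 1) pair per click record.
    yield record["page"], 1


def sum_reducer(key, values):
    # Collapse all values emitted for one key into a single total.
    return sum(values)


# Hypothetical sample data, invented for this illustration.
clicks = [
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/home"},
    {"user": "a", "page": "/pricing"},
]

# Roughly what SQL would express as:
#   SELECT page, COUNT(*) FROM clicks GROUP BY page;
views_per_page = map_reduce(clicks, page_views_mapper, sum_reducer)
```

In a real MPP system the map and reduce steps would run in parallel across many nodes; this single-process version only shows how the programmer writes ordinary functions instead of a declarative query.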

Finally we focused on the new EDC initiative, which addresses two problems: handling the ever-exploding scale of data, and the limitations of one big data warehouse. The key to this EDC approach is self-service, which is a shift of power and control to business users. GP asserts that the EDW cannot move at the speed of the business and actually fosters fragmented data silos. EDC allows the quick and easy provisioning of new data marts/warehouses, all under IT control. It supports the physical consolidation of data and allows its logical unification to occur over time. GP is initially focusing on data mart consolidation and project sandboxes, both of which are commanding the attention of corporate IT.

My Take... GP is naive in positioning their EDC as a replacement for the traditional EDW, as a new view of corporate data. Ben countered that GP's goal was not to displace the EDW but instead to manage the other 90% of data that will not be incorporated into the EDW. This then surfaces issues about the nature of that 90% and the benefits/costs of its incorporation (or not) into the EDW. Further, what really is the business justification for "consistent views of business reality" as embodied in the traditional EDW, as opposed to a "single integrated view" or "various divergent views"?

GP is correct in its sense of urgency to extend the scale and usage of corporate data. However, they are currently addressing only a limited solution for advanced analytical users, who are a HUGELY important force but are found in only a small segment of progressive corporations. Claudia remarked, "Their current message on EDC was making IT the arms dealer and giving guns to untrained users."

Absent is a vision for the enterprise data architecture that builds upon EDW investment by allowing the unification of information across many sources of data and many degrees of validity. EDC should be positioned as an enhancement or extension to EDW, not its replacement. GP needs to provide unification facilities to balance those provisioning facilities.

EDC is the proper direction, given business challenges and technology advances. However, GP is prematurely launching this direction as a marketing initiative. GP needs a story for IT so that IT will "jump all over it," as Ben stated.

Toward the end of our discussion, Ben presented the technical stack for EDC, as shown in this figure. Note the middle layer for EDC Platform Services...which are the unification facilities needed. Unfortunately GP is not supporting all these components currently but has a roadmap to do so in the future. My feeling is that we are seeing the beginning of a long journey. EDW and Cloud Computing are desirable dancing partners; however, the dance is yet to be choreographed.

The bottom line is that Greenplum is positioning itself in an exciting new direction. With its strong technical abilities and progressive leading-edge customers, Greenplum is definitely a player to follow.
