The Boulder BI Brain Trust

 

Aster Data Systems supports DW with MapReduce

Steve Wooledge, Director of Marketing, and Shawn Kung, Sr. Director of Product Management, presented the background and future plans of Aster Data Systems. The company was founded in 2005 by three Stanford doctoral colleagues and was in stealth mode until May 2008. The engineering team is strong, with 26 people, 13 of whom are at the Ph.D. level. Clients include Akamai, MySpace, Share-This and a few more.

Aster Data Systems focuses on "software-only relational DBMS for frontline data warehousing", striving for "always parallel" processing and "always on" operations. They argued that their product, Aster nCluster 3.0, allows smooth incremental scaling, which avoids paying for excess capacity.

Steve presented an overview of one of their largest clients, MySpace, whose 118M users generate 7B events (2-3 TB) per day, loaded in high-frequency batches (15 minutes per hour). It takes several thousand servers to support the data flow into the frontline data warehouse, which consists of 100 nodes with 400 TB capacity.
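To put those MySpace numbers in perspective, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above and assumes decimal terabytes, a load spread evenly over the day, and no compression, replication or indexing overhead (none of which were discussed in the presentation). The function names are mine, purely for illustration.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the talk.
# Assumptions (not from the presentation): decimal terabytes, events spread
# evenly over 24 hours, no compression or replication overhead.

EVENTS_PER_DAY = 7e9          # 7B events per day
DAILY_VOLUME_TB = (2, 3)      # 2-3 TB per day
CAPACITY_TB = 400             # stated capacity of the 100-node warehouse
BYTES_PER_TB = 1e12           # assumption: decimal terabytes
SECONDS_PER_DAY = 86_400


def bytes_per_event(volume_tb):
    """Average raw size of one event for a given daily volume."""
    return volume_tb * BYTES_PER_TB / EVENTS_PER_DAY


def days_of_capacity(volume_tb):
    """Days of raw data that fit in the stated capacity."""
    return CAPACITY_TB / volume_tb


# Average event rate if the load were perfectly uniform.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"average events per second: {events_per_second:,.0f}")
    for volume in DAILY_VOLUME_TB:
        print(
            f"{volume} TB/day: {bytes_per_event(volume):.0f} bytes/event, "
            f"{days_of_capacity(volume):.0f} days of raw capacity"
        )
```

Under these assumptions each event averages a few hundred bytes, and 400 TB holds somewhere between four and seven months of raw events.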

They finally got around to defining Frontline Data Warehouse (FDW). Wow! What a discussion... Aster is essentially arguing to fork the application data inflow, close to the customer-touch applications, as shown below.
[Figure: Aster Frontline DW]

As Claudia noted... Is FDW just a BIG operational data store? In addition, there are several intermixed issues. First, what is the scope of the subject areas in the FDW? How does it overlap with the EDW? Second, isn't the FDW duplicating the data quality/cleansing processing? And third, the FDW supports rapid feedback to the applications. This last issue seems to be the business justification for this hybrid approach.

The unique feature of Aster is their in-database implementation of MapReduce, which is a parallel data flow approach to DBMS. This is a very interesting topic, since it uncovers a paradigm shift beyond SQL. MapReduce allows application programmers to push their code closer to the data, roughly like a SQL User-Defined Function, but written in widely used languages like Java, Python and Perl. Thus, very sophisticated analytics can operate directly on the data. One question that bothered me was the intellectual property rights surrounding MapReduce. Can anyone comment on this?
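For colleagues who have not seen the MapReduce pattern before, here is a minimal single-process sketch in Python. It is not Aster's API (their in-database syntax was not shown in enough detail to reproduce here); the names `map_event`, `reduce_counts` and `run_mapreduce`, and the sample events, are hypothetical. It only illustrates the three phases: map each record to key/value pairs, group the pairs by key, then reduce each group.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit (key, value) pairs for one input record."""
    yield event["user"], 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)


def run_mapreduce(partitions, mapper, reducer):
    """Run mapper and reducer over partitioned data in a single process.

    In a real parallel system each partition would be mapped on the node
    that stores it; here the partitions are simply processed in a loop.
    """
    # Map and shuffle: apply the mapper to every record and group by key.
    groups = defaultdict(list)
    for partition in partitions:
        for record in partition:
            for key, value in mapper(record):
                groups[key].append(value)
    # Reduce: one call per distinct key.
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical sample data: two partitions of click-stream events.
partitions = [
    [{"user": "alice", "action": "click"}, {"user": "bob", "action": "view"}],
    [{"user": "alice", "action": "view"}],
]

result = run_mapreduce(partitions, map_event, reduce_counts)
print(result)
```

The appeal is that the programmer writes only the two small functions; the system decides where and how many times to run them, which is what makes the approach parallel by default.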

I highly recommend that my DW colleagues read up on MapReduce and, especially, the Claremont Report on Database Research.

I suggested a marketing slogan: "Aster picks up where SQL lets you down". If Aster uses it, they owe me a nickel per usage.

Oh, finally... Watch for an announcement from Aster in a few weeks...
