The Boulder BI Brain Trust

 

Results tagged “data warehouse appliance” from Boulder BI Brain Trust Blog

XtremeData Discusses Their Analytics Appliance

XtremeData, founded in 2004, presented its ideas and products for "Big Data" through Ravi Chandran, CEO/Founder, and Geno Valente, Vice President of Sales and Marketing. Ravi has a background in parallel processing machines for medical imaging systems. Geno comes from a digital engineering background, especially in using Field Programmable Gate Arrays (FPGAs) for low-latency equity trading. The management team has strong connections with Knightsbridge Consulting, now part of HP.

Geno started with an overview of their dbX analytics appliance. To be successful in the data appliance market, he emphasized, you need three areas of expertise: computer architecture, database engine internals, and domain knowledge of the analytics space. XtremeData has all three. The engineering team now at XtremeData created the computing sub-system for one in four medical CT scanners worldwide. He distinguished between Business Intelligence and Data Analysis (Analytics), which prompted considerable discussion about whether this distinction is useful to the industry. This chasm between the two categories was motivated by the table at the right. There should be a (loosely coupled) closed loop between the enterprise systems and the data analysis system, with strategies generated in the analysis impacting the operational systems via application projects.

We continued with a technical description of their dbX analytics appliance, which is priced at $20K per TB of uncompressed user data and scales from 1 TB to 3.8 PB. Their architectural differentiation is the heavy reliance on their FPGA chip, which performs a broad spectrum of SQL operations (select/filter, partition, join, group, aggregate, distribute). Competitors in this respect are Netezza and Kickfire. Ravi argued that they leverage hardware assist from the bottom levels of the query parse tree to the higher levels, thus achieving high performance. A deep question from Neil about doing a median calculation led Ravi to explain their internal sort mechanism. FPGA chips allow 10x performance with 1/3 the power consumption. MapReduce functionality is inherent in the dbX architecture but is implemented as User Defined Functions in C/C++ instead of as a separate API. In addition, custom functions can be embedded in the FPGA for special customer requests.
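To see why a median question leads straight to the sort mechanism, here is a minimal Python sketch. It is not how dbX does it; the round-robin partitioning, the node count, and the function names are all assumptions for illustration. Unlike a sum or a count, a median cannot be combined from per-node partial results, so each node sorts its own slice and a coordinator merges the sorted runs to reach the middle value.

```python
import heapq


def partition_round_robin(values, node_count):
    """Deal values out to node_count lists, imitating rows spread over nodes."""
    nodes = [[] for _ in range(node_count)]
    for i, value in enumerate(values):
        nodes[i % node_count].append(value)
    return nodes


def distributed_median(nodes):
    """Sort each node's slice locally, then merge the sorted runs to find the median."""
    sorted_runs = [sorted(node) for node in nodes]  # local sort on every node
    total = sum(len(run) for run in sorted_runs)
    if total == 0:
        raise ValueError("median of an empty data set")

    # For an even row count the median is the mean of the two middle rows;
    # for an odd count both indexes point at the same row.
    lower_idx = (total - 1) // 2
    upper_idx = total // 2

    lower = upper = None
    # heapq.merge streams the globally sorted order without re-sorting everything.
    for i, value in enumerate(heapq.merge(*sorted_runs)):
        if i == lower_idx:
            lower = value
        if i == upper_idx:
            upper = value
            break
    return (lower + upper) / 2


nodes = partition_round_robin([7, 1, 9, 4, 3, 8, 2, 6], node_count=3)
print(distributed_median(nodes))
```

The merge stops as soon as it reaches the middle row, so only about half of the sorted stream is ever consumed.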

After the break, we looked at the various product configurations and key industry trends in hardware storage, servers, networks, and database engines. Applications for dbX include bioinformatics, to match and find genomic sequences, and financial analysis of US consumer credit data. We got into the processing details, which showed good load balancing at each stage.

They shared the announcement of their partnership with Cray to produce a Personal Data Warehouse (PDW) appliance with 3 nodes and 5 TB of uncompressed user data, deployed in an office environment. KXEN has their software running in this type of box. We then had a good (confidential) discussion on market positioning of the PDW.

My Take...

One aspect that impressed me about XtremeData is their intellectual property in FPGA technology. They have a patent on using an FPGA as a CPU on standard blade servers, which means that XtremeData can do much more than SQL processing on Big Data. Some applications in massive image recognition are amazing and could revolutionize the business of major corporations. So, XtremeData is a company to know if your company has complex and specialized analytics on Big Data.

Wow! Big turnout this morning: representatives from the BBBT are tuning in from France, The Netherlands, the UK and South Africa, not to mention 14 of us here in Boulder and the US. Rick Glick, VP of Technology and Architecture, and Kim Stanick, VP of Marketing, have joined us today to bring us up to date on the new things around ParAccel. For those of you not familiar with ParAccel, they offer a columnar-based MPP database for data warehousing and analytics.

Kim is kicking things off with an impressive list of clients (sorry, NDA, can't share the list). The cases and stories are intriguing and include government, big retail, pharma and financial services. The solution is available as software only or as a fully configured appliance, or you can purchase the software and build the appliance yourself. 75% of ParAccel's clients are either purchasing the software or taking the "build your own" appliance approach.

Version 2.5 of ParAccel is due in the coming months and offers some pretty cool upgrades (sorry, NDA again). ParAccel sees that speed continues to lead the way among client needs and opens the door to more innovative analytics. They leverage query optimization, compiled queries, shared-nothing MPP and the power of the columnar database to serve these needs. The company is growing and has added personnel during the last quarter.

To continue the company's growth, Mark Lockareff is now in place as the new CEO of ParAccel. Mark's job is to take the company to the next level, beyond the late-stage startup phase. I think this is a big positive for ParAccel, and this type of leadership will help them in what has become a very fast-moving and competitive market segment.

It seems that ParAccel is at a tipping point. Along with a new CEO, they have a new, aggressive marketing campaign staged and ready to go in late February. I think this too is important and a positive for the company, because in the past competitors in this segment have made a lot more noise and carried a stronger, if not louder, message to the market. The next couple of business quarters will tell the story for ParAccel. It seems they are well armed for the battle and ready to start the next stage in the company's maturity.

Today's conversation was animated and brisk. Proof of this can be seen on Twitter under the #BBBT hashtag, where we set a BBBT record today with over 170 tweets!

Kognitio brings flexibility to complex analytics


The BBBT pondered the past and future of the British firm Kognitio, which has deep roots in the database community. In 1992 ex-Teradata people formed WhiteCross Systems, which merged with Kognitio in 2005. It currently employs over 70 people at Bracknell, UK, is building a group in Chicago to serve the US market, and has over 30 customers across several industries, such as finance, retail, and telecom. We were briefed by Sean Jackson, VP Marketing; John Thompson, EVP and General Manager of US; and Roger Gaskell, Chief Technology Officer.

The focus is high-performance analytics at a low cost. The scalable MPP architecture executes on any collection of x86 blade servers running Linux and interconnected with TCP/IP. There is no indexing and there are no materialized aggregations. The system is said to perform well with high data volumes and high workload concurrency.

One distinctive feature of Kognitio is that they have a row-oriented database engine with full SQL functionality that achieves good performance on complex query processing. This runs contrary to industry wisdom that only column-oriented engines can achieve such performance. The magic comes from spreading data evenly across many nodes, extensive use of in-memory processing, generation of x86 machine code for query processing, a mature cost-based optimizer, and smart pipelining of temp data among the nodes. The pipelining reminded me of the old Y-bus unleashed.
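The first ingredient, spreading data evenly across many nodes, can be sketched in a few lines of Python. This is only an illustration of hash distribution in general, not Kognitio's actual scheme; the SHA-256 hash, the node count of 8, and the function names are assumptions. Each row key is hashed to pick a node, and a simple skew figure (busiest node divided by the average) shows how even the spread is.

```python
import hashlib
from collections import Counter


def node_for_key(key, node_count):
    """Map a row key to a node using a stable hash (the same key always lands on the same node)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(keys, node_count):
    """Return the number of rows that land on each node."""
    counts = Counter(node_for_key(key, node_count) for key in keys)
    return [counts.get(node, 0) for node in range(node_count)]


def skew(counts):
    """Ratio of the busiest node to the average node; 1.0 means perfectly even."""
    total = sum(counts)
    if total == 0:
        return 1.0
    return max(counts) / (total / len(counts))


counts = distribute(range(100_000), node_count=8)
print(counts)
print(f"skew: {skew(counts):.3f}")
```

A skew close to 1.0 means no single node becomes the bottleneck for a scan, which is the property a shared-nothing engine depends on when it has no indexes to fall back on.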

Another distinctive feature of Kognitio is that they have three ways of delivering their product/service. First, they licence their database as a normal software product. Second, they will sell a complete appliance as a hardware/software bundle. And third, they offer data warehousing as a service, hosted in their own data center and in third-party data centers.

Now that is flexibility! Check them out! These Brits have more to offer than Newcastle Brown.

Kim Stanick, VP Marketing, and Barry Zane, CTO of ParAccel, are briefing the group today. The underlying theme is "it's about time". I agree, it is all about time: time is what drives the need for analytics, and time is critical to the value of analytics. The days of waiting 60 hours for a query result are well past us now. ParAccel is addressing the issue and sees the value in solving the time problem. ParAccel brings the following architectural highlights to the time problem.

  • Fully Transactional DBMS
  • Columnar Orientation
  • Adaptive compression
  • Shared-nothing, MPP design
  • Parallel Loader
  • High Availability w/performance
  • CPU-based optimization
  • Tightly-coupled grid protocol

Interesting news from the briefing includes their recent success with TPC-H benchmarks, hitting the top of the list in runs for 100GB, 300GB and 1TB this past October. The interesting part for me isn't just the speed but the cost factor, further proof that appliances can impact your analytics from a hard cost perspective. The result of the benchmark puts the scaled queries per hour cost at $4.57 with the ParAccel solution, versus previous records from other companies that cost from $25 to as high as $60 per query hour.
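To put the cost gap in proportion, here is a quick Python calculation using only the three dollar figures quoted above. The variable and function names are mine, and it assumes the quoted figures are directly comparable.

```python
def cost_ratio(previous_cost, new_cost):
    """How many times more expensive the previous record was than the new one."""
    return previous_cost / new_cost


PARACCEL_COST = 4.57        # dollars, as quoted from the briefing
PREVIOUS_LOW_COST = 25.0    # dollars, low end of the earlier records
PREVIOUS_HIGH_COST = 60.0   # dollars, high end of the earlier records

low_ratio = cost_ratio(PREVIOUS_LOW_COST, PARACCEL_COST)
high_ratio = cost_ratio(PREVIOUS_HIGH_COST, PARACCEL_COST)

print(f"{low_ratio:.1f}x to {high_ratio:.1f}x")  # prints "5.5x to 13.1x"
```

In other words, the earlier record holders were roughly 5 to 13 times more expensive on this measure.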

Another great session today, with a large turnout from the Boulder BI Brain Trust members. In attendance: Ron Powell, Claudia Imhoff, Hans Hultgren, Richard Hackathorn, Lowell Fryman, Holli Arnett, Mike Brooks, Joyce Montanari, Lu Kroeger, John Myers, Steve Dine.

   

 
