Kognitio on Hadoop Outperforms Spark, Impala in Industry-Standard Benchmarking Tests

Results delivered up to 178x faster, obtaining insights where others cannot; “Kognitio provides broadest range of coverage, performance,” says top industry analyst


SAN JOSE, Calif., March 14, 2017 (GLOBE NEWSWIRE) -- STRATA -- New tests, run using the industry-standard TPC-DS benchmarks, show that Kognitio® on Hadoop returned results faster and with greater overall consistency than the Big Data SQL engines Spark and Impala did in the same tests.

The results were released at the start of the Strata + Hadoop World conference, being held this week in San Jose. Kognitio is an exhibitor at the conference.

The tests, run last month, showed that Kognitio on Hadoop was faster than Spark and Impala in 92 of the 99 TPC-DS queries when running a single stream at one terabyte, a starting point for assessing performance. When the workload was increased to ten concurrent streams, Kognitio was still faster than its competitors in 80 of the 99 queries. Kognitio on Hadoop was also more reliable; it returned results within the allotted time of one hour 96 percent of the time, compared with Spark's 85 percent and Impala's 71 percent.

Speed was also a key consideration: Kognitio on Hadoop returned results up to 178.5 times faster than Spark [1], and up to 30.4 times faster than Impala [2]. After reviewing the raw data, Enterprise Management Associates analyst John Myers said, "The Kognitio platform provides the broadest range of coverage and performance for analytical and business intelligence workloads for organizations implementing big data analytic environments."
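
The headline speed figures can be reproduced from the single-query timings given in the footnotes to this release. The following is a minimal sketch of that arithmetic, not part of the benchmark itself; the dictionary layout and the `speedup` helper are illustrative names, and only the four timings come from the release.

```python
# Timings in seconds, copied from the footnotes of this release
# (single stream, 1TB scale). The structure is illustrative only.
timings = {
    "Spark, query 15": {"kognitio": 3.2, "other": 571.1},
    "Impala, query 93": {"kognitio": 2.9, "other": 88.3},
}


def speedup(kognitio_s, other_s):
    """Return how many times faster Kognitio was (ratio of elapsed times)."""
    return other_s / kognitio_s


# Round to one decimal place, matching the precision quoted in the release.
ratios = {
    name: round(speedup(t["kognitio"], t["other"]), 1)
    for name, t in timings.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x faster")
```

Each ratio describes one query only; it is the best case for that comparison, not an average across all 99 queries.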

The test results also showed that Kognitio was easier to implement; it was able to run each of the TPC-DS queries, 76 of them with no changes needed. By contrast, Spark ran only 72 of the queries "out of the box," and Impala only 55 of the 99. In fact, the tests showed that Impala was not able to support 24 of the queries, roughly 24 percent of the total.
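
For readers who want the query counts as shares of the 99-query suite, the conversion is simple. This is a sketch under the assumption that every count is measured against all 99 TPC-DS queries; the variable and function names are illustrative.

```python
TOTAL_QUERIES = 99  # size of the TPC-DS query set used in the tests

# Queries each engine ran with no changes, as reported in the release.
unmodified = {"Kognitio": 76, "Spark": 72, "Impala": 55}

# Queries Impala could not support at all, as reported in the release.
impala_unsupported = 24


def share(count, total=TOTAL_QUERIES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {engine: share(n) for engine, n in unmodified.items()}
impala_unsupported_share = share(impala_unsupported)

for engine, pct in shares.items():
    print(f"{engine}: {pct}% of queries ran unmodified")
print(f"Impala unsupported: {impala_unsupported_share}% of queries")
```

On these numbers, 24 of 99 queries works out to about 24.2 percent.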

The results take on added significance, given Hadoop's growing importance among enterprises. Industry analyst Gartner, quoted in InfoWorld [3], said "2016 spend on Hadoop distributions reached $800 million, a 40 percent spike from 2015."

Kognitio has leveraged its worldwide experience in in-memory analytics, stretching back more than a generation, to make Kognitio on Hadoop available on a free-to-use basis, without time or capacity restrictions. Kognitio has solved many challenges that competing solutions have not been able to address, such as how to run a query in-memory when the data is too large to fit in the available memory.

"The performance and functionality problems associated with existing SQL on Hadoop solutions have made connecting existing business tools to Hadoop-based data a frustrating experience for business users," said Roger Gaskell, Kognitio's CEO. "By adopting Kognitio on Hadoop, organizations can make the business users' experience a positive one."

Details of the infrastructure used for the benchmark tests, along with timings for individual queries across all three platforms, can be found on the Kognitio website at: bit.ly/TPC-DS-techinfo.

The full whitepaper can be found at bit.ly/SQL-on-Hadoop-Bench

About Kognitio
For more than a generation, Kognitio has been a pioneer in the development of scale-out, in-memory software for big data analytics. Today, Kognitio software provides an ultra-fast, high concurrency SQL layer that allows modern data visualization tools to maintain interactive performance, even when the data volume is large and the user count high. Kognitio is fully integrated with YARN on Hadoop or can be installed on standalone hardware infrastructure. In either case it can directly process data from a large variety of sources such as Hadoop HDFS (including Parquet and ORC format Hive tables), Amazon S3 or a NAS file system. The software also supports sophisticated NoSQL capabilities enabling scale-out advanced analytics alongside the ultra-fast, fully functional SQL. To learn more, visit www.kognitio.com.

[1] Single stream, 1TB scale, query #15: Kognitio, 3.2 seconds; Spark, 571.1 seconds

[2] Single stream, 1TB scale, query #93: Kognitio, 2.9 seconds; Impala, 88.3 seconds

[3] http://www.infoworld.com/article/3170127/analytics/hadoop-finds-a-happier-home-in-the-cloud.html


            
