Cloudera is playing games, claims Actian SVP


Taking aim at Cloudera for cleverly modifying the queries in a published SQL on Hadoop benchmark comparison, Emma McGrattan, SVP of Engineering at Actian, said she does not trust benchmarks unless they come from a body like the TPC and are audited.

“In this particular case, which we call the Impala subset of TPC-DS, they identified the queries where they were going to outperform Hive, they rewrote them because they have limited SQL support, and then they added the partition keys. So it's a game - they cheated and I call them on their cheating when I use this presentation publicly because I think it's wrong.”

“Everybody is playing games with these benchmarks and we can use them to demonstrate whatever you choose. But for us we're using standard SQL. We have clean hands in this and we also plan something for the first half of next year to publish some audited benchmark results.”

In a detailed interview published on HadoopSphere, Emma provides a complete picture of the SQL on Hadoop landscape and what the major players are up to. Claiming that Vector, Actian's SQL ‘in’ Hadoop offering, is 30 times faster than Cloudera Impala, Emma provides a detailed architectural view of the product. She is candid enough to say that Cloudera has borrowed quite a few ideas from Actian’s solutions. “Now I can’t go into too much detail as to exactly how we've done this because, as I say, a patent is pending and we do believe when we look at what Cloudera is doing in Kudu, they have borrowed a number of the ideas - because up until now we have been talking about this publicly and the Cloudera guys have learned a lot from development work and research work that we have done at Actian.”


Read the full, no-holds-barred interview in two parts at the following links. You may want to bookmark it and come back to it whenever the conversation turns to SQL on Hadoop.