Faster, more capable: What Apache Spark brings to Hadoop

The Crayon Blog

Tech Articles | Published February 11, 2014  |   Tejeswini Kashyappan

Apache Spark is an execution engine that broadens the types of computing workloads Hadoop can handle, while also improving the performance of the big data framework.

Hadoop specialist Cloudera recently announced that it will offer commercial support for Apache Spark, which is available as part of Cloudera’s Hadoop-powered Enterprise Data Hub. But why should businesses care about Spark?

Apache Spark has numerous advantages over Hadoop’s MapReduce execution engine, both in the speed with which it carries out batch-processing jobs and in the wider range of computing workloads it can handle.

According to Cloudera, Spark can execute batch-processing jobs between 10 and 100 times faster than the MapReduce engine, primarily by reducing the number of writes to and reads from disc.

“You have map and reduce tasks and after that there’s a synchronisation barrier and you persist all of the data to disc,” said Mark Grover, Hadoop engineer for Cloudera.
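To make the disc barrier Grover describes concrete, here is a toy sketch in plain Python. It is not Spark or Hadoop code and uses neither project's API: the function names, the two-stage word-count job and the operation tallies are all hypothetical, chosen only to show the same two stages chained once through a file on disc and once in memory.

```python
import json
import os
import tempfile
from collections import Counter


def count_words(lines):
    """Stage 1: split each line into words and reduce to per-word counts."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return dict(counts)


def top_words(counts, n):
    """Stage 2: keep the n most frequent words (ties broken alphabetically)."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def run_with_disc_barrier(lines, n, workdir):
    """MapReduce-style chaining: stage 1 persists its output before stage 2 starts."""
    path = os.path.join(workdir, "stage1.json")
    with open(path, "w") as f:      # write the intermediate result to disc
        json.dump(count_words(lines), f)
    with open(path) as f:           # read it back for the next stage
        counts = json.load(f)
    # The tally is hard-coded for illustration: one write plus one read.
    return top_words(counts, n), 2


def run_in_memory(lines, n):
    """In-memory chaining: stage 2 receives stage 1's result directly."""
    return top_words(count_words(lines), n), 0


lines = ["spark runs on hadoop", "hadoop stores data", "spark caches data"]
with tempfile.TemporaryDirectory() as workdir:
    on_disc, disc_ops = run_with_disc_barrier(lines, 2, workdir)
in_memory, memory_ops = run_in_memory(lines, 2)

print(on_disc == in_memory, disc_ops, memory_ops)
```

Both paths produce the same answer; the only difference is the round trip to disc between the stages. In a real cluster that round trip is paid on every chained job and across many machines, which is the kind of cost the speed-up figures quoted above are attributed to.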

