Title: "Apache Spark - Lightning-fast cluster computing"
Date: 2016-09-01 03:03:11
Categories: [data processing]
tags: [spark]
Slug: apache-spark-lightning-fast-cluster-computing
Authors: sedlav

Apache Spark™ is a fast and general engine for large-scale data processing.

Speed: Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

Ease of Use: Write applications quickly in Java, Scala, Python, or R.
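
To give a feel for what a short Spark application can look like in Python, here is a minimal word-count sketch. It assumes the `pyspark` package is installed and runs Spark locally rather than on a cluster. The names `tokenize`, `count_words`, and the app name `wordcount-sketch` are made up for this illustration.

```python
import re


def tokenize(line):
    """Split a line into lowercase words (plain Python, no Spark needed)."""
    return re.findall(r"[a-z0-9']+", line.lower())


def count_words(lines):
    """Count words with Spark's RDD API, running locally on all cores.

    Assumption: pyspark is installed. It is imported here, not at module
    level, so that tokenize() stays usable without it.
    """
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")            # local mode; a real cluster would use another master URL
        .appName("wordcount-sketch")   # hypothetical application name
        .getOrCreate()
    )
    try:
        counts = (
            spark.sparkContext.parallelize(lines)
            .flatMap(tokenize)                 # one record per word
            .map(lambda word: (word, 1))       # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)   # sum the counts for each word
        )
        return dict(counts.collect())          # bring the results back to the driver
    finally:
        spark.stop()
```

Calling `count_words(["Spark is fast", "Spark is general"])` returns a dictionary mapping each word to its count. The same three-step pipeline (`flatMap`, `map`, `reduceByKey`) is also available in the Scala and Java APIs.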

Generality: Combine SQL, streaming, and complex analytics.

Runs Everywhere: Spark runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
