Hadoop

Fundamentals of Apache Spark…

You can view my other articles on Spark RDD at the links below… Apache Spark RDD API using Pyspark… Tips and Tricks for Apache Spark RDD API, Dataframe API

How did Spark become so much more efficient at data processing than MapReduce? It comes with an advanced Directed Acyclic Graph (DAG) execution engine. This means that for every Spark job, a DAG of tasks is created and then executed by the engine. In mathematical terms, a DAG consists of a set of vertices and directed edges connecting them, with no path that leads from a vertex back to itself. The tasks are executed in the order laid out by the DAG. […]
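To make the idea of "tasks executed as per the DAG layout" concrete, here is a minimal sketch in plain Python. It is not Spark's scheduler and uses no Spark API; it only assumes Python 3.9+ for the standard-library `graphlib` module. The task names (`read_orders`, `read_customers`, `join`, `aggregate`, `write_output`) and the helper `run_order` are hypothetical, chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical job: vertices are tasks, and each task maps to the set of
# tasks it depends on (its incoming directed edges).
dag = {
    "read_orders": set(),
    "read_customers": set(),
    "join": {"read_orders", "read_customers"},
    "aggregate": {"join"},
    "write_output": {"aggregate"},
}


def run_order(graph):
    """Return one valid execution order for the tasks in `graph`.

    Every task appears after all of its dependencies. If the graph
    contains a cycle, graphlib raises CycleError, which is why the
    graph has to be acyclic for an order to exist at all.
    """
    return list(TopologicalSorter(graph).static_order())


order = run_order(dag)
print(order)
```

The two read tasks have no edge between them, so either may come first; an engine is free to run such independent tasks in parallel. Only the edges constrain the order: `join` must wait for both reads, `aggregate` for `join`, and `write_output` for `aggregate`.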