Posts

Showing posts from January, 2021

Java Tutorials

I have written many Java tutorials here, spread across several categories. Some belong to the core Java tutorial, while others fall under the J2EE (Java EE) tutorial. As the number of posts grows, keeping track of them becomes harder, so I have provided a summary post for most categories that lists the posts in the order they are best read. This post aims to collect all the summary posts in the Java tutorials that you should go through to get a clear understanding of each area.

Spark Advanced Tutorials (Complete guide book)

Introduction: Apache Spark is a general-purpose cluster computing system for processing big data workloads. What sets Spark apart from its predecessors, such as MapReduce, is its speed, ease of use, and sophisticated analytics. Apache Spark was originally developed at AMPLab, UC Berkeley, in 2009. It was made open source in 2010 under the BSD license and switched to the Apache 2.0 license in 2013. Toward the later part of 2013, the creators of Spark founded Databricks to focus on Spark’s development and future releases. On speed, Spark can achieve sub-second latency on big data workloads. To achieve such low latency, Spark uses memory for storage. In MapReduce, memory is primarily used for the computation itself. Spark uses memory both to compute and to store objects. Spark also provides a unified runtime that connects to various big data storage sources, such as HDFS, Cassandra, HBase, and S3. It also provides a rich set of higher-level libraries for different big data compute