Covid Data Analysis with Bed Availability and Other Details
Covid Data Analysis:

I have provided a small sample dataset. I also ran the same program on a cluster against 10 GB of data with 10 mappers, and it took around 25 seconds to process the data.

We have added a Partitioner just to understand how the data is partitioned and how a mapper is assigned to process each partition.

What is covered:
- A cache, implemented to boost performance
- Country-wise total cases
- Country-wise new cases
- Country-wise other details such as available beds, booster details, etc.

For more details, please see the Git repository: https://github.com/Deepak-Bhardwaj-Architect/CovidDataAnalysis

Implementation:

    package com.citi.covid.spark.driver;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class CovidDataAnalysis {
        public static void main(String[] args) throws InterruptedException {
            JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName...
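
Separately from the listing above, here is a rough sketch of the partitioning idea mentioned earlier: each record's key (here, a country name) is hashed to a partition index, and every record with the same key ends up in the same partition. This is plain Java with no Spark dependency, and it is not the project's actual Partitioner. The class name CountryPartitionSketch, the methods partitionFor and assign, and the sample country list are all hypothetical. The hash-modulo rule is an assumption that resembles how hash-based partitioners commonly work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from the CovidDataAnalysis repository.
public class CountryPartitionSketch {

    // Maps a key to a partition index in [0, numPartitions).
    // Math.floorMod keeps the result non-negative even when hashCode() is negative.
    static int partitionFor(String country, int numPartitions) {
        if (country == null) {
            return 0; // assumption: send null keys to the first partition
        }
        return Math.floorMod(country.hashCode(), numPartitions);
    }

    // Groups the given keys by the partition index they hash to.
    static Map<Integer, List<String>> assign(List<String> countries, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String country : countries) {
            int partition = partitionFor(country, numPartitions);
            byPartition.computeIfAbsent(partition, k -> new ArrayList<>()).add(country);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // Made-up sample keys; a real run would read them from the dataset.
        List<String> countries = Arrays.asList("India", "Brazil", "Japan", "India", "Kenya");
        assign(countries, 3).forEach((partition, keys) ->
                System.out.println("partition " + partition + " -> " + keys));
    }
}
```

Running main prints which keys fall into each of the three partitions. Repeated keys such as "India" always share a partition, which is what lets a single task compute a per-country total without seeing the other partitions.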