Spark scenario-based questions

Consider you have a 40-node Spark cluster (32 cores x 128 GB per node).

You are processing roughly 3.75 TB of data.

The processing involves filtering, aggregations, joins, etc.

You are getting an out-of-memory error when you run your Spark job.
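
Before getting into the questions, it helps to have a concrete picture of how such a job might be configured. The PySpark sketch below is one plausible baseline for a cluster of this size; the executor count, cores, and memory figures are illustrative assumptions, not values given in the scenario.

from pyspark.sql import SparkSession

# Illustrative baseline for a 40-node cluster (32 cores x 128 GB per node),
# assuming the job is launched with spark-submit on YARN or a similar
# resource manager. With ~5 cores per executor, each node can host about
# 6 executors, leaving roughly 18 GB of heap plus ~3 GB of overhead each.
# These numbers are assumptions for the sketch, not from the original post.
spark = (
    SparkSession.builder
    .appName("spark-oom-scenario")
    .config("spark.executor.instances", "235")   # ~6 per node, minus driver/AM
    .config("spark.executor.cores", "5")
    .config("spark.executor.memory", "18g")
    .config("spark.executor.memoryOverhead", "3g")
    .getOrCreate()
)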

Question 1:
===========

What could be all the possible reasons for out-of-memory errors?
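
As a starting point for thinking about this, the sketch below shows a few code patterns that commonly sit behind such errors. The DataFrames are tiny stand-ins so the example runs; in the scenario they would be multi-terabyte tables, and the table and column names are made up for illustration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("oom-causes-sketch").getOrCreate()

# Tiny stand-in DataFrames so the sketch runs; names are hypothetical.
orders = spark.createDataFrame(
    [(1, 100.0), (1, 250.0), (2, 75.0)], ["customer_id", "amount"]
)
customers = spark.createDataFrame([(1, "A"), (2, "B")], ["customer_id", "name"])

# 1. Driver-side OOM: collect() pulls the entire result into driver memory.
rows = orders.collect()

# 2. Executor-side OOM from a skewed shuffle: if most rows share one
#    customer_id, a single task receives most of the joined data.
joined = orders.join(customers, "customer_id")

# 3. Oversized partitions during aggregation when the shuffle partition
#    count is too low for the data volume.
totals = joined.groupBy("customer_id").agg(F.sum("amount").alias("total"))
totals.show()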

Question 2:
==========

What are the ways to identify the exact issue?
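
Alongside the Spark UI (stage-level shuffle read/write sizes, task-time skew) and the driver and executor logs, a few programmatic checks can help narrow down where the memory pressure comes from. The sketch continues the hypothetical DataFrames from above.

# Continues the hypothetical `joined` DataFrame from the sketch above.

# Partition count: a very low count means each task handles a huge slice.
print(joined.rdd.getNumPartitions())

# Rows per partition: a heavily skewed distribution points to a hot key.
partition_sizes = joined.rdd.glom().map(len).collect()
print(sorted(partition_sizes, reverse=True)[:10])

# Physical plan: shows whether Spark chose a sort-merge or broadcast join
# and where the shuffles (exchanges) happen.
joined.explain(mode="formatted")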

Question 3:
==========

What are the ways to fix it?
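
Without prescribing a single answer, the sketch below shows a few knobs and code changes that are commonly tried for this kind of failure; the values are assumptions and would need to be validated against the job's own metrics. It continues the session and DataFrames from the earlier sketches.

from pyspark.sql import functions as F

# Continues `spark`, `orders`, and `customers` from the sketches above.

# Let adaptive query execution coalesce shuffle partitions and split
# skewed ones at runtime (Spark 3.x).
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

# Raise the shuffle partition count so each task holds less data in memory.
spark.conf.set("spark.sql.shuffle.partitions", "2000")

# Broadcast the smaller side of the join instead of shuffling both sides.
joined = orders.join(F.broadcast(customers), "customer_id")

# Write results out instead of collecting them to the driver.
(joined.groupBy("customer_id")
       .agg(F.sum("amount").alias("total"))
       .write.mode("overwrite")
       .parquet("/tmp/customer_totals"))   # output path is a placeholder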
