Garbage Collection

Spark runs on the Java Virtual Machine (JVM). Because Spark can store large amounts of data in memory, it relies heavily on Java’s memory management and garbage collection (GC). GC can therefore become a major issue for many Spark applications. Common symptoms of excessive GC in Spark are slow applications, executor heartbeat timeouts, and “GC overhead limit exceeded” errors.

Spark’s memory-centric approach and data-intensive workloads make GC problems more common than in other Java applications. Thankfully, it’s easy to diagnose whether your Spark application is suffering from a GC problem: when executors spend a significant share of their CPU cycles on garbage collection, it shows up in the “Executors” tab of the Spark application UI. Spark will mark an executor in red if the executor has spent more than 10% of the time in gar...
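The 10% rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Spark's own code: the dictionary keys `id`, `task_time_ms`, and `gc_time_ms` are hypothetical names for values you would read off the “Executors” tab, and the sample numbers are made up.

```python
# Threshold from the text: more than 10% of time spent in GC.
GC_FRACTION_THRESHOLD = 0.10


def gc_fraction(task_time_ms, gc_time_ms):
    """Return the share of task time spent in GC (0.0 if no task time yet)."""
    if task_time_ms <= 0:
        return 0.0
    return gc_time_ms / task_time_ms


def executors_over_threshold(executors, threshold=GC_FRACTION_THRESHOLD):
    """Return the ids of executors whose GC share exceeds the threshold.

    `executors` is a list of dicts with the hypothetical keys
    'id', 'task_time_ms' and 'gc_time_ms'.
    """
    return [
        e["id"]
        for e in executors
        if gc_fraction(e["task_time_ms"], e["gc_time_ms"]) > threshold
    ]


# Made-up sample values, for illustration only.
sample_executors = [
    {"id": "1", "task_time_ms": 600_000, "gc_time_ms": 30_000},   # 5% in GC
    {"id": "2", "task_time_ms": 600_000, "gc_time_ms": 150_000},  # 25% in GC
    {"id": "3", "task_time_ms": 0, "gc_time_ms": 0},              # idle executor
]

flagged = executors_over_threshold(sample_executors)
print(flagged)
```

With these sample values only executor "2" crosses the threshold, which is the one the UI would be expected to highlight.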
From the login_details table, fetch the users who logged in consecutively 3 or more times.

Table Name: LOGIN_DETAILS

Approach: We need to fetch users who appear 3 or more times consecutively in the login details table. There is a window function that fetches data from the following record. Use that window function to compare the user name in the current row with the user name in the next row and in the row after that. If all three match, fetch those records.

-- Table Structure:
drop table login_details;
create table login_details(
    login_id int primary key,
    user_name varchar(50) not null,
    login_date date);

delete from login_details;
insert into login_details values
(101, 'Michael', current_date),
(102, 'James', current_date),
(103, 'Stewart', current_date+1),
(104, 'Stewart', current_date+1),
(105, 'Stewart', current_date+1),
(106, 'Michael', current_date+2),
(107, 'Michael', cur...
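One way to sketch the approach above is with the `LEAD` window function, run here through Python's built-in `sqlite3` module so the example is self-contained. Several things are assumptions rather than part of the original: SQLite stands in for the original database (window functions need SQLite 3.25 or newer), only the six fully visible rows (101 to 106) are loaded, the dates are fixed placeholder values instead of `current_date` arithmetic, and rows are ordered by `login_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE login_details (
        login_id   INTEGER PRIMARY KEY,
        user_name  VARCHAR(50) NOT NULL,
        login_date DATE
    )
    """
)

# Only the rows fully visible in the excerpt; dates are placeholders.
rows = [
    (101, "Michael", "2024-01-01"),
    (102, "James", "2024-01-01"),
    (103, "Stewart", "2024-01-02"),
    (104, "Stewart", "2024-01-02"),
    (105, "Stewart", "2024-01-02"),
    (106, "Michael", "2024-01-03"),
]
conn.executemany("INSERT INTO login_details VALUES (?, ?, ?)", rows)

# Compare each row's user with the next row and the row after that.
query = """
    SELECT DISTINCT user_name
    FROM (
        SELECT user_name,
               LEAD(user_name, 1) OVER (ORDER BY login_id) AS next_user,
               LEAD(user_name, 2) OVER (ORDER BY login_id) AS next_next_user
        FROM login_details
    ) AS t
    WHERE user_name = next_user
      AND user_name = next_next_user
"""

repeated_users = [name for (name,) in conn.execute(query)]
print(repeated_users)
conn.close()
```

`LEAD(user_name, 2)` looks two rows ahead, so a row is kept only when it starts a run of at least three identical names; `DISTINCT` collapses longer runs into a single result per user.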