Complex SQL: fetch all the records when London had extremely cold temperature for 3 consecutive days or more

From the weather table, fetch all the records when London had extremely cold temperature for 3 consecutive days or more

 

Note: Weather is considered to be extremely cold when its temperature is less than zero.

Table Name: WEATHER

Approach: First using a sub query identify all the records where the temperature was very cold and then use a main query to fetch only the records returned as very cold from the sub query. You will not only need to compare the records following the current row but also need to compare the records preceding the current row. And may also need to compare rows preceding and following the current row. Identify a window function which can do this comparison pretty easily.

 

SOLUTION:

SELECT * FROM(
SELECT ID, CITY, TEMPERATURE, DAY,
CASE WHEN TEMPERATURE<0 AND TEMPERATURE > LEAD(TEMPERATURE) OVER (ORDER BY ID) AND
TEMPERATURE > LEAD(TEMPERATURE,2) OVER (ORDER BY ID) THEN ‘YES’
WHEN TEMPERATURE<0 AND TEMPERATURE < LAG(TEMPERATURE) OVER (ORDER BY ID) AND
TEMPERATURE > LEAD(TEMPERATURE) OVER (ORDER BY ID) THEN ‘YES’
WHEN TEMPERATURE<0 AND TEMPERATURE < LAG(TEMPERATURE) OVER (ORDER BY ID) AND
TEMPERATURE < LAG(TEMPERATURE,2) OVER (ORDER BY ID) THEN ‘YES’ ELSE NULL
END AS CONSIQUITIVE_FLAG
FROM WEATHER) WHERE CONSIQUITIVE_FLAG=’YES’;

OUTPUT:

ID, CITY, TEMPERATURE, DAY, CONSIQUITIVE_FLAG

5 London -2 05-01-21 YES

6 London -5 06-01-21 YES
7 London -7 07-01-21 YES

Popular posts from this blog

Window function in PySpark with Joins example using 2 Dataframes (inner join)

Complex SQL: fetch the users who logged in consecutively 3 or more times (lead perfect example)

Credit Card Data Analysis using PySpark (how to use auto broadcast join after disabling it)