COMPLEX QUERY: Prefer the account id with the least value in case of same number of unique patients

Note: Prefer the account id with the least value in case of same number of unique patients

Table Name: PATIENT_LOGS

Approach: First convert the date to month format since we need the output specific to each month. Then group together all data based on each month and account id so you get the total no of patients belonging to each account per month basis.

Then rank this data as per no of patients in descending order and account id in ascending order so in case there are same no of patients present under multiple account if then the ranking will prefer the account if with lower value. Finally, choose upto 2 records only per month to arrive at the final output.

SOLUTION:

SELECT ACCOUNT_ID, MONTH, PATINET_PER_MONTH

FROM(
SELECT ACCOUNT_ID,MONTH,PATINET_PER_MONTH,
ROW_NUMBER() OVER (PARTITION BY MONTH ORDER BY ACCOUNT_ID DESC) AS RN
FROM
(
SELECT ACCOUNT_ID,MONTH, COUNT(2) PATINET_PER_MONTH FROM(
select DISTINCT ACCOUNT_ID, PATIENT_ID,TO_CHAR(date1,’MONTH’) MONTH
from patient_logs
)
GROUP BY ACCOUNT_ID,MONTH)
) WHERE RN<3;

 

OUTPUT:

ACCOUNT_ID, MONTH, PATINET_PER_MONTH

3 JANUARY   2
2 JANUARY   3
1 MARCH     1

Popular posts from this blog

Window function in PySpark with Joins example using 2 Dataframes (inner join)

Complex SQL: fetch the users who logged in consecutively 3 or more times (lead perfect example)

Credit Card Data Analysis using PySpark (how to use auto broadcast join after disabling it)