SQL Leetcode

LiveRunGrow
28 min readMay 4, 2023

--

Found an online cheatsheet

https://drive.google.com/viewerng/viewer?url=https://learnsql.com/blog/sql-basics-cheat-sheet/sql-basics-cheat-sheet-a4.pdf

Inner Join returns only matching rows based on a specific condition, Cross Join returns all possible combinations of rows from both tables, Full Join returns all rows from both tables and combines matching rows, and Natural Join returns all matching rows based on the same column names and data types.

CROSS JOIN same as join X,Y

CROSS JOINs don’t have ON clauses as everything is joined with everything. FULL OUTER JOIN is a combination of LEFT OUTER and the RIGHT OUTER JOINs.

NATURAL JOIN and INNER JOIN will only return rows that has matching attributes. However, NATURAL Join will return lesser columns (The duplicate columns of the 2 tables will be dispalyed as 1 column) while INNER join will return all columns of both.

SQLs can be helpful in system design interviews or if one is interviewing for a big data engineering position or data science…….

Which type of join is the fastest?

In terms of performance, the choice between a “JOIN” and “NATURAL JOIN” does not directly impact the speed of the operation itself.

Inner Join: Use an inner join when you only need to retrieve rows that have matching values in both tables based on a specified condition. It is the most common type of join and often provides the best performance when properly indexed.

Cross Join: Use a cross join (or Cartesian join) when you want to combine every row from the first table with every row from the second table. It results in a Cartesian product of the two tables and can be resource-intensive if the tables have a large number of rows.

Some other syntaxes

coalesce(sum(Rides.distance), 0)

// If sum(XX) is null, then we just replace it with 0

SELECT IFNULL(NULL, "W3Schools.com");

Return the specified value IF the expression is NULL, otherwise return the expression.

round(x, 2) # round off to 2 decimals
(case 
when Table.column < 2 then 'abc'
when Table.column > 4 then 'def'
else 'ghi'
end) as ABC
xx between 'yyyy-mm-dd' and 'yyyy-mm-dd' // inclusive of start and end
datediff(a.visited_on,b.visited_on) between 0 and 6

// a.visited_on is bigger than b.visited_on
SELECT MIN(column_name)
FROM table_name
WHERE condition;
with A as (

)

// create tables

IFNULL and COALESCE both handle null values in SQL, but IFNULL takes two arguments and returns the first non-NULL value, while COALESCE takes multiple arguments and returns the first non-NULL expression among them.

Only keep the year and month:

DATE_FORMAT(trans_date, "%Y-%m") AS month

1294. Weather Type in Each Country

# Write your MySQL query statement below
select c.country_name, avg_weather.weather_type
from Countries as c
join
(select country_id,
case
when avg(weather_state) <= 15 then 'Cold'
when avg(weather_state) >= 25 then 'Hot'
else 'Warm'
end as weather_type

from Weather
where day between "2019-11-01" AND '2019-11-30'
group by country_id
) as avg_weather
on c.country_id = avg_weather.country_id;

OR

# Write your MySQL query statement below
select Countries.country_name,
(case
when avg(Weather.weather_state) <= 15 then 'Cold'
when avg(Weather.weather_state) >=25 then 'Hot'
else 'Warm'
end) as weather_type
from Countries join Weather
on Countries.country_id = Weather.country_id
where Weather.day between '2019-11-01' and '2019-11-30'
group by Countries.country_name

OR

# Write your MySQL query statement below
with A as (
select country_id, avg(weather_state) as average
from Weather
where day between "2019-11-01" and "2019-11-30" // inclusive
group by country_id
)

select Countries.country_name, (case
when A.average <= 15 then "Cold"
when A.average >= 25 then "Hot"
else "Warm"
end
) as weather_type
from A left join Countries
on A.country_id = Countries.country_id

1440. Evaluate Boolean Expression

select left_operand, operator, right_operand, 
case
when operator = '<' and left_operand_value < right_operand_value then 'true'
when operator = '<' and left_operand_value >= right_operand_value then 'false'
when operator = '>' and left_operand_value > right_operand_value then 'true'
when operator = '>' and left_operand_value <= right_operand_value then 'false'
when operator = '=' and left_operand_value = right_operand_value then 'true'
else 'false'
end as value
from (
select left_operand, operator, right_operand, (select value from Variables where Variables.name = Expressions.left_operand) as left_operand_value, (select value from Variables where Variables.name = Expressions.right_operand) as right_operand_value
from Expressions
) as SubTable

The Trick to thinking is to first create the small subTable…The subTable contains Expression table plus two additional columns (left operand value and right operand value).

With the values, we then attempt to evaluate the expressions by using case…

1321. Restaurant Growth

SQL Schema

Table: Customer

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| customer_id | int |
| name | varchar |
| visited_on | date |
| amount | int |
+---------------+---------+
(customer_id, visited_on) is the primary key for this table.
This table contains data about customer transactions in a restaurant.
visited_on is the date on which the customer with ID (customer_id) has visited the restaurant.
amount is the total paid by a customer.

You are the restaurant owner and you want to analyze a possible expansion (there will be at least one customer every day).

Write an SQL query to compute the moving average of how much the customer paid in a seven days window (i.e., current day + 6 days before). average_amount should be rounded to two decimal places.

Return result table ordered by visited_on in ascending order.

The query result format is in the following example.

Example 1:

Input: 
Customer table:
+-------------+--------------+--------------+-------------+
| customer_id | name | visited_on | amount |
+-------------+--------------+--------------+-------------+
| 1 | Jhon | 2019-01-01 | 100 |
| 2 | Daniel | 2019-01-02 | 110 |
| 3 | Jade | 2019-01-03 | 120 |
| 4 | Khaled | 2019-01-04 | 130 |
| 5 | Winston | 2019-01-05 | 110 |
| 6 | Elvis | 2019-01-06 | 140 |
| 7 | Anna | 2019-01-07 | 150 |
| 8 | Maria | 2019-01-08 | 80 |
| 9 | Jaze | 2019-01-09 | 110 |
| 1 | Jhon | 2019-01-10 | 130 |
| 3 | Jade | 2019-01-10 | 150 |
+-------------+--------------+--------------+-------------+
Output:
+--------------+--------------+----------------+
| visited_on | amount | average_amount |
+--------------+--------------+----------------+
| 2019-01-07 | 860 | 122.86 |
| 2019-01-08 | 840 | 120 |
| 2019-01-09 | 840 | 120 |
| 2019-01-10 | 1000 | 142.86 |
+--------------+--------------+----------------+
Explanation:
1st moving average from 2019-01-01 to 2019-01-07 has an average_amount of (100 + 110 + 120 + 130 + 110 + 140 + 150)/7 = 122.86
2nd moving average from 2019-01-02 to 2019-01-08 has an average_amount of (110 + 120 + 130 + 110 + 140 + 150 + 80)/7 = 120
3rd moving average from 2019-01-03 to 2019-01-09 has an average_amount of (120 + 130 + 110 + 140 + 150 + 80 + 110)/7 = 120
4th moving average from 2019-01-04 to 2019-01-10 has an average_amount of (130 + 110 + 140 + 150 + 80 + 110 + 130 + 150)/7 = 142.86
select a.visited_on, 
sum(b.amount) amount,
round(sum(b.amount)/7,2) average_amount
from (select visited_on, sum(amount) amount from Customer GROUP BY visited_on) a,
(select visited_on, sum(amount) amount from Customer GROUP BY visited_on) b
where datediff(a.visited_on,b.visited_on) between 0 and 6
group by a.visited_on
having count(distinct b.visited_on) = 7;


//OR//
select distinct C1.visited_on, sum(c2.amount) as amount, round(avg(c2.amount),2) as average_amount
from (select visited_on, sum(amount) amount from Customer group by visited_on) C1,
(select visited_on, sum(amount) amount from Customer group by visited_on) C2
where datediff(C1.visited_on,C2.visited_on) between 0 and 6
group by C1.visited_on
having count(distinct C2.visited_on) = 7


// OR

select C1.visited_on, sum(C2.amount) as amount, round(avg(C2.amount), 2) as average_amount
from (select visited_on, sum(amount) as amount from Customer group by visited_on) C1,
(select visited_on, sum(amount) as amount from Customer group by visited_on) C2
where datediff(C1.visited_on,C2.visited_on) between 0 and 6
group by C1.visited_on
having count(distinct C2.visited_on) = 7
order by C1.visited_on

// c1.visited_on is later than c2.visited_on

Explanation of the SQL:

  • Perform joins on the two tables, so we have each row matched with all other rows in the table (If they meet the condition that the visited on is within 7 days before)
  • Then group them by each date on the left table
  • Make sure that we have size of 7 for each group. Otherwise, exclude them because it means the particular date on the left side does not meet the criteria of having at least 6 days before.

I found a pretty good solution here and copied it:

1378. Replace Employee ID With The Unique Identifier

SQL Schema

Table: Employees

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id is the primary key for this table.
Each row of this table contains the id and the name of an employee in a company.

Table: EmployeeUNI

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| unique_id | int |
+---------------+---------+
(id, unique_id) is the primary key for this table.
Each row of this table contains the id and the corresponding unique id of an employee in the company.

Write an SQL query to show the unique ID of each user, If a user does not have a unique ID replace just show null.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Employees table:
+----+----------+
| id | name |
+----+----------+
| 1 | Alice |
| 7 | Bob |
| 11 | Meir |
| 90 | Winston |
| 3 | Jonathan |
+----+----------+
EmployeeUNI table:
+----+-----------+
| id | unique_id |
+----+-----------+
| 3 | 1 |
| 11 | 2 |
| 90 | 3 |
+----+-----------+
Output:
+-----------+----------+
| unique_id | name |
+-----------+----------+
| null | Alice |
| null | Bob |
| 2 | Meir |
| 3 | Winston |
| 1 | Jonathan |
+-----------+----------+
Explanation:
Alice and Bob do not have a unique ID, We will show null instead.
The unique ID of Meir is 2.
The unique ID of Winston is 3.
The unique ID of Jonathan is 1.
select EmployeeUNI.unique_id, Employees.name
from Employees left join EmployeeUNI
on Employees.id = EmployeeUNI.id

Use left join so we have rows even for those rows without any match…

1407. Top Travellers

Table: Users

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id is the primary key for this table.
name is the name of the user.

Table: Rides

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| user_id | int |
| distance | int |
+---------------+---------+
id is the primary key for this table.
user_id is the id of the user who traveled the distance "distance".

Write an SQL query to report the distance traveled by each user.

Return the result table ordered by travelled_distance in descending order, if two or more users traveled the same distance, order them by their name in ascending order.

The query result format is in the following example.

Example 1:

Input: 
Users table:
+------+-----------+
| id | name |
+------+-----------+
| 1 | Alice |
| 2 | Bob |
| 3 | Alex |
| 4 | Donald |
| 7 | Lee |
| 13 | Jonathan |
| 19 | Elvis |
+------+-----------+
Rides table:
+------+----------+----------+
| id | user_id | distance |
+------+----------+----------+
| 1 | 1 | 120 |
| 2 | 2 | 317 |
| 3 | 3 | 222 |
| 4 | 7 | 100 |
| 5 | 13 | 312 |
| 6 | 19 | 50 |
| 7 | 7 | 120 |
| 8 | 19 | 400 |
| 9 | 7 | 230 |
+------+----------+----------+
Output:
+----------+--------------------+
| name | travelled_distance |
+----------+--------------------+
| Elvis | 450 |
| Lee | 450 |
| Bob | 317 |
| Jonathan | 312 |
| Alex | 222 |
| Alice | 120 |
| Donald | 0 |
+----------+--------------------+
Explanation:
Elvis and Lee traveled 450 miles, Elvis is the top traveler as his name is alphabetically smaller than Lee.
Bob, Jonathan, Alex, and Alice have only one ride and we just order them by the total distances of the ride.
Donald did not have any rides, the distance traveled by him is 0.

select Users.name, coalesce(sum(Rides.distance), 0) as travelled_distance
from Users left join Rides
on Users.id = Rides.user_id
group by Rides.user_id // There might be a case where 2 users have the same name
order by travelled_distance desc, Users.name

176. Second Highest Salary

Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id | int |
| salary | int |
+-------------+------+
id is the primary key column for this table.
Each row of this table contains information about the salary of an employee.

Write an SQL query to report the second highest salary from the Employee table. If there is no second highest salary, the query should report null.

The query result format is in the following example.

Example 1:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
+----+--------+
Output:
+---------------------+
| SecondHighestSalary |
+---------------------+
| 200 |
+---------------------+

Example 2:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1 | 100 |
+----+--------+
Output:
+---------------------+
| SecondHighestSalary |
+---------------------+
| null |
+---------------------+

select ifnull (
(select distinct e2.salary
from Employee e2
order by e2.salary desc
limit 1 offset 1),
null // if null, we still want to produce null.
) as SecondHighestSalary

The following does not work

select coalesce(salary, null) as SecondHighestSalary
from Employee
order by salary desc
limit 1
offset 1

it doesn’t handle the case where there is no second highest salary. The COALESCE function is used to select the first non-null value from a list of expressions. However, in the test case below, there is no second highest salary…hence coalesce does not serve a purpose…In fact, for the test case below, there is no row produced. null is not produced.

| id | salary |
| -- | ------ |
| 1 | 100 |

180. Consecutive Numbers

Table: Logs

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| num | varchar |
+-------------+---------+
id is the primary key for this table.
id is an autoincrement column.

Write an SQL query to find all numbers that appear at least three times consecutively.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Logs table:
+----+-----+
| id | num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+
Output:
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1 |
+-----------------+
Explanation: 1 is the only number that appears consecutively for at least three times.
SELECT DISTINCT
l1.Num AS ConsecutiveNums
FROM
Logs l1,
Logs l2,
Logs l3
WHERE
l1.Id = l2.Id - 1
AND l2.Id = l3.Id - 1
AND l1.Num = l2.Num
AND l2.Num = l3.Num
;

184. Department Highest Salary

Table: Employee

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| id | int |
| name | varchar |
| salary | int |
| departmentId | int |
+--------------+---------+
id is the primary key column for this table.
departmentId is a foreign key of the ID from the Department table.
Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department.

Table: Department

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| name | varchar |
+-------------+---------+
id is the primary key column for this table. It is guaranteed that department name is not NULL.
Each row of this table indicates the ID of a department and its name.

Write an SQL query to find employees who have the highest salary in each of the departments.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Employee table:
+----+-------+--------+--------------+
| id | name | salary | departmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+
Department table:
+----+-------+
| id | name |
+----+-------+
| 1 | IT |
| 2 | Sales |
+----+-------+
Output:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
| IT | Max | 90000 |
+------------+----------+--------+
Explanation: Max and Jim both have the highest salary in the IT department and Henry has the highest salary in the Sales department.
select Department.name as Department, Employee.name as Employee, Employee.salary
from Employee join Department
on Employee.departmentId = Department.id
where (Employee.departmentId, Employee.salary) IN (
select departmentId as k, max(salary) as salary
from Employee
group by departmentId
)

178. Rank Scores

Table: Scores

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| score | decimal |
+-------------+---------+
id is the primary key for this table.
Each row of this table contains the score of a game. Score is a floating point value with two decimal places.

Write an SQL query to rank the scores. The ranking should be calculated according to the following rules:

  • The scores should be ranked from the highest to the lowest.
  • If there is a tie between two scores, both should have the same ranking.
  • After a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no holes between ranks.

Return the result table ordered by score in descending order.

The query result format is in the following example.

Example 1:

Input: 
Scores table:
+----+-------+
| id | score |
+----+-------+
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+
Output:
+-------+------+
| score | rank |
+-------+------+
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
+-------+------+
select S.score, count(distinct T.score) as 'rank'
from Scores S inner join Scores T
on S.score <= T.score
group by S.id, S.score
order by S.score DESC


// OR
// Note that rank is keyword
select S1.score, (select count(distinct s2.score) + 1 from Scores S2 where S2.score > S1.score) as 'rank'
from Scores S1
order by S1.score desc

534. Game Play Analysis III

Table: Activity

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number
of games (possibly 0) before logging out on someday using some device.

Write an SQL query to report for each player and date, how many games played so far by the player. That is, the total number of games played by the player until that date. Check the example for clarity.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 1 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Output:
+-----------+------------+---------------------+
| player_id | event_date | games_played_so_far |
+-----------+------------+---------------------+
| 1 | 2016-03-01 | 5 |
| 1 | 2016-05-02 | 11 |
| 1 | 2017-06-25 | 12 |
| 3 | 2016-03-02 | 0 |
| 3 | 2018-07-03 | 5 |
+-----------+------------+---------------------+
Explanation:
For the player with id 1, 5 + 6 = 11 games played by 2016-05-02, and 5 + 6 + 1 = 12 games played by 2017-06-25.
For the player with id 3, 0 + 5 = 5 games played by 2018-07-03.
Note that for each player we only care about the days when the player logged in.
select A1.player_id, A1.event_date, sum(A2.games_played) as games_played_so_far
from Activity A1 inner join Activity A2 on A1.player_id = A2.player_id and A2.event_date <= A1.event_date
group by A1.player_id, A1.event_date;

550. Game Play Analysis IV

Table: Activity

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.

Write an SQL query to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, you need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.

The query result format is in the following example.

Example 1:

Input: 
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Output:
+-----------+
| fraction |
+-----------+
| 0.33 |
+-----------+
Explanation:
Only the player with id 1 logged back in after the first day he had logged in so the answer is 1/3 = 0.33
select round(count(distinct A1.player_id) / (select count(distinct A3.player_id) from Activity A3), 2) as fraction
from Activity A1 join Activity A2
on A1.player_id = A2.player_id
where A1.event_date = (A2.event_date - 1) and A1.event_date = (select min(A4.event_date) from Activity A4 where A4.player_id = A1.player_id)
# The second condition is to make sure A1.event_date is the first date that the user logs in on

1098. Unpopular Books

Table: Books

+----------------+---------+
| Column Name | Type |
+----------------+---------+
| book_id | int |
| name | varchar |
| available_from | date |
+----------------+---------+
book_id is the primary key of this table.

Table: Orders

+----------------+---------+
| Column Name | Type |
+----------------+---------+
| order_id | int |
| book_id | int |
| quantity | int |
| dispatch_date | date |
+----------------+---------+
order_id is the primary key of this table.
book_id is a foreign key to the Books table.

Write an SQL query that reports the books that have sold less than 10 copies in the last year, excluding books that have been available for less than one month from today. Assume today is 2019-06-23.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Books table:
+---------+--------------------+----------------+
| book_id | name | available_from |
+---------+--------------------+----------------+
| 1 | "Kalila And Demna" | 2010-01-01 |
| 2 | "28 Letters" | 2012-05-12 |
| 3 | "The Hobbit" | 2019-06-10 |
| 4 | "13 Reasons Why" | 2019-06-01 |
| 5 | "The Hunger Games" | 2008-09-21 |
+---------+--------------------+----------------+
Orders table:
+----------+---------+----------+---------------+
| order_id | book_id | quantity | dispatch_date |
+----------+---------+----------+---------------+
| 1 | 1 | 2 | 2018-07-26 |
| 2 | 1 | 1 | 2018-11-05 |
| 3 | 3 | 8 | 2019-06-11 |
| 4 | 4 | 6 | 2019-06-05 |
| 5 | 4 | 5 | 2019-06-20 |
| 6 | 5 | 9 | 2009-02-02 |
| 7 | 5 | 8 | 2010-04-13 |
+----------+---------+----------+---------------+
Output:
+-----------+--------------------+
| book_id | name |
+-----------+--------------------+
| 1 | "Kalila And Demna" |
| 2 | "28 Letters" |
| 5 | "The Hunger Games" |
+-----------+--------------------+
select Books.book_id, Books.name
from Books left join (select * from Orders where Orders.dispatch_date >= '2018-06-23' and Orders.dispatch_date <= '2019-06-23') as O2
on O2.book_id=Books.book_id
where Books.available_from < '2019-05-23'
group by Books.book_id
having IFNULL(sum(O2.quantity), 0) < 10

Need to filter Orders first to only contain order of the recent year.

This filter cannot be put in the where clause because it will result in Books that do not have Orders in this time period to be omitted from result. We still want them to be displayed as long as they are available from xxx.

We only want this filter to be omitted from the computation of sum in the having clause. Hence, we put this filter in the Join section.

Since we use left-join, this means that even when a book does not have dispatch period within 1 year, it will still be shown in the result.

1107. New Users Daily Count

SQL Schema

Table: Traffic

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| activity | enum |
| activity_date | date |
+---------------+---------+
There is no primary key for this table, it may have duplicate rows.
The activity column is an ENUM type of ('login', 'logout', 'jobs', 'groups', 'homepage').

Write an SQL query to report for every date within at most 90 days from today, the number of users that logged in for the first time on that date. Assume today is 2019-06-30.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Traffic table:
+---------+----------+---------------+
| user_id | activity | activity_date |
+---------+----------+---------------+
| 1 | login | 2019-05-01 |
| 1 | homepage | 2019-05-01 |
| 1 | logout | 2019-05-01 |
| 2 | login | 2019-06-21 |
| 2 | logout | 2019-06-21 |
| 3 | login | 2019-01-01 |
| 3 | jobs | 2019-01-01 |
| 3 | logout | 2019-01-01 |
| 4 | login | 2019-06-21 |
| 4 | groups | 2019-06-21 |
| 4 | logout | 2019-06-21 |
| 5 | login | 2019-03-01 |
| 5 | logout | 2019-03-01 |
| 5 | login | 2019-06-21 |
| 5 | logout | 2019-06-21 |
+---------+----------+---------------+
Output:
+------------+-------------+
| login_date | user_count |
+------------+-------------+
| 2019-05-01 | 1 |
| 2019-06-21 | 2 |
+------------+-------------+
Explanation:
Note that we only care about dates with non zero user count.
The user with id 5 first logged in on 2019-03-01 so he's not counted on 2019-06-21.
select newTable.login_date, count(newTable.user_id) as user_count
from
(select user_id, min(activity_date) as login_date
from Traffic
where activity = 'login'
group by user_id)
as newTable
where datediff('2019-06-30', newTable.login_date) <= 90
group by newTable.login_date

1112. Highest Grade For Each Student

Table: Enrollments

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| student_id | int |
| course_id | int |
| grade | int |
+---------------+---------+
(student_id, course_id) is the primary key of this table.
grade is never NULL.

Write a SQL query to find the highest grade with its corresponding course for each student. In case of a tie, you should find the course with the smallest course_id.

Return the result table ordered by student_id in ascending order.

The query result format is in the following example.

Example 1:

Input: 
Enrollments table:
+------------+-------------------+
| student_id | course_id | grade |
+------------+-----------+-------+
| 2 | 2 | 95 |
| 2 | 3 | 95 |
| 1 | 1 | 90 |
| 1 | 2 | 99 |
| 3 | 1 | 80 |
| 3 | 2 | 75 |
| 3 | 3 | 82 |
+------------+-----------+-------+
Output:
+------------+-------------------+
| student_id | course_id | grade |
+------------+-----------+-------+
| 1 | 2 | 99 |
| 2 | 2 | 95 |
| 3 | 3 | 82 |
+------------+-----------+-------+
# select E1.student_id, E1.course_id, E1.grade
# from Enrollments E1
# where E1.grade > ALL(
# select E2.grade
# from Enrollments E2
# where E2.student_id = E1.student_id
# and E1.course_id <> E2.course_id
# )
# order by E1.student_id
# THe above does not work because in the case where we have more than 2 courses
# with the same max score there won't be a row for it..

SELECT student_id, MIN(course_id) AS course_id, grade
FROM Enrollments
WHERE (student_id, grade) IN
(
SELECT student_id, MAX(grade)
FROM Enrollments
GROUP BY student_id
)
GROUP BY student_id
ORDER BY student_id ASC

175. Combine Two Tables

SQL Schema

Table: Person

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| personId | int |
| lastName | varchar |
| firstName | varchar |
+-------------+---------+
personId is the primary key column for this table.
This table contains information about the ID of some persons and their first and last names.

Table: Address

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| addressId | int |
| personId | int |
| city | varchar |
| state | varchar |
+-------------+---------+
addressId is the primary key column for this table.
Each row of this table contains information about the city and state of one person with ID = PersonId.

Write an SQL query to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Person table:
+----------+----------+-----------+
| personId | lastName | firstName |
+----------+----------+-----------+
| 1 | Wang | Allen |
| 2 | Alice | Bob |
+----------+----------+-----------+
Address table:
+-----------+----------+---------------+------------+
| addressId | personId | city | state |
+-----------+----------+---------------+------------+
| 1 | 2 | New York City | New York |
| 2 | 3 | Leetcode | California |
+-----------+----------+---------------+------------+
Output:
+-----------+----------+---------------+----------+
| firstName | lastName | city | state |
+-----------+----------+---------------+----------+
| Allen | Wang | Null | Null |
| Bob | Alice | New York City | New York |
+-----------+----------+---------------+----------+
Explanation:
There is no address in the address table for the personId = 1 so we return null in their city and state.
addressId = 1 contains information about the address of personId = 2.
# Write your MySQL query statement below

select Person.firstName, Person.lastName, Address.city, Address.state
from Person left join Address
on Person.personId = Address.personId

// Both works, but coalesce is unnecessary
select Person.firstName, Person.lastName, coalesce(Address.city, null) as city, coalesce(Address.state, null) as state
from Person left join Address
on Person.personId = Address.personId

570. Managers with at Least 5 Direct Reports

Table: Employee

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| name | varchar |
| department | varchar |
| managerId | int |
+-------------+---------+
id is the primary key column for this table.
Each row of this table indicates the name of an employee, their department, and the id of their manager.
If managerId is null, then the employee does not have a manager.
No employee will be the manager of themself.

Write an SQL query to report the managers with at least five direct reports.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Employee table:
+-----+-------+------------+-----------+
| id | name | department | managerId |
+-----+-------+------------+-----------+
| 101 | John | A | None |
| 102 | Dan | A | 101 |
| 103 | James | A | 101 |
| 104 | Amy | A | 101 |
| 105 | Anne | A | 101 |
| 106 | Ron | B | 101 |
+-----+-------+------------+-----------+
Output:
+------+
| name |
+------+
| John |
+------+
select E1.name
from Employee E1
where (
select count(*)
from Employee E2
where E2.managerId = E1.id
) >= 5

574. Winning Candidate

Table: Candidate

+-------------+----------+
| Column Name | Type |
+-------------+----------+
| id | int |
| name | varchar |
+-------------+----------+
id is the primary key column for this table.
Each row of this table contains information about the id and the name of a candidate.

Table: Vote

+-------------+------+
| Column Name | Type |
+-------------+------+
| id | int |
| candidateId | int |
+-------------+------+
id is an auto-increment primary key.
candidateId is a foreign key to id from the Candidate table.
Each row of this table determines the candidate who got the ith vote in the elections.

Write an SQL query to report the name of the winning candidate (i.e., the candidate who got the largest number of votes).

The test cases are generated so that exactly one candidate wins the elections.

The query result format is in the following example.

Example 1:

Input: 
Candidate table:
+----+------+
| id | name |
+----+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
+----+------+
Vote table:
+----+-------------+
| id | candidateId |
+----+-------------+
| 1 | 2 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 5 |
+----+-------------+
Output:
+------+
| name |
+------+
| B |
+------+
Explanation:
Candidate B has 2 votes. Candidates C, D, and E have 1 vote each.
The winner is candidate B.

with V3 as (
select V2.candidateId, count(*) as count
from Vote V2
group by V2.candidateId
)

select C.name
from Candidate as C
where C.id in (
select V3.candidateId
from V3
where V3.count = (
select max(count)
from V3
)
)

578. Get Highest Answer Rate Question

Table: SurveyLog

+-------------+------+
| Column Name | Type |
+-------------+------+
| id | int |
| action | ENUM |
| question_id | int |
| answer_id | int |
| q_num | int |
| timestamp | int |
+-------------+------+
There is no primary key for this table. It may contain duplicates.
action is an ENUM of the type: "show", "answer", or "skip".
Each row of this table indicates the user with ID = id has taken an action with the question question_id at time timestamp.
If the action taken by the user is "answer", answer_id will contain the id of that answer, otherwise, it will be null.
q_num is the numeral order of the question in the current session.

The answer rate for a question is the number of times a user answered the question by the number of times a user showed the question.

Write an SQL query to report the question that has the highest answer rate. If multiple questions have the same maximum answer rate, report the question with the smallest question_id.

The query result format is in the following example.

Example 1:

Input: 
SurveyLog table:
+----+--------+-------------+-----------+-------+-----------+
| id | action | question_id | answer_id | q_num | timestamp |
+----+--------+-------------+-----------+-------+-----------+
| 5 | show | 285 | null | 1 | 123 |
| 5 | answer | 285 | 124124 | 1 | 124 |
| 5 | show | 369 | null | 2 | 125 |
| 5 | skip | 369 | null | 2 | 126 |
+----+--------+-------------+-----------+-------+-----------+
Output:
+------------+
| survey_log |
+------------+
| 285 |
+------------+
Explanation:
Question 285 was showed 1 time and answered 1 time. The answer rate of question 285 is 1.0
Question 369 was showed 1 time and was not answered. The answer rate of question 369 is 0.0
Question 285 has the highest answer rate.
SELECT question_id survey_log 
FROM SurveyLog
GROUP BY question_id
ORDER BY COUNT(answer_id)/COUNT(question_id) DESC, question_id
LIMIT 1

1193. Monthly Transactions I

Table: Transactions

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| country | varchar |
| state | enum |
| amount | int |
| trans_date | date |
+---------------+---------+
id is the primary key of this table.
The table has information about incoming transactions.
The state column is an enum of type ["approved", "declined"].

Write an SQL query to find for each month and country, the number of transactions and their total amount, the number of approved transactions and their total amount.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
Transactions table:
+------+---------+----------+--------+------------+
| id | country | state | amount | trans_date |
+------+---------+----------+--------+------------+
| 121 | US | approved | 1000 | 2018-12-18 |
| 122 | US | declined | 2000 | 2018-12-19 |
| 123 | US | approved | 2000 | 2019-01-01 |
| 124 | DE | approved | 2000 | 2019-01-07 |
+------+---------+----------+--------+------------+
Output:
+----------+---------+-------------+----------------+--------------------+-----------------------+
| month | country | trans_count | approved_count | trans_total_amount | approved_total_amount |
+----------+---------+-------------+----------------+--------------------+-----------------------+
| 2018-12 | US | 2 | 1 | 3000 | 1000 |
| 2019-01 | US | 1 | 1 | 2000 | 2000 |
| 2019-01 | DE | 1 | 1 | 2000 | 2000 |
+----------+---------+-------------+----------------+--------------------+-----------------------+
# # Write your MySQL query statement below

SELECT DATE_FORMAT(trans_date, "%Y-%m") AS month,
country,
COUNT(id) AS trans_count,
COUNT(CASE WHEN state = "approved" THEN id END) AS approved_count,
SUM(amount) AS trans_total_amount,
coalesce(SUM(CASE WHEN state = "approved" THEN amount END),0) AS approved_total_amount
FROM Transactions
GROUP BY month, country



# with Approved as (
# select DATE_FORMAT(trans_date, "%Y-%m") as month, sum(amount) as approved_amount, null as declined_amount, t.*
# from Transactions t
# where state = "approved"
# )

# with Declined as (
# select DATE_FORMAT(trans_date, "%Y-%m") as month, null as approved_amount, sum(amount) as declined_amount, t.*
# from Transactions t
# where state = "declined"
# )

# with JoinedTable as (
# Approved
# union
# Declined
# )

# select month, country, count(id) as trans_count, count(approved_amount) as approved_count, sum(amount) as trans_total_amount, sum(approved_amount) as approved_total_amount
# from JoinedTable
# group by month, country
# The above query is wrong becos we cannot define >1 CTE

Design a database schema for storing user information.

name, age, address.

Address could be home, office or school.

Individuals may share the same home or school address

icbc

--

--

LiveRunGrow

𓆉︎ 𝙳𝚛𝚎𝚊𝚖𝚎𝚛 🪴𝙲𝚛𝚎𝚊𝚝𝚘𝚛 👩‍💻𝚂𝚘𝚏𝚝𝚠𝚊𝚛𝚎 𝚎𝚗𝚐𝚒𝚗𝚎𝚎𝚛 ☻ I write & reflect weekly about software engineering, my life and books. Ŧ๏ɭɭ๏ฬ ๓є!