写在前面
预备知识
联结(join)
SQL最强大的功能之一就是能在数据检索查询的执行中联结(join) 表。联结是利用SQL的SELECT能执行的最重要的操作,很好地理解联结 及其语法是学习SQL的一个极为重要的组成部分。 – MySQL 必知必会
- 外键(foreignkey) 外键为某个表中的一列,它包含另一个表 的主键值,定义了两个表之间的关系。
- 联结是一种机制,用来在一条SELECT 语句中关联表,因此称之为联结。使用特殊的语法,可以联结多个表返 回一组输出,联结在运行时关联表中正确的行。
- 通过相等的值连接两个表的方法为等值联结(equijoin),它基于两个表之间的 相等测试。这种联结也称为内部联结。
典型题目
- 1251. 平均售价;(先计算然后作为子查询结果获得最终结果)
排名/最值问题
SELECT ifnull((SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1), null)
AS "SecondHighestSalary";
或者:
SELECT MAX(salary) AS 'SecondHighestSalary'
FROM Employee
WHERE salary < (SELECT MAX(salary)
FROM Employee);
简化处理:
select (select distinct salary
from Employee
order by salary desc
limit 1 offset 1) as 'SecondHighestSalary';
取巧的做法:
CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
DETERMINISTIC
BEGIN
SET N := N - 1;
RETURN (
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET N);
END;
下面的解法来自: Solution;
一般解法: (子查询, 但是很慢, 相当于$O(N^2)$)
CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
RETURN (SELECT DISTINCT e.salary
FROM Employee e
WHERE (SELECT COUNT(DISTINCT salary)
FROM Employee
WHERE salary > e.salary) = N - 1);
END;
最优解法, 先取前 N 个然后找最小, 同时记录次数便于最后的 null 判断.
来自: Solution;
CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
RETURN (SELECT IF(cnt < N, null, mini)
FROM (SELECT MIN(salary) mini, COUNT(*) cnt
FROM (SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT N) a) b);
END;
178. 分数排名; (通过计算前面有多少个比自己大的值来实现, 因为是连续排名所以可以这样做)
SELECT a.score,
(SELECT COUNT(DISTINCT b.score)
FROM Scores b
WHERE b.score >= a.score) AS 'rank'
FROM Scores a
ORDER BY score DESC;
如果改成不连续排名呢? 需要用变量
ref;
SELECT score, `rank`
FROM (SELECT score,
@curRank := IF(@prevRank = score, @curRank, @incRank) AS `rank`,
@incRank := @incRank + 1,
@prevRank := score
FROM Scores p,
(SELECT @curRank := 0, @prevRank := NULL, @incRank := 1) r
ORDER BY score DESC) s;
SELECT d.name AS 'Department', e.name AS 'Employee', e.salary AS 'Salary'
FROM Employee e
JOIN Department d ON e.departmentId = d.id
WHERE (e.departmentId, e.salary) IN (SELECT e.departmentId, MAX(e.salary)
FROM Employee e
GROUP BY e.departmentId);
行列互换
这俩都是 easy, 不过第一次做得熟悉套路才能做出来
1795. 每个产品在不同商店的价格;(行变列)
SELECT p.product_id, 'store1' store, p.store1 price
FROM Products p
WHERE p.store1 IS NOT NULL
UNION
SELECT p.product_id, 'store2' store, p.store2 price
FROM Products p
WHERE p.store2 IS NOT NULL
UNION
SELECT p.product_id, 'store3' store, p.store3 price
FROM Products p
WHERE p.store3 IS NOT NULL;
1179. 重新格式化部门表;(列变行)
SELECT id,
SUM(CASE month WHEN 'Jan' THEN revenue END) AS Jan_Revenue,
SUM(CASE month WHEN 'Feb' THEN revenue END) AS Feb_Revenue,
SUM(CASE month WHEN 'Mar' THEN revenue END) AS Mar_Revenue,
SUM(CASE month WHEN 'Apr' THEN revenue END) AS Apr_Revenue,
SUM(CASE month WHEN 'May' THEN revenue END) AS May_Revenue,
SUM(CASE month WHEN 'Jun' THEN revenue END) AS Jun_Revenue,
SUM(CASE month WHEN 'Jul' THEN revenue END) AS Jul_Revenue,
SUM(CASE month WHEN 'Aug' THEN revenue END) AS Aug_Revenue,
SUM(CASE month WHEN 'Sep' THEN revenue END) AS Sep_Revenue,
SUM(CASE month WHEN 'Oct' THEN revenue END) AS Oct_Revenue,
SUM(CASE month WHEN 'Nov' THEN revenue END) AS Nov_Revenue,
SUM(CASE month WHEN 'Dec' THEN revenue END) AS Dec_Revenue
FROM Department
GROUP BY id;
叉积的使用
由没有联结条件的表关系返回的结果为笛卡儿积。检索出的行的数目将是第一个表中的行数乘 以第二个表中的行数。
SELECT DISTINCT l1.num ConsecutiveNums
FROM Logs l1
CROSS JOIN Logs l2
CROSS JOIN Logs l3
WHERE l1.num = l2.num
AND l2.num = l3.num
AND l1.id = l2.id - 1
AND l2.id = l3.id - 1;
上面的解法只适用于这个题目, 如果连续出现 N 次呢? 可以用
ROW_NUMBER() OVER
函数来操作行号, 实现连续出现任意次的查询 参考;
SELECT DISTINCT Num ConsecutiveNums
FROM (SELECT Id,
Num,
ROW_NUMBER() OVER (ORDER BY Id) -
ROW_NUMBER() OVER (PARTITION BY Num ORDER BY Id) AS tmp
FROM Logs) Sub
GROUP BY Num, tmp
HAVING COUNT(1) >= 3;
1280. 学生们参加各科测试的次数;(这个题在处理上也需要叉积来做)
SELECT stu.student_id,
stu.student_name,
sub.subject_name,
COUNT(e.subject_name) attended_exams
FROM Subjects sub
CROSS JOIN Students stu
LEFT JOIN Examinations e
ON stu.student_id = e.student_id
AND sub.subject_name = e.subject_name
GROUP BY stu.student_name, stu.student_id, sub.subject_name
ORDER BY stu.student_id, sub.subject_name;
聚集函数的综合运用
笨办法:
SELECT ROUND(COUNT(a.player_id) /
(SELECT COUNT(DISTINCT player_id)
FROM Activity), 2) AS 'fraction'
FROM Activity a
JOIN (SELECT a.player_id pid,
MIN(event_date) first
FROM Activity a
GROUP BY a.player_id) tmp ON tmp.pid = a.player_id
WHERE DATEDIFF(event_date, first) = 1;
使用 AVG 函数:
SELECT ROUND(AVG(a.player_id IS NOT NULL), 2) AS 'fraction'
FROM (SELECT player_id, MIN(event_date) AS login
FROM Activity
GROUP BY player_id) tmp
LEFT JOIN Activity a
ON tmp.player_id = a.player_id
AND DATEDIFF(a.event_date, tmp.login) = 1;
一些技巧
- 题目中说了如果没有某些键值, 则使用 null 填充, 这就说明要使用
outer join
. 或者使用IFNULL()
函数.