Sql题目总结

 
Category: SQL

写在前面

预备知识

联结(join)

SQL最强大的功能之一就是能在数据检索查询的执行中联结(join) 表。联结是利用SQL的SELECT能执行的最重要的操作,很好地理解联结 及其语法是学习SQL的一个极为重要的组成部分。 – MySQL 必知必会

  • 外键(foreignkey) 外键为某个表中的一列,它包含另一个表 的主键值,定义了两个表之间的关系。
  • 联结是一种机制,用来在一条SELECT 语句中关联表,因此称之为联结。使用特殊的语法,可以联结多个表返 回一组输出,联结在运行时关联表中正确的行。
  • 通过相等的值连接两个表的方法为等值联结(equijoin),它基于两个表之间的 相等测试。这种联结也称为内部联结

典型题目

  1. 1251. 平均售价;(先计算然后作为子查询结果获得最终结果)

排名/最值问题

176. 第二高的薪水;

SELECT ifnull((SELECT DISTINCT salary
               FROM Employee
               ORDER BY salary DESC
               LIMIT 1 OFFSET 1), null)
           AS "SecondHighestSalary";

或者:

SELECT MAX(salary) AS 'SecondHighestSalary'
FROM Employee
WHERE salary < (SELECT MAX(salary)
                FROM Employee);

简化处理:

select (select distinct salary
        from Employee
        order by salary desc
        limit 1 offset 1) as 'SecondHighestSalary';

177. 第N高的薪水;

取巧的做法:

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
    DETERMINISTIC
BEGIN
    SET N := N - 1;
    RETURN (
        SELECT DISTINCT salary
        FROM Employee
        ORDER BY salary DESC
        LIMIT 1 OFFSET N);
END;

下面的解法来自: Solution;

一般解法: (子查询, 但是很慢, 相当于$O(N^2)$)

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
    RETURN (SELECT DISTINCT e.salary
            FROM Employee e
            WHERE (SELECT COUNT(DISTINCT salary)
                   FROM Employee
                   WHERE salary > e.salary) = N - 1);
END;

最优解法, 先取前 N 个然后找最小, 同时记录次数便于最后的 null 判断.

来自: Solution;

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
    RETURN (SELECT IF(cnt < N, null, mini)
            FROM (SELECT MIN(salary) mini, COUNT(*) cnt
                  FROM (SELECT DISTINCT salary
                        FROM Employee
                        ORDER BY salary DESC
                        LIMIT N) a) b);
END;

178. 分数排名; (通过计算前面有多少个比自己大的值来实现, 因为是连续排名所以可以这样做)

SELECT a.score,
       (SELECT COUNT(DISTINCT b.score)
        FROM Scores b
        WHERE b.score >= a.score) AS 'rank'
FROM Scores a
ORDER BY score DESC;

如果改成不连续排名呢? 需要用变量

ref;

SELECT score, `rank`
FROM (SELECT score,
             @curRank := IF(@prevRank = score, @curRank, @incRank) AS `rank`,
             @incRank := @incRank + 1,
             @prevRank := score
      FROM Scores p,
           (SELECT @curRank := 0, @prevRank := NULL, @incRank := 1) r
      ORDER BY score DESC) s;

184. 部门工资最高的员工;

SELECT d.name AS 'Department', e.name AS 'Employee', e.salary AS 'Salary'
FROM Employee e
         JOIN Department d ON e.departmentId = d.id
WHERE (e.departmentId, e.salary) IN (SELECT e.departmentId, MAX(e.salary)
                                     FROM Employee e
                                     GROUP BY e.departmentId);

行列互换

这俩都是 easy, 不过第一次做得熟悉套路才能做出来

1795. 每个产品在不同商店的价格;(行变列)

SELECT p.product_id, 'store1' store, p.store1 price
FROM Products p
WHERE p.store1 IS NOT NULL
UNION
SELECT p.product_id, 'store2' store, p.store2 price
FROM Products p
WHERE p.store2 IS NOT NULL
UNION
SELECT p.product_id, 'store3' store, p.store3 price
FROM Products p
WHERE p.store3 IS NOT NULL;

1179. 重新格式化部门表;(列变行)

SELECT id,
       SUM(CASE month WHEN 'Jan' THEN revenue END) AS Jan_Revenue,
       SUM(CASE month WHEN 'Feb' THEN revenue END) AS Feb_Revenue,
       SUM(CASE month WHEN 'Mar' THEN revenue END) AS Mar_Revenue,
       SUM(CASE month WHEN 'Apr' THEN revenue END) AS Apr_Revenue,
       SUM(CASE month WHEN 'May' THEN revenue END) AS May_Revenue,
       SUM(CASE month WHEN 'Jun' THEN revenue END) AS Jun_Revenue,
       SUM(CASE month WHEN 'Jul' THEN revenue END) AS Jul_Revenue,
       SUM(CASE month WHEN 'Aug' THEN revenue END) AS Aug_Revenue,
       SUM(CASE month WHEN 'Sep' THEN revenue END) AS Sep_Revenue,
       SUM(CASE month WHEN 'Oct' THEN revenue END) AS Oct_Revenue,
       SUM(CASE month WHEN 'Nov' THEN revenue END) AS Nov_Revenue,
       SUM(CASE month WHEN 'Dec' THEN revenue END) AS Dec_Revenue
FROM Department
GROUP BY id;

叉积的使用

由没有联结条件的表关系返回的结果为笛卡儿积。检索出的行的数目将是第一个表中的行数乘 以第二个表中的行数。

180. 连续出现的数字;

SELECT DISTINCT l1.num ConsecutiveNums
FROM Logs l1
         CROSS JOIN Logs l2
         CROSS JOIN Logs l3
WHERE l1.num = l2.num
  AND l2.num = l3.num
  AND l1.id = l2.id - 1
  AND l2.id = l3.id - 1;

上面的解法只适用于这个题目, 如果连续出现 N 次呢? 可以用ROW_NUMBER() OVER 函数来操作行号, 实现连续出现任意次的查询 参考;

SELECT DISTINCT Num ConsecutiveNums
FROM (SELECT Id,
             Num,
             ROW_NUMBER() OVER (ORDER BY Id) -
             ROW_NUMBER() OVER (PARTITION BY Num ORDER BY Id) AS tmp
      FROM Logs) Sub
GROUP BY Num, tmp
HAVING COUNT(1) >= 3;

1280. 学生们参加各科测试的次数;(这个题在处理上也需要叉积来做)

SELECT stu.student_id,
       stu.student_name,
       sub.subject_name,
       COUNT(e.subject_name) attended_exams
FROM Subjects sub
         CROSS JOIN Students stu
         LEFT JOIN Examinations e
                   ON stu.student_id = e.student_id
                       AND sub.subject_name = e.subject_name
GROUP BY stu.student_name, stu.student_id, sub.subject_name
ORDER BY stu.student_id, sub.subject_name;

聚集函数的综合运用

550. 游戏玩法分析 IV;

笨办法:

SELECT ROUND(COUNT(a.player_id) /
             (SELECT COUNT(DISTINCT player_id)
              FROM Activity), 2) AS 'fraction'
FROM Activity a
         JOIN (SELECT a.player_id     pid,
                      MIN(event_date) first
               FROM Activity a
               GROUP BY a.player_id) tmp ON tmp.pid = a.player_id
WHERE DATEDIFF(event_date, first) = 1;

使用 AVG 函数:

SELECT ROUND(AVG(a.player_id IS NOT NULL), 2) AS 'fraction'
FROM (SELECT player_id, MIN(event_date) AS login
      FROM Activity
      GROUP BY player_id) tmp
         LEFT JOIN Activity a
                   ON tmp.player_id = a.player_id
                       AND DATEDIFF(a.event_date, tmp.login) = 1;

一些技巧

  1. 题目中说了如果没有某些键值, 则使用 null 填充, 这就说明要使用 outer join. 或者使用IFNULL() 函数.