Sule-Balogun Olanrewaju
Writing SQL with multiple conditions can be an arduous task, especially if you need to make numerous checks.
For example, an if () else if () else <> check case expression handles all SQL conditionals. If the first condition is satisfied, the query stops executing with a return value. The value specified within the else is returned if no condition is satisfied.
In this article, we'll cover:
In programming when you have a given set of conditions, you end up using conditionals ( switch or if else ) to know which block of code to execute when a condition is met.
With SQL, you can do this using the CASE statement. You use the CASE keyword together with the WHEN clause to execute a block of conditional statement code. You use a THEN statement to return the result of the expression. If none of the conditions are met, then you use a final ELSE clause to return a fallback result.
The SQL CASE statement has the following syntax:
CASE WHEN conditional_statement1 THEN result1 . . . . WHEN condition_statementN THEN resultN ELSE result END;
When you use the CASE statement, it has to be followed by a WHEN and THEN the result if the first condition is met. If the first condition is not met it keeps on checking the other conditions until the nth (or final) condition. If that is still not met then the ELSE condition gets executed.
Also, the ELSE part is optional when using the CASE statement. In scenarios where you don't use it, the query result returns NULL.
In this section, we will take a case study of a real-life scenario to help you learn how to solve a SQL challenge that uses the CASE statement.
The challenge is one I encountered on Coderbyte, a platform for practicing coding challenges. It was a bit tough to crack, and I will break down the step-by-step process involved in this article.
The challenge involves coming up with a SQL query to return the employee with the third-highest salary from a table.
You'll need to structure a query to find this employee and return that row. You also have to replace the position of the DivisionID column with the corresponding DivisionName from the table company_divisions. Then you'll need to replace the ManagerID column with the ManagerName if the ID exists in the table and is not NULL.
In this challenge, we need the CASE statement to help achieve the followings:
Here's the expected output:
ID | Name | DivisionName | ManagerName | Salary |
---|---|---|---|---|
222 | Mark Red | Sales | Susan Wall | 86000 |
And here's the data you'll need to solve this challenge:
ID | Name | DivisionID | ManagerID | Salary |
---|---|---|---|---|
356 | Daniel smith | 100 | 133 | 40000 |
122 | Arnold Sully | 101 | null | 60000 |
467 | Lisa Roberts | 100 | null | 80000 |
112 | Mary Dial | 105 | 467 | 65000 |
775 | Dennis Front | 103 | null | 90000 |
111 | Larry Weis | 104 | 35534 | 75000 |
222 | Mark Red | 102 | 133 | 86000 |
577 | Robert Niger | 105 | 12353 | 76000 |
133 | Susan Wall | 105 | 577 | 110000 |
ID | DivisionName |
---|---|
100 | Accounting |
101 | IT |
102 | Sales |
103 | Marketing |
104 | Engineering |
105 | Customer Support |
In this section, we will look at the step-by-step process involved in solving the challenge.
First, you'll need to structure a query to return the third-highest salary. You'll do this by selecting from the company table and ordering by salary (since we're interested in the record with the third-highest salary).
You can do that like this:
SELECT * FROM company ORDER BY salary DESC limit 1 offset 2;
The query returns the employee's row with the third highest salary, as expected.
ID | Name | DivisionID | ManagerID | Salary |
---|---|---|---|---|
222 | Mark Red | 102 | 133 | 86000 |
So what's going on in this query?
SELECT : you use the SELECT command with the asterisk (), also known as a wildcard) to retrieve all columns from the *company table.
ORDER BY : The ORDER BY command orders column(s) in ascending or descending order. SQL orders by ascending (ASC) by default, but we will order the salary column by descending (DESC). This is because we need the desc salary from the highest to the lowest, that is 110,000 - 40,000.
limit : The limit command limits the number of records returned based on the limit's set value. Since we are only interested in just one row, we will set the limit in the query to 1. This ensures that we will get a return value of a single record every time this query gets executed.
offset : Using the offset clause here helps you specify the number of rows to skip before the start of actually returning the row from the query. Offset lets us skip the two highest-paid rows (Susan Wall and Dennis Front) and return the third highest-paid (Mark Red).
Now, you need to modify the query by selecting only the columns you need – ID, Name, ManagerID, DivisionName, and Salary. Then you need to replace the DivisionID column with the corresponding DivisionName from the table company_divisions.
You can do that like this:
SELECT c.ID, c.Name, c.ManagerID, c.salary, cd.DivisionName FROM company as c LEFT JOIN company_divisions as cd ON c.DivisionId = cd.id ORDER BY salary DESC limit 1 offset 2;
Here's the output:
ID | Name | DivisionName | ManagerID | Salary |
---|---|---|---|---|
222 | Mark Red | Sales | 133 | 86000 |
Let's discuss what's going on in the above query:
LEFT JOIN : Since records are returned from the left side (company), we will match them using the LEFT JOIN on the right side (company_divisions) using the company_division.id and company.DivisionID .
If a matching record is found, that is the company's id is also present in company division, then the DivisionName column is populated with the actual value from the left join, in our case (Sales). If there is no record, nothing is returned.
as (alias): The alias used is a temporary name for the table. So rather than company.name with an alias for the company as c, we can define it as c.name. Using aliases helps improve readability.
We will build on the result of the query from Step 2. We'll use the CASE statement we learned to add conditionals for when the ManagerId is not null and to check if the ManagerId also exists.
The first thing we need to do is check if the company.ManagerID is not null and make sure that the ID exist in the table. We will apply the CASE statement here.
CASE WHEN c.ManagerID IS NOT NULL AND c.ManagerID = c.ID
The second part of the CASE statement is to replace the ManagerID column with the ManagerName. Then we'll need to use the THEN block we learnt earlier like this:
CASE WHEN c.ManagerID IS NOT NULL AND c.ManagerID = c.ID THEN Name ELSE 'Susan Wall' END AS 'ManagerName'
Finally, we can now include the CASE block into the already existing code snippet we had from STEP 2. This will look somewhat like this now:
SELECT c.ID, c.Name, c.salary, cd.DivisionName CASE WHEN c.ManagerID IS NOT NULL AND c.ManagerID = c.ID THEN Name ELSE 'Susan Wall' END AS 'ManagerName' FROM company as c LEFT JOIN company_divisions as cd ON c.DivisionId = cd.id ORDER BY salary DESC limit 1 offset 2;
The result of Step 3 is the expected output – the employee with third-highest salary.
ID | Name | DivisionName | ManagerName | Salary |
---|---|---|---|---|
222 | Mark Red | Sales | Susan Wall | 86000 |
In this article, I hope you learned about the CASE statement in SQL and how to approach a real-world problem using CASE.
You also learned other SQL commands such as SELECT, ORDER BY, LIMIT, OFFSET, LEFT JOIN, and ALIAS.