Window functions were first made available in structured query language (SQL) in the early 2000s and have since garnered a lot of attention and hype. I’d argue the hype behind window functions is well-deserved given the power they afford the user and the ease in which they can be implemented.
With minimal code, sometimes even a one-liner, aggregates can be computed over groups of rows (e.g., 3-day moving average, average salary for each department), ranking systems can be implemented based on the user’s specification (e.g., rank each student’s performance based on GPA), and calculations referencing those values in rows before or after the current row can be made. Let’s begin looking at the syntax allowing window functions to be implemented so easily.
To use a window function, simply provide the window function name, the OVER clause, and the clauses applicable to your scenario within the OVER clause. The OVER clause and the clauses embedded within it are used to specify how the function will operate on the result set. The clauses commonly found within the OVER clause are the PARTITION BY, ORDER BY, and WINDOW clauses. The PARTITION BY, ORDER BY, and WINDOW clauses are optional; however, depending on the analytic function being used, a default WINDOW clause of RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is implied and used by default. Regarding defaults, consult your specific RDBMS documentation to be certain. The number of clauses requiring understanding may seem overwhelming at first, but I promise each of them are quite intuitive. Let’s walk through each of them in turn.
PARTITION BY Clause
The PARTITION BY clause specifies how the individual rows within a result set should be grouped, if applicable. The window function specified operates on each of these groups, separately. To illustrate the idea of partitioning, consider the result set below based on the employees table within the human resource (HR) schema at livesql.oracle.com. The data has been partitioned by the job title attribute with the individual rows containing the same job title value belonging to the same partition.
| employee_id | job_title | salary | partition |
|---|---|---|---|
| 100 | President | 24000 | 1 |
| 101 | Administration Vice President | 17000 | 2 |
| 102 | Administration Vice President | 17000 | 2 |
| 145 | Sales Manager | 14000 | 3 |
| 146 | Sales Manager | 13500 | 3 |
| 147 | Sales Manager | 12000 | 3 |
| 148 | Sales Manager | 11000 | 3 |
| 149 | Sales Manager | 10500 | 3 |
| 120 | Stock Manager | 8000 | 4 |
| 121 | Stock Manager | 8200 | 4 |
| 122 | Stock Manager | 7900 | 4 |
| 123 | Stock Manager | 6500 | 4 |
| 124 | Stock Manager | 5800 | 4 |
As previously mentioned, the specified window function will operate on each of these groups. Using the same data above, suppose there is a need to compute the average salary across those employees’ salaries sharing the same job and compare that value to the employee’s salary. The AVG function can be used as a window function and the result set partitioned by the job title attribute to achieve this. See the code and corresponding result set below.
SELECT
hr.employees.employee_id,
hr.jobs.job_title,
hr.employees.salary,
AVG(hr.employees.salary) OVER(
PARTITION BY hr.jobs.job_title
) AS average_job_salary,
hr.employees.salary
- AVG(hr.employees.salary) OVER(
PARTITION BY hr.jobs.job_title
) AS difference
FROM
hr.employees
INNER JOIN
hr.jobs
ON hr.employees.job_id = hr.jobs.job_id
WHERE
hr.employees.job_id IN (
'ST_MAN', 'SA_MAN',
'AD_VP', 'AD_PRES'
);
| employee_id | job_title | salary | average_job_salary | difference |
|---|---|---|---|---|
| 102 | Administration Vice President | 17000 | 17000 | 0 |
| 101 | Administration Vice President | 17000 | 17000 | 0 |
| 100 | President | 24000 | 24000 | 0 |
| 147 | Sales Manager | 12000 | 12200 | -200 |
| 146 | Sales Manager | 13500 | 12200 | 1300 |
| 145 | Sales Manager | 14000 | 12200 | 1800 |
| 148 | Sales Manager | 11000 | 12200 | -1200 |
| 149 | Sales Manager | 10500 | 12200 | -1700 |
| 120 | Stock Manager | 8000 | 7280 | 720 |
| 121 | Stock Manager | 8200 | 7280 | 920 |
| 122 | Stock Manager | 7900 | 7280 | 620 |
| 123 | Stock Manager | 6500 | 7280 | -780 |
| 124 | Stock Manager | 5800 | 7280 | -1480 |
Notice in the result set above, each row contains an average_job_salary value. This value is the average of those employees’ salaries sharing the same job with respect to the current row’s job title value. For example, the average salary for the five sales managers is $12200. This is the result of incorporating a PARTITION BY clause using the job title attribute. Employees are separated into partitions and the average salary is computed within each partition. This value is then reflected within each row belonging to the corresponding partition. If it helps, when reading PARTITION BY clauses, read them as “for each…do this.” In this case, “for each group of employees sharing the same job title value, compute the average salary of those employees.” The derived attribute, difference, goes the extra mile to make it easier for the consumers of the data by computing the difference between each employee’s salary and the average salary value for those employees with the same job.
To reiterate why this functionality is useful, the window function allowed a row-level difference to be computed using values originating from two different levels of detail, the pre-existing employee salary within each row and the average job salary for those employees with the same job title value as the current row’s job title value. It is also worth mentioning, prior to the existence of window functions, a JOIN to a derived table would likely be the go-to strategy for achieving the same thing.
In the previous example, the PARTITION BY clause contains a single expression (i.e., job_title); however, based on the specific task, the partitions may need to be based on multiple expressions or partitioning may not be required at all.
PARTITION BY Clause Not Specified
In the absence of a PARTITION BY clause, it is implied the window function will be computed using all rows (i.e., a single group). To further reinforce the idea of a window encompassing all rows, consider the task of comparing each employee’s salary to the average salary of all employees. This can be achieved with the single line of code shown in the query below.
SELECT
hr.employees.employee_id,
hr.employees.salary,
AVG(hr.employees.salary) OVER() AS average_salary,
hr.employees.salary - AVG(hr.employees.salary) OVER() AS difference
FROM
hr.employees;
Line 3 computes the average salary across all employees (i.e., rows) and is represented within each row. Line 4 computes the difference between each employee’s salary and the average employee salary returned by the window function. For clarity, see the result set below. Note the average salary and the difference is rounded to two decimal places.
| employee_id | salary | average_salary | difference |
|---|---|---|---|
| 100 | 24000 | 6461.83 | 17538.17 |
| 101 | 17000 | 6461.83 | 10538.17 |
| 102 | 17000 | 6461.83 | 10538.17 |
| 103 | 9000 | 6461.83 | 2538.17 |
| 104 | 6000 | 6461.83 | -461.83 |
| 105 | 4800 | 6461.83 | -1661.83 |
| 106 | 4800 | 6461.83 | -1661.83 |
| 107 | 4200 | 6461.83 | -2261.83 |
| 108 | 12008 | 6461.83 | 5546.17 |
| 109 | 9000 | 6461.83 | 2538.17 |
| 110 | 8200 | 6461.83 | 1738.17 |
| 111 | 7700 | 6461.83 | 1238.17 |
| 112 | 7800 | 6461.83 | 1338.17 |
| 113 | 6900 | 6461.83 | 438.17 |
| 114 | 11000 | 6461.83 | 4538.17 |
| 115 | 3100 | 6461.83 | -3361.83 |
| 116 | 2900 | 6461.83 | -3561.83 |
| 117 | 2800 | 6461.83 | -3661.83 |
| 118 | 2600 | 6461.83 | -3861.83 |
| 119 | 2500 | 6461.83 | -3961.83 |
| 120 | 8000 | 6461.83 | 1538.17 |
| 121 | 8200 | 6461.83 | 1738.17 |
| 122 | 7900 | 6461.83 | 1438.17 |
| 123 | 6500 | 6461.83 | 38.17 |
| 124 | 5800 | 6461.83 | -661.83 |
| 125 | 3200 | 6461.83 | -3261.83 |
| 126 | 2700 | 6461.83 | -3761.83 |
| 127 | 2400 | 6461.83 | -4061.83 |
| 128 | 2200 | 6461.83 | -4261.83 |
| 129 | 3300 | 6461.83 | -3161.83 |
| 130 | 2800 | 6461.83 | -3661.83 |
| 131 | 2500 | 6461.83 | -3961.83 |
| 132 | 2100 | 6461.83 | -4361.83 |
| 133 | 3300 | 6461.83 | -3161.83 |
| 134 | 2900 | 6461.83 | -3561.83 |
| 135 | 2400 | 6461.83 | -4061.83 |
| 136 | 2200 | 6461.83 | -4261.83 |
| 137 | 3600 | 6461.83 | -2861.83 |
| 138 | 3200 | 6461.83 | -3261.83 |
| 139 | 2700 | 6461.83 | -3761.83 |
| 140 | 2500 | 6461.83 | -3961.83 |
| 141 | 3500 | 6461.83 | -2961.83 |
| 142 | 3100 | 6461.83 | -3361.83 |
| 143 | 2600 | 6461.83 | -3861.83 |
| 144 | 2500 | 6461.83 | -3961.83 |
| 145 | 14000 | 6461.83 | 7538.17 |
| 146 | 13500 | 6461.83 | 7038.17 |
| 147 | 12000 | 6461.83 | 5538.17 |
| 148 | 11000 | 6461.83 | 4538.17 |
| 149 | 10500 | 6461.83 | 4038.17 |
| 150 | 10000 | 6461.83 | 3538.17 |
| 151 | 9500 | 6461.83 | 3038.17 |
| 152 | 9000 | 6461.83 | 2538.17 |
| 153 | 8000 | 6461.83 | 1538.17 |
| 154 | 7500 | 6461.83 | 1038.17 |
| 155 | 7000 | 6461.83 | 538.17 |
| 156 | 10000 | 6461.83 | 3538.17 |
| 157 | 9500 | 6461.83 | 3038.17 |
| 158 | 9000 | 6461.83 | 2538.17 |
| 159 | 8000 | 6461.83 | 1538.17 |
| 160 | 7500 | 6461.83 | 1038.17 |
| 161 | 7000 | 6461.83 | 538.17 |
| 162 | 10500 | 6461.83 | 4038.17 |
| 163 | 9500 | 6461.83 | 3038.17 |
| 164 | 7200 | 6461.83 | 738.17 |
| 165 | 6800 | 6461.83 | 338.17 |
| 166 | 6400 | 6461.83 | -61.83 |
| 167 | 6200 | 6461.83 | -261.83 |
| 168 | 11500 | 6461.83 | 5038.17 |
| 169 | 10000 | 6461.83 | 3538.17 |
| 170 | 9600 | 6461.83 | 3138.17 |
| 171 | 7400 | 6461.83 | 938.17 |
| 172 | 7300 | 6461.83 | 838.17 |
| 173 | 6100 | 6461.83 | -361.83 |
| 174 | 11000 | 6461.83 | 4538.17 |
| 175 | 8800 | 6461.83 | 2338.17 |
| 176 | 8600 | 6461.83 | 2138.17 |
| 177 | 8400 | 6461.83 | 1938.17 |
| 178 | 7000 | 6461.83 | 538.17 |
| 179 | 6200 | 6461.83 | -261.83 |
| 180 | 3200 | 6461.83 | -3261.83 |
| 181 | 3100 | 6461.83 | -3361.83 |
| 182 | 2500 | 6461.83 | -3961.83 |
| 183 | 2800 | 6461.83 | -3661.83 |
| 184 | 4200 | 6461.83 | -2261.83 |
| 185 | 4100 | 6461.83 | -2361.83 |
| 186 | 3400 | 6461.83 | -3061.83 |
| 187 | 3000 | 6461.83 | -3461.83 |
| 188 | 3800 | 6461.83 | -2661.83 |
| 189 | 3600 | 6461.83 | -2861.83 |
| 190 | 2900 | 6461.83 | -3561.83 |
| 191 | 2500 | 6461.83 | -3961.83 |
| 192 | 4000 | 6461.83 | -2461.83 |
| 193 | 3900 | 6461.83 | -2561.83 |
| 194 | 3200 | 6461.83 | -3261.83 |
| 195 | 2800 | 6461.83 | -3661.83 |
| 196 | 3100 | 6461.83 | -3361.83 |
| 197 | 3000 | 6461.83 | -3461.83 |
| 198 | 2600 | 6461.83 | -3861.83 |
| 199 | 2600 | 6461.83 | -3861.83 |
| 200 | 4400 | 6461.83 | -2061.83 |
| 201 | 13000 | 6461.83 | 6538.17 |
| 202 | 6000 | 6461.83 | -461.83 |
| 203 | 6500 | 6461.83 | 38.17 |
| 204 | 10000 | 6461.83 | 3538.17 |
| 205 | 12008 | 6461.83 | 5546.17 |
| 206 | 8300 | 6461.83 | 1838.17 |
Like the previous example, there is another way to achieve the same result. A CROSS JOIN could be used to cascade the average employee salary down the entire column, allowing the row-level comparison to be made and difference computed. That said, I find the use of the window function more intuitive when reading code I’ve written in the past and when reviewing the code of others.
PARTITION BY Clause With Multiple Expressions
In many instances, the instructions for partitioning or grouping individual rows will require more than a single expression. Building onto the example above, suppose there is a desire to compute the average salary for each job ID and manager ID combination and have those values represented in the corresponding rows. Achieving this is as easy as specifying so in the PARTITION BY clause by separating the two expressions with a comma. See the code and corresponding result set below.
SELECT
hr.employees.employee_id,
hr.employees.manager_id,
hr.employees.job_id,
hr.employees.salary,
AVG(hr.employees.salary) OVER(
PARTITION BY
hr.employees.manager_id,
hr.employees.job_id
) AS average_job_salary,
hr.employees.salary
- AVG(hr.employees.salary) OVER(
PARTITION BY
hr.employees.manager_id,
hr.employees.job_id
) AS difference
FROM
hr.employees
ORDER BY
hr.employees.manager_id,
hr.employees.job_id,
difference DESC;
| employee_id | manager_id | job_id | salary | average_job_salary | difference |
|---|---|---|---|---|---|
| 102 | 100 | AD_VP | 17000 | 17000 | 0 |
| 101 | 100 | AD_VP | 17000 | 17000 | 0 |
| 201 | 100 | MK_MAN | 13000 | 13000 | 0 |
| 114 | 100 | PU_MAN | 11000 | 11000 | 0 |
| 145 | 100 | SA_MAN | 14000 | 12200 | 1800 |
| 146 | 100 | SA_MAN | 13500 | 12200 | 1300 |
| 147 | 100 | SA_MAN | 12000 | 12200 | -200 |
| 148 | 100 | SA_MAN | 11000 | 12200 | -1200 |
| 149 | 100 | SA_MAN | 10500 | 12200 | -1700 |
| 121 | 100 | ST_MAN | 8200 | 7280 | 920 |
| 120 | 100 | ST_MAN | 8000 | 7280 | 720 |
| 122 | 100 | ST_MAN | 7900 | 7280 | 620 |
| 123 | 100 | ST_MAN | 6500 | 7280 | -780 |
| 124 | 100 | ST_MAN | 5800 | 7280 | -1480 |
| 205 | 101 | AC_MGR | 12008 | 12008 | 0 |
| 200 | 101 | AD_ASST | 4400 | 4400 | 0 |
| 108 | 101 | FI_MGR | 12008 | 12008 | 0 |
| 203 | 101 | HR_REP | 6500 | 6500 | 0 |
| 204 | 101 | PR_REP | 10000 | 10000 | 0 |
| 103 | 102 | IT_PROG | 9000 | 9000 | 0 |
| 104 | 103 | IT_PROG | 6000 | 4950 | 1050 |
| 106 | 103 | IT_PROG | 4800 | 4950 | -150 |
| 105 | 103 | IT_PROG | 4800 | 4950 | -150 |
| 107 | 103 | IT_PROG | 4200 | 4950 | -750 |
| 109 | 108 | FI_ACCOUNT | 9000 | 7920 | 1080 |
| 110 | 108 | FI_ACCOUNT | 8200 | 7920 | 280 |
| 112 | 108 | FI_ACCOUNT | 7800 | 7920 | -120 |
| 111 | 108 | FI_ACCOUNT | 7700 | 7920 | -220 |
| 113 | 108 | FI_ACCOUNT | 6900 | 7920 | -1020 |
| 115 | 114 | PU_CLERK | 3100 | 2780 | 320 |
| 116 | 114 | PU_CLERK | 2900 | 2780 | 120 |
| 117 | 114 | PU_CLERK | 2800 | 2780 | 20 |
| 118 | 114 | PU_CLERK | 2600 | 2780 | -180 |
| 119 | 114 | PU_CLERK | 2500 | 2780 | -280 |
| 180 | 120 | SH_CLERK | 3200 | 2900 | 300 |
| 181 | 120 | SH_CLERK | 3100 | 2900 | 200 |
| 183 | 120 | SH_CLERK | 2800 | 2900 | -100 |
| 182 | 120 | SH_CLERK | 2500 | 2900 | -400 |
| 125 | 120 | ST_CLERK | 3200 | 2625 | 575 |
| 126 | 120 | ST_CLERK | 2700 | 2625 | 75 |
| 127 | 120 | ST_CLERK | 2400 | 2625 | -225 |
| 128 | 120 | ST_CLERK | 2200 | 2625 | -425 |
| 184 | 121 | SH_CLERK | 4200 | 3675 | 525 |
| 185 | 121 | SH_CLERK | 4100 | 3675 | 425 |
| 186 | 121 | SH_CLERK | 3400 | 3675 | -275 |
| 187 | 121 | SH_CLERK | 3000 | 3675 | -675 |
| 129 | 121 | ST_CLERK | 3300 | 2675 | 625 |
| 130 | 121 | ST_CLERK | 2800 | 2675 | 125 |
| 131 | 121 | ST_CLERK | 2500 | 2675 | -175 |
| 132 | 121 | ST_CLERK | 2100 | 2675 | -575 |
| 188 | 122 | SH_CLERK | 3800 | 3200 | 600 |
| 189 | 122 | SH_CLERK | 3600 | 3200 | 400 |
| 190 | 122 | SH_CLERK | 2900 | 3200 | -300 |
| 191 | 122 | SH_CLERK | 2500 | 3200 | -700 |
| 133 | 122 | ST_CLERK | 3300 | 2700 | 600 |
| 134 | 122 | ST_CLERK | 2900 | 2700 | 200 |
| 135 | 122 | ST_CLERK | 2400 | 2700 | -300 |
| 136 | 122 | ST_CLERK | 2200 | 2700 | -500 |
| 192 | 123 | SH_CLERK | 4000 | 3475 | 525 |
| 193 | 123 | SH_CLERK | 3900 | 3475 | 425 |
| 194 | 123 | SH_CLERK | 3200 | 3475 | -275 |
| 195 | 123 | SH_CLERK | 2800 | 3475 | -675 |
| 137 | 123 | ST_CLERK | 3600 | 3000 | 600 |
| 138 | 123 | ST_CLERK | 3200 | 3000 | 200 |
| 139 | 123 | ST_CLERK | 2700 | 3000 | -300 |
| 140 | 123 | ST_CLERK | 2500 | 3000 | -500 |
| 196 | 124 | SH_CLERK | 3100 | 2825 | 275 |
| 197 | 124 | SH_CLERK | 3000 | 2825 | 175 |
| 198 | 124 | SH_CLERK | 2600 | 2825 | -225 |
| 199 | 124 | SH_CLERK | 2600 | 2825 | -225 |
| 141 | 124 | ST_CLERK | 3500 | 2925 | 575 |
| 142 | 124 | ST_CLERK | 3100 | 2925 | 175 |
| 143 | 124 | ST_CLERK | 2600 | 2925 | -325 |
| 144 | 124 | ST_CLERK | 2500 | 2925 | -425 |
| 150 | 145 | SA_REP | 10000 | 8500 | 1500 |
| 151 | 145 | SA_REP | 9500 | 8500 | 1000 |
| 152 | 145 | SA_REP | 9000 | 8500 | 500 |
| 153 | 145 | SA_REP | 8000 | 8500 | -500 |
| 154 | 145 | SA_REP | 7500 | 8500 | -1000 |
| 155 | 145 | SA_REP | 7000 | 8500 | -1500 |
| 156 | 146 | SA_REP | 10000 | 8500 | 1500 |
| 157 | 146 | SA_REP | 9500 | 8500 | 1000 |
| 158 | 146 | SA_REP | 9000 | 8500 | 500 |
| 159 | 146 | SA_REP | 8000 | 8500 | -500 |
| 160 | 146 | SA_REP | 7500 | 8500 | -1000 |
| 161 | 146 | SA_REP | 7000 | 8500 | -1500 |
| 162 | 147 | SA_REP | 10500 | 7766.666666666666666666666666666666666667 | 2733.333333333333333333333333333333333333 |
| 163 | 147 | SA_REP | 9500 | 7766.666666666666666666666666666666666667 | 1733.333333333333333333333333333333333333 |
| 164 | 147 | SA_REP | 7200 | 7766.666666666666666666666666666666666667 | -566.666666666666666666666666666666666667 |
| 165 | 147 | SA_REP | 6800 | 7766.666666666666666666666666666666666667 | -966.666666666666666666666666666666666667 |
| 166 | 147 | SA_REP | 6400 | 7766.666666666666666666666666666666666667 | -1366.666666666666666666666666666666666667 |
| 167 | 147 | SA_REP | 6200 | 7766.666666666666666666666666666666666667 | -1566.666666666666666666666666666666666667 |
| 168 | 148 | SA_REP | 11500 | 8650 | 2850 |
| 169 | 148 | SA_REP | 10000 | 8650 | 1350 |
| 170 | 148 | SA_REP | 9600 | 8650 | 950 |
| 171 | 148 | SA_REP | 7400 | 8650 | -1250 |
| 172 | 148 | SA_REP | 7300 | 8650 | -1350 |
| 173 | 148 | SA_REP | 6100 | 8650 | -2550 |
| 174 | 149 | SA_REP | 11000 | 8333.333333333333333333333333333333333333 | 2666.666666666666666666666666666666666667 |
| 175 | 149 | SA_REP | 8800 | 8333.333333333333333333333333333333333333 | 466.666666666666666666666666666666666667 |
| 176 | 149 | SA_REP | 8600 | 8333.333333333333333333333333333333333333 | 266.666666666666666666666666666666666667 |
| 177 | 149 | SA_REP | 8400 | 8333.333333333333333333333333333333333333 | 66.666666666666666666666666666666666667 |
| 178 | 149 | SA_REP | 7000 | 8333.333333333333333333333333333333333333 | -1333.333333333333333333333333333333333333 |
| 179 | 149 | SA_REP | 6200 | 8333.333333333333333333333333333333333333 | -2133.333333333333333333333333333333333333 |
| 202 | 201 | MK_REP | 6000 | 6000 | 0 |
| 206 | 205 | AC_ACCOUNT | 8300 | 8300 | 0 |
| 100 | – | AD_PRES | 24000 | 24000 | 0 |
Why might it be useful to perform such a computation? To answer this question, recall the first two examples above. In the second example, the average salary was computed across all employees and returned within each row to permit a row-level difference and a comparison to be made between the employee’s salary and the average salary of all employees. Such a comparison is not likely to be very meaningful given the large variances in salary values across jobs. In the first example, the average salary was computed using those employee salary values aligned to each job title (i.e., partition specified). This is likely to yield more meaningful information given a comparison that is closer to an apples-to-apples comparison. It allows consumers of the data to understand how the average salaries look across job titles and where each employee’s salary is relative to the average salary of employees having the same job. The latest query above returns a result set enabling data consumers to easily understand how salary values differ among employees holding the same job and reporting to the same manager. With little effort, data consumers can also explore how salaries for a particular job vary depending on the manager the employee reports to.
Keep in mind, while an expression was added to the PARTITION BY clause in this example, the complexity or number of expressions specified in the clause should be driven by the task at hand. Additional or fewer expressions will in most cases change the output of the window function.
ORDER BY Clause
When used within the OVER clause, the ORDER BY clause is used to specify the sorting of the rows within each partition. In the previous examples, the sort order of the individual rows within each partition was not specified and is unnecessary for those specific tasks; however, sorting is particularly useful for implementing the concept of RANK and computing moving/rolling aggregates. Before exploring a couple of examples, I feel it is important to point out the nuance of the ORDER BY clause when used within the OVER clause. Using ORDER BY within the OVER clause doesn’t guarantee the result set output by the query will be in the order specified. It is simply the explicit instruction on how to sort the rows within each partition. If it is desirable to order the result set, use the ORDER BY clause outside of the window function as seen in the examples above.
With that nuance mentioned, let’s now explore the example tasks.
Ranking
For the first example, consider the employee data below. There is one row per employee and captured within each row is the corresponding employee’s name and salary.
| employee_id | employee_name | salary |
|---|---|---|
| 100 | Steven King | 24000 |
| 101 | Neena Kochhar | 17000 |
| 102 | Lex De Haan | 17000 |
| 103 | Alexander Hunold | 9000 |
| 104 | Bruce Ernst | 6000 |
| 105 | David Austin | 4800 |
| 106 | Valli Pataballa | 4800 |
| 107 | Diana Lorentz | 4200 |
| 108 | Nancy Greenberg | 12008 |
| 109 | Daniel Faviet | 9000 |
| 110 | John Chen | 8200 |
| 111 | Ismael Sciarra | 7700 |
| 112 | Jose Manuel Urman | 7800 |
| 113 | Luis Popp | 6900 |
| 114 | Den Raphaely | 11000 |
| 115 | Alexander Khoo | 3100 |
| 116 | Shelli Baida | 2900 |
| 117 | Sigal Tobias | 2800 |
| 118 | Guy Himuro | 2600 |
| 119 | Karen Colmenares | 2500 |
| 120 | Matthew Weiss | 8000 |
| 121 | Adam Fripp | 8200 |
| 122 | Payam Kaufling | 7900 |
| 123 | Shanta Vollman | 6500 |
| 124 | Kevin Mourgos | 5800 |
| 125 | Julia Nayer | 3200 |
| 126 | Irene Mikkilineni | 2700 |
| 127 | James Landry | 2400 |
| 128 | Steven Markle | 2200 |
| 129 | Laura Bissot | 3300 |
| 130 | Mozhe Atkinson | 2800 |
| 131 | James Marlow | 2500 |
| 132 | TJ Olson | 2100 |
| 133 | Jason Mallin | 3300 |
| 134 | Michael Rogers | 2900 |
| 135 | Ki Gee | 2400 |
| 136 | Hazel Philtanker | 2200 |
| 137 | Renske Ladwig | 3600 |
| 138 | Stephen Stiles | 3200 |
| 139 | John Seo | 2700 |
| 140 | Joshua Patel | 2500 |
| 141 | Trenna Rajs | 3500 |
| 142 | Curtis Davies | 3100 |
| 143 | Randall Matos | 2600 |
| 144 | Peter Vargas | 2500 |
| 145 | John Russell | 14000 |
| 146 | Karen Partners | 13500 |
| 147 | Alberto Errazuriz | 12000 |
| 148 | Gerald Cambrault | 11000 |
| 149 | Eleni Zlotkey | 10500 |
| 150 | Peter Tucker | 10000 |
| 151 | David Bernstein | 9500 |
| 152 | Peter Hall | 9000 |
| 153 | Christopher Olsen | 8000 |
| 154 | Nanette Cambrault | 7500 |
| 155 | Oliver Tuvault | 7000 |
| 156 | Janette King | 10000 |
| 157 | Patrick Sully | 9500 |
| 158 | Allan McEwen | 9000 |
| 159 | Lindsey Smith | 8000 |
| 160 | Louise Doran | 7500 |
| 161 | Sarath Sewall | 7000 |
| 162 | Clara Vishney | 10500 |
| 163 | Danielle Greene | 9500 |
| 164 | Mattea Marvins | 7200 |
| 165 | David Lee | 6800 |
| 166 | Sundar Ande | 6400 |
| 167 | Amit Banda | 6200 |
| 168 | Lisa Ozer | 11500 |
| 169 | Harrison Bloom | 10000 |
| 170 | Tayler Fox | 9600 |
| 171 | William Smith | 7400 |
| 172 | Elizabeth Bates | 7300 |
| 173 | Sundita Kumar | 6100 |
| 174 | Ellen Abel | 11000 |
| 175 | Alyssa Hutton | 8800 |
| 176 | Jonathon Taylor | 8600 |
| 177 | Jack Livingston | 8400 |
| 178 | Kimberely Grant | 7000 |
| 179 | Charles Johnson | 6200 |
| 180 | Winston Taylor | 3200 |
| 181 | Jean Fleaur | 3100 |
| 182 | Martha Sullivan | 2500 |
| 183 | Girard Geoni | 2800 |
| 184 | Nandita Sarchand | 4200 |
| 185 | Alexis Bull | 4100 |
| 186 | Julia Dellinger | 3400 |
| 187 | Anthony Cabrio | 3000 |
| 188 | Kelly Chung | 3800 |
| 189 | Jennifer Dilly | 3600 |
| 190 | Timothy Gates | 2900 |
| 191 | Randall Perkins | 2500 |
| 192 | Sarah Bell | 4000 |
| 193 | Britney Everett | 3900 |
| 194 | Samuel McCain | 3200 |
| 195 | Vance Jones | 2800 |
| 196 | Alana Walsh | 3100 |
| 197 | Kevin Feeney | 3000 |
| 198 | Donald OConnell | 2600 |
| 199 | Douglas Grant | 2600 |
| 200 | Jennifer Whalen | 4400 |
| 201 | Michael Hartstein | 13000 |
| 202 | Pat Fay | 6000 |
| 203 | Susan Mavris | 6500 |
| 204 | Hermann Baer | 10000 |
| 205 | Shelley Higgins | 12008 |
| 206 | William Gietz | 8300 |
Additionally, imagine there is a need to rank the employees above by their individual salary values in descending order (i.e., greatest to least). A subset of window functions often referred to as ranking functions can be used to fulfill this task. See the code and corresponding result set below.
SELECT
hr.employees.employee_id,
(
hr.employees.first_name
|| ' '
|| hr.employees.last_name
) AS employee_name,
hr.employees.salary,
DENSE_RANK() OVER (
ORDER BY hr.employees.salary DESC
) AS salary_rank
FROM
hr.employees
ORDER BY
hr.employees.salary DESC;
| employee_id | employee_name | salary | salary_rank |
|---|---|---|---|
| 100 | Steven King | 24000 | 1 |
| 101 | Neena Kochhar | 17000 | 2 |
| 102 | Lex De Haan | 17000 | 2 |
| 145 | John Russell | 14000 | 3 |
| 146 | Karen Partners | 13500 | 4 |
| 201 | Michael Hartstein | 13000 | 5 |
| 108 | Nancy Greenberg | 12008 | 6 |
| 205 | Shelley Higgins | 12008 | 6 |
| 147 | Alberto Errazuriz | 12000 | 7 |
| 168 | Lisa Ozer | 11500 | 8 |
| 114 | Den Raphaely | 11000 | 9 |
| 148 | Gerald Cambrault | 11000 | 9 |
| 174 | Ellen Abel | 11000 | 9 |
| 149 | Eleni Zlotkey | 10500 | 10 |
| 162 | Clara Vishney | 10500 | 10 |
| 156 | Janette King | 10000 | 11 |
| 150 | Peter Tucker | 10000 | 11 |
| 204 | Hermann Baer | 10000 | 11 |
| 169 | Harrison Bloom | 10000 | 11 |
| 170 | Tayler Fox | 9600 | 12 |
| 163 | Danielle Greene | 9500 | 13 |
| 151 | David Bernstein | 9500 | 13 |
| 157 | Patrick Sully | 9500 | 13 |
| 152 | Peter Hall | 9000 | 14 |
| 103 | Alexander Hunold | 9000 | 14 |
| 158 | Allan McEwen | 9000 | 14 |
| 109 | Daniel Faviet | 9000 | 14 |
| 175 | Alyssa Hutton | 8800 | 15 |
| 176 | Jonathon Taylor | 8600 | 16 |
| 177 | Jack Livingston | 8400 | 17 |
| 206 | William Gietz | 8300 | 18 |
| 121 | Adam Fripp | 8200 | 19 |
| 110 | John Chen | 8200 | 19 |
| 120 | Matthew Weiss | 8000 | 20 |
| 159 | Lindsey Smith | 8000 | 20 |
| 153 | Christopher Olsen | 8000 | 20 |
| 122 | Payam Kaufling | 7900 | 21 |
| 112 | Jose Manuel Urman | 7800 | 22 |
| 111 | Ismael Sciarra | 7700 | 23 |
| 160 | Louise Doran | 7500 | 24 |
| 154 | Nanette Cambrault | 7500 | 24 |
| 171 | William Smith | 7400 | 25 |
| 172 | Elizabeth Bates | 7300 | 26 |
| 164 | Mattea Marvins | 7200 | 27 |
| 155 | Oliver Tuvault | 7000 | 28 |
| 178 | Kimberely Grant | 7000 | 28 |
| 161 | Sarath Sewall | 7000 | 28 |
| 113 | Luis Popp | 6900 | 29 |
| 165 | David Lee | 6800 | 30 |
| 203 | Susan Mavris | 6500 | 31 |
| 123 | Shanta Vollman | 6500 | 31 |
| 166 | Sundar Ande | 6400 | 32 |
| 179 | Charles Johnson | 6200 | 33 |
| 167 | Amit Banda | 6200 | 33 |
| 173 | Sundita Kumar | 6100 | 34 |
| 202 | Pat Fay | 6000 | 35 |
| 104 | Bruce Ernst | 6000 | 35 |
| 124 | Kevin Mourgos | 5800 | 36 |
| 106 | Valli Pataballa | 4800 | 37 |
| 105 | David Austin | 4800 | 37 |
| 200 | Jennifer Whalen | 4400 | 38 |
| 184 | Nandita Sarchand | 4200 | 39 |
| 107 | Diana Lorentz | 4200 | 39 |
| 185 | Alexis Bull | 4100 | 40 |
| 192 | Sarah Bell | 4000 | 41 |
| 193 | Britney Everett | 3900 | 42 |
| 188 | Kelly Chung | 3800 | 43 |
| 189 | Jennifer Dilly | 3600 | 44 |
| 137 | Renske Ladwig | 3600 | 44 |
| 141 | Trenna Rajs | 3500 | 45 |
| 186 | Julia Dellinger | 3400 | 46 |
| 133 | Jason Mallin | 3300 | 47 |
| 129 | Laura Bissot | 3300 | 47 |
| 125 | Julia Nayer | 3200 | 48 |
| 194 | Samuel McCain | 3200 | 48 |
| 180 | Winston Taylor | 3200 | 48 |
| 138 | Stephen Stiles | 3200 | 48 |
| 115 | Alexander Khoo | 3100 | 49 |
| 196 | Alana Walsh | 3100 | 49 |
| 181 | Jean Fleaur | 3100 | 49 |
| 142 | Curtis Davies | 3100 | 49 |
| 197 | Kevin Feeney | 3000 | 50 |
| 187 | Anthony Cabrio | 3000 | 50 |
| 116 | Shelli Baida | 2900 | 51 |
| 190 | Timothy Gates | 2900 | 51 |
| 134 | Michael Rogers | 2900 | 51 |
| 117 | Sigal Tobias | 2800 | 52 |
| 195 | Vance Jones | 2800 | 52 |
| 183 | Girard Geoni | 2800 | 52 |
| 130 | Mozhe Atkinson | 2800 | 52 |
| 139 | John Seo | 2700 | 53 |
| 126 | Irene Mikkilineni | 2700 | 53 |
| 118 | Guy Himuro | 2600 | 54 |
| 199 | Douglas Grant | 2600 | 54 |
| 198 | Donald OConnell | 2600 | 54 |
| 143 | Randall Matos | 2600 | 54 |
| 119 | Karen Colmenares | 2500 | 55 |
| 131 | James Marlow | 2500 | 55 |
| 191 | Randall Perkins | 2500 | 55 |
| 144 | Peter Vargas | 2500 | 55 |
| 182 | Martha Sullivan | 2500 | 55 |
| 140 | Joshua Patel | 2500 | 55 |
| 135 | Ki Gee | 2400 | 56 |
| 127 | James Landry | 2400 | 56 |
| 136 | Hazel Philtanker | 2200 | 57 |
| 128 | Steven Markle | 2200 | 57 |
| 132 | TJ Olson | 2100 | 58 |
When using ranking functions, the ORDER BY clause within the OVER clause is required. As previously mentioned, this clause is used to specify how to order the rows within each partition. In this example, the PARTITION BY clause is absent, indicating the ranking function will be applied across all employees. The highest paid employee will be assigned a salary rank value of 1, the second highest paid employee a salary rank value of 2, and so on. Because the DENSE_RANK function is being used, ties in salary values are resolved by assigning the same integer value to those rows involved in each specific tie and the row with the next highest salary value being assigned the next consecutive integer value. This example can be taken a step further by slightly changing the ranking requirement. Instead of ranking employees by their individual salary values when compared to all other employees, each employee’s rank can be computed by comparing the employee’s salary against the salaries of those employees holding the same job. This can be achieved by simply including the job title attribute within a PARTITION BY clause. As a result of doing so, each partition consists of employees with the same job and the ranking function is applied to each partition, individually. For clarity, see the code and corresponding result set below.
SELECT
hr.employees.employee_id,
(
hr.employees.first_name
|| ' '
|| hr.employees.last_name
) AS employee_name,
hr.jobs.job_title,
hr.employees.salary,
DENSE_RANK() OVER (
PARTITION BY hr.jobs.job_title
ORDER BY hr.employees.salary DESC
) AS salary_rank
FROM
hr.employees
INNER JOIN
hr.jobs
ON hr.employees.job_id = hr.jobs.job_id
ORDER BY
hr.jobs.job_title,
hr.employees.salary DESC;
| employee_id | employee_name | job_title | salary | salary_rank |
|---|---|---|---|---|
| 109 | Daniel Faviet | Accountant | 9000 | 1 |
| 110 | John Chen | Accountant | 8200 | 2 |
| 112 | Jose Manuel Urman | Accountant | 7800 | 3 |
| 111 | Ismael Sciarra | Accountant | 7700 | 4 |
| 113 | Luis Popp | Accountant | 6900 | 5 |
| 205 | Shelley Higgins | Accounting Manager | 12008 | 1 |
| 200 | Jennifer Whalen | Administration Assistant | 4400 | 1 |
| 102 | Lex De Haan | Administration Vice President | 17000 | 1 |
| 101 | Neena Kochhar | Administration Vice President | 17000 | 1 |
| 108 | Nancy Greenberg | Finance Manager | 12008 | 1 |
| 203 | Susan Mavris | Human Resources Representative | 6500 | 1 |
| 201 | Michael Hartstein | Marketing Manager | 13000 | 1 |
| 202 | Pat Fay | Marketing Representative | 6000 | 1 |
| 100 | Steven King | President | 24000 | 1 |
| 103 | Alexander Hunold | Programmer | 9000 | 1 |
| 104 | Bruce Ernst | Programmer | 6000 | 2 |
| 106 | Valli Pataballa | Programmer | 4800 | 3 |
| 105 | David Austin | Programmer | 4800 | 3 |
| 107 | Diana Lorentz | Programmer | 4200 | 4 |
| 206 | William Gietz | Public Accountant | 8300 | 1 |
| 204 | Hermann Baer | Public Relations Representative | 10000 | 1 |
| 115 | Alexander Khoo | Purchasing Clerk | 3100 | 1 |
| 116 | Shelli Baida | Purchasing Clerk | 2900 | 2 |
| 117 | Sigal Tobias | Purchasing Clerk | 2800 | 3 |
| 118 | Guy Himuro | Purchasing Clerk | 2600 | 4 |
| 119 | Karen Colmenares | Purchasing Clerk | 2500 | 5 |
| 114 | Den Raphaely | Purchasing Manager | 11000 | 1 |
| 145 | John Russell | Sales Manager | 14000 | 1 |
| 146 | Karen Partners | Sales Manager | 13500 | 2 |
| 147 | Alberto Errazuriz | Sales Manager | 12000 | 3 |
| 148 | Gerald Cambrault | Sales Manager | 11000 | 4 |
| 149 | Eleni Zlotkey | Sales Manager | 10500 | 5 |
| 168 | Lisa Ozer | Sales Representative | 11500 | 1 |
| 174 | Ellen Abel | Sales Representative | 11000 | 2 |
| 162 | Clara Vishney | Sales Representative | 10500 | 3 |
| 156 | Janette King | Sales Representative | 10000 | 4 |
| 150 | Peter Tucker | Sales Representative | 10000 | 4 |
| 169 | Harrison Bloom | Sales Representative | 10000 | 4 |
| 170 | Tayler Fox | Sales Representative | 9600 | 5 |
| 151 | David Bernstein | Sales Representative | 9500 | 6 |
| 157 | Patrick Sully | Sales Representative | 9500 | 6 |
| 163 | Danielle Greene | Sales Representative | 9500 | 6 |
| 158 | Allan McEwen | Sales Representative | 9000 | 7 |
| 152 | Peter Hall | Sales Representative | 9000 | 7 |
| 175 | Alyssa Hutton | Sales Representative | 8800 | 8 |
| 176 | Jonathon Taylor | Sales Representative | 8600 | 9 |
| 177 | Jack Livingston | Sales Representative | 8400 | 10 |
| 159 | Lindsey Smith | Sales Representative | 8000 | 11 |
| 153 | Christopher Olsen | Sales Representative | 8000 | 11 |
| 154 | Nanette Cambrault | Sales Representative | 7500 | 12 |
| 160 | Louise Doran | Sales Representative | 7500 | 12 |
| 171 | William Smith | Sales Representative | 7400 | 13 |
| 172 | Elizabeth Bates | Sales Representative | 7300 | 14 |
| 164 | Mattea Marvins | Sales Representative | 7200 | 15 |
| 161 | Sarath Sewall | Sales Representative | 7000 | 16 |
| 155 | Oliver Tuvault | Sales Representative | 7000 | 16 |
| 178 | Kimberely Grant | Sales Representative | 7000 | 16 |
| 165 | David Lee | Sales Representative | 6800 | 17 |
| 166 | Sundar Ande | Sales Representative | 6400 | 18 |
| 167 | Amit Banda | Sales Representative | 6200 | 19 |
| 179 | Charles Johnson | Sales Representative | 6200 | 19 |
| 173 | Sundita Kumar | Sales Representative | 6100 | 20 |
| 184 | Nandita Sarchand | Shipping Clerk | 4200 | 1 |
| 185 | Alexis Bull | Shipping Clerk | 4100 | 2 |
| 192 | Sarah Bell | Shipping Clerk | 4000 | 3 |
| 193 | Britney Everett | Shipping Clerk | 3900 | 4 |
| 188 | Kelly Chung | Shipping Clerk | 3800 | 5 |
| 189 | Jennifer Dilly | Shipping Clerk | 3600 | 6 |
| 186 | Julia Dellinger | Shipping Clerk | 3400 | 7 |
| 180 | Winston Taylor | Shipping Clerk | 3200 | 8 |
| 194 | Samuel McCain | Shipping Clerk | 3200 | 8 |
| 181 | Jean Fleaur | Shipping Clerk | 3100 | 9 |
| 196 | Alana Walsh | Shipping Clerk | 3100 | 9 |
| 187 | Anthony Cabrio | Shipping Clerk | 3000 | 10 |
| 197 | Kevin Feeney | Shipping Clerk | 3000 | 10 |
| 190 | Timothy Gates | Shipping Clerk | 2900 | 11 |
| 183 | Girard Geoni | Shipping Clerk | 2800 | 12 |
| 195 | Vance Jones | Shipping Clerk | 2800 | 12 |
| 199 | Douglas Grant | Shipping Clerk | 2600 | 13 |
| 198 | Donald OConnell | Shipping Clerk | 2600 | 13 |
| 182 | Martha Sullivan | Shipping Clerk | 2500 | 14 |
| 191 | Randall Perkins | Shipping Clerk | 2500 | 14 |
| 137 | Renske Ladwig | Stock Clerk | 3600 | 1 |
| 141 | Trenna Rajs | Stock Clerk | 3500 | 2 |
| 129 | Laura Bissot | Stock Clerk | 3300 | 3 |
| 133 | Jason Mallin | Stock Clerk | 3300 | 3 |
| 125 | Julia Nayer | Stock Clerk | 3200 | 4 |
| 138 | Stephen Stiles | Stock Clerk | 3200 | 4 |
| 142 | Curtis Davies | Stock Clerk | 3100 | 5 |
| 134 | Michael Rogers | Stock Clerk | 2900 | 6 |
| 130 | Mozhe Atkinson | Stock Clerk | 2800 | 7 |
| 126 | Irene Mikkilineni | Stock Clerk | 2700 | 8 |
| 139 | John Seo | Stock Clerk | 2700 | 8 |
| 143 | Randall Matos | Stock Clerk | 2600 | 9 |
| 144 | Peter Vargas | Stock Clerk | 2500 | 10 |
| 140 | Joshua Patel | Stock Clerk | 2500 | 10 |
| 131 | James Marlow | Stock Clerk | 2500 | 10 |
| 135 | Ki Gee | Stock Clerk | 2400 | 11 |
| 127 | James Landry | Stock Clerk | 2400 | 11 |
| 136 | Hazel Philtanker | Stock Clerk | 2200 | 12 |
| 128 | Steven Markle | Stock Clerk | 2200 | 12 |
| 132 | TJ Olson | Stock Clerk | 2100 | 13 |
| 121 | Adam Fripp | Stock Manager | 8200 | 1 |
| 120 | Matthew Weiss | Stock Manager | 8000 | 2 |
| 122 | Payam Kaufling | Stock Manager | 7900 | 3 |
| 123 | Shanta Vollman | Stock Manager | 6500 | 4 |
| 124 | Kevin Mourgos | Stock Manager | 5800 | 5 |
For an in-depth discussion on ranking functions and more involved, real-world examples, see my page dedicated to the topic here.
Aggregates with Window Clauses
The next example explores the concepts of computing moving aggregates and user-defined window clauses. Imagine you’ve been asked to compute a 3-day moving average for sales using the data in the table below. The data contains one row per day and the corresponding daily sales value.
| report_date | sales |
|---|---|
| 01-OCT-23 | 1467 |
| 02-OCT-23 | 1192 |
| 03-OCT-23 | 4663 |
| 04-OCT-23 | 4018 |
| 05-OCT-23 | 4500 |
| 06-OCT-23 | 4711 |
| 07-OCT-23 | 3728 |
| 08-OCT-23 | 3598 |
| 09-OCT-23 | 1980 |
| 10-OCT-23 | 2471 |
| 11-OCT-23 | 2441 |
| 12-OCT-23 | 3855 |
| 13-OCT-23 | 4634 |
| 14-OCT-23 | 3083 |
| 15-OCT-23 | 4561 |
| 16-OCT-23 | 4829 |
| 17-OCT-23 | 4413 |
| 18-OCT-23 | 1166 |
| 19-OCT-23 | 4991 |
| 20-OCT-23 | 1063 |
| 21-OCT-23 | 3821 |
| 22-OCT-23 | 4631 |
| 23-OCT-23 | 2914 |
| 24-OCT-23 | 2358 |
| 25-OCT-23 | 3501 |
| 26-OCT-23 | 3594 |
| 27-OCT-23 | 3260 |
| 28-OCT-23 | 1309 |
The code used to generate the fictional data set above, compute and the 3-day moving average values, and the corresponding result set can be found below. Following the result set is a detailed explanation of the solution.
EXEC DBMS_RANDOM.SEED(42);
WITH daily_report (report_date, daily_order_total) AS (
SELECT
TRUNC(SYSDATE) AS report_date,
FLOOR(DBMS_RANDOM.VALUE(1000, 5001)) AS daily_order_total
FROM
dual
UNION ALL
SELECT
daily_report.report_date - 1,
FLOOR(DBMS_RANDOM.VALUE(1000, 5001))
FROM
daily_report
WHERE
daily_report.report_date > TO_DATE('2023-10-01', 'YYYY-MM-DD')
)
SELECT
daily_report.report_date,
daily_report.daily_order_total,
ROUND(
AVG(daily_report.daily_order_total) OVER (
ORDER BY daily_report.report_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
),
2
) AS three_day_moving_average
FROM
daily_report
ORDER BY
daily_report.report_date;
| report_date | sales | three_day_moving_average |
|---|---|---|
| 01-OCT-23 | 1467 | 1467 |
| 02-OCT-23 | 1192 | 1329.5 |
| 03-OCT-23 | 4663 | 2440.67 |
| 04-OCT-23 | 4018 | 3291 |
| 05-OCT-23 | 4500 | 4393.67 |
| 06-OCT-23 | 4711 | 4409.67 |
| 07-OCT-23 | 3728 | 4313 |
| 08-OCT-23 | 3598 | 4012.33 |
| 09-OCT-23 | 1980 | 3102 |
| 10-OCT-23 | 2471 | 2683 |
| 11-OCT-23 | 2441 | 2297.33 |
| 12-OCT-23 | 3855 | 2922.33 |
| 13-OCT-23 | 4634 | 3643.33 |
| 14-OCT-23 | 3083 | 3857.33 |
| 15-OCT-23 | 4561 | 4092.67 |
| 16-OCT-23 | 4829 | 4157.67 |
| 17-OCT-23 | 4413 | 4601 |
| 18-OCT-23 | 1166 | 3469.33 |
| 19-OCT-23 | 4991 | 3523.33 |
| 20-OCT-23 | 1063 | 2406.67 |
| 21-OCT-23 | 3821 | 3291.67 |
| 22-OCT-23 | 4631 | 3171.67 |
| 23-OCT-23 | 2914 | 3788.67 |
| 24-OCT-23 | 2358 | 3301 |
| 25-OCT-23 | 3501 | 2924.33 |
| 26-OCT-23 | 3594 | 3151 |
| 27-OCT-23 | 3260 | 3451.67 |
| 28-OCT-23 | 1309 | 2721 |
In the code above, a new clause is introduced within the OVER clause often referred to as the window or window clause. This clause allows a user to specify the row(s) or value(s) to be used when calculating the current row’s value. Using the code and corresponding result set above, the 3-day moving average for the first row is the row’s sales value (1467), the value for the second row is the average of the first and second rows’ values ((1467 + 1192)/2), and the value for the third row is the average of the two rows prior and the current row ((1467 + 1192 + 4663)/3). To compute the 3-day moving average value for the fourth row (04-OCT-23), the window must slide down one row. The computation will then consist of computing the average sales value of the second, third, and fourth rows’ sales values. For each subsequent row, the window slides down one row and uses the sales value in the current row and those values within the two rows prior to the current row for the computation. Perhaps not immediately obvious is the importance of the ORDER BY clause within the OVER clause in this example. It plays a crucial role in ensuring the rows are in chronological order and consequently fall into the correct window or 3-day period when performing each row’s computation.
Window Clause Options
Similar to the real-world task incorporated into the example above where each window is comprised of the current row and the two rows preceding the current row, the task of computing a 5-day moving average requires the window to be defined as ROWS BETWEEN 5 PRECEDING AND CURRENT ROW. What is more, the window may even be comprised of rows beyond the current row. In such cases, the FOLLOWING keyword can be used to specify the number of rows following the current row to use in the current row’s computation. The window specification is task dependent! Below is a list of keywords and a brief explanation of how they can be used.
ROWS
Is used to specify a fixed or unbounded number of rows used in each window relative to the current row.
RANGE
Is used to specify the values used in each window relative to the current row’s value. For example, suppose you have data on the price of a stock over time and wish to compute the average stock price for the current row’s stock date value and those stock price values in those rows with stock date values within a range of one day preceding and one day following. In such a case, the window specification could be RANGE BETWEEN INTERVAL ‘1’ DAY PRECEDING AND INTERVAL ‘1’ DAY FOLLOWING.
BETWEEN
Is used to specify the first and last row to be used in each window, if applicable. Some tasks may not require two endpoints, or the endpoint may be implied when not expressed explicitly.
PRECEDING
Is used to specify the number of rows prior to the current row to be used in each row’s computation. In some cases, a task will require each window to be comprised of the current row and all rows prior to the current row. For example, a rolling/cumulative sum will have a window specification of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
FOLLOWING
Is used to specify the number of rows after the current row to be used in each row’s computation. Like the PRECEDING keyword, some tasks will require each window to be comprised of the current row and all rows after the current row. A task requiring the current row, the row preceding the current row, and the row following the current row will have a window specification of ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING. The PARTITION BY Clause Not Specified section above walked through the task of capturing the average salary of all employees within each row. In that example, the OVER clause is not populated and the window of ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is implied.
For additional discussion and more real-world examples covering aggregates with window clause specifications, see my page dedicated to the topic here.
Closing Thoughts
Evidenced by the handful of real-world problems covered, it’s hard to understate the power of window functions. Minimal effort in terms of lines of code and overall SQL knowledge is required to solve each problem. Keep in mind, these are real, everyday problems and you will likely have to solve some flavor or variation of these problems at some point. When solving problems requiring window functions, remember to let the task drive the choice of window function and its corresponding arguments by first understanding what you are trying to accomplish. Additionally, do not fall into the trap of believing every problem is best solved by window functions.
Additional Information
In the final example, a recursive common table expression (CTE) was used to generate report dates and their corresponding daily order total values. If you’re unfamiliar with the concepts of recursion and/or common table expressions, consider reading my recursion and common table expression pages to learn how to incorporate these concepts into real-world coding solutions.
Knowledge Check
If you’d like to confirm your understanding on the use of window functions and other fundamental concepts, consider solving a handful of my practice problems here.