The GROUP BY clause groups rows based on one or more shared attribute values. It allows us to represent rows with identical values as a single row and is the clause largely relied upon for aggregation. Within a SELECT statement (i.e., query) the GROUP BY clause is typically accompanied by an aggregate function in the SELECT clause. Using a GROUP BY clause in conjunction with an aggregate function allows us to compute sums, minimums, maximums, and averages among other things across groups of rows. It is a cornerstone of generating data for reporting purposes.
Review the example data shown below and then let us explore the use of the GROUP BY clause to solve real-world problems.
Example Data
The schemas, corresponding tables, and data used in the example problems can be found at livesql.oracle.com. To more easily follow the examples covered on this page, consider first reading the table descriptions below and reviewing each table’s contents. To review the data maintained in each table, click on the icon next to the table name. To hide the data, click the icon again.
hr.employees
Contains information related to each employee and a hierarchal employee-manager relationship. There is one row per employee.
| employee_id | first_name | last_name | phone_number | hire_date | job_id | salary | commission_pct | manager_id | department_id | |
|---|---|---|---|---|---|---|---|---|---|---|
| 100 | Steven | King | SKING | 515.123.4567 | 17-Jun-03 | AD_PRES | 24000 | – | – | 90 |
| 101 | Neena | Kochhar | NKOCHHAR | 515.123.4568 | 21-Sep-05 | AD_VP | 17000 | – | 100 | 90 |
| 102 | Lex | De Haan | LDEHAAN | 515.123.4569 | 13-Jan-01 | AD_VP | 17000 | – | 100 | 90 |
| 103 | Alexander | Hunold | AHUNOLD | 590.423.4567 | 3-Jan-06 | IT_PROG | 9000 | – | 102 | 60 |
| 104 | Bruce | Ernst | BERNST | 590.423.4568 | 21-May-07 | IT_PROG | 6000 | – | 103 | 60 |
| 105 | David | Austin | DAUSTIN | 590.423.4569 | 25-Jun-05 | IT_PROG | 4800 | – | 103 | 60 |
| 106 | Valli | Pataballa | VPATABAL | 590.423.4560 | 5-Feb-06 | IT_PROG | 4800 | – | 103 | 60 |
| 107 | Diana | Lorentz | DLORENTZ | 590.423.5567 | 7-Feb-07 | IT_PROG | 4200 | – | 103 | 60 |
| 108 | Nancy | Greenberg | NGREENBE | 515.124.4569 | 17-Aug-02 | FI_MGR | 12008 | – | 101 | 100 |
| 109 | Daniel | Faviet | DFAVIET | 515.124.4169 | 16-Aug-02 | FI_ACCOUNT | 9000 | – | 108 | 100 |
| 110 | John | Chen | JCHEN | 515.124.4269 | 28-Sep-05 | FI_ACCOUNT | 8200 | – | 108 | 100 |
| 111 | Ismael | Sciarra | ISCIARRA | 515.124.4369 | 30-Sep-05 | FI_ACCOUNT | 7700 | – | 108 | 100 |
| 112 | Jose Manuel | Urman | JMURMAN | 515.124.4469 | 7-Mar-06 | FI_ACCOUNT | 7800 | – | 108 | 100 |
| 113 | Luis | Popp | LPOPP | 515.124.4567 | 7-Dec-07 | FI_ACCOUNT | 6900 | – | 108 | 100 |
| 114 | Den | Raphaely | DRAPHEAL | 515.127.4561 | 7-Dec-02 | PU_MAN | 11000 | – | 100 | 30 |
| 115 | Alexander | Khoo | AKHOO | 515.127.4562 | 18-May-03 | PU_CLERK | 3100 | – | 114 | 30 |
| 116 | Shelli | Baida | SBAIDA | 515.127.4563 | 24-Dec-05 | PU_CLERK | 2900 | – | 114 | 30 |
| 117 | Sigal | Tobias | STOBIAS | 515.127.4564 | 24-Jul-05 | PU_CLERK | 2800 | – | 114 | 30 |
| 118 | Guy | Himuro | GHIMURO | 515.127.4565 | 15-Nov-06 | PU_CLERK | 2600 | – | 114 | 30 |
| 119 | Karen | Colmenares | KCOLMENA | 515.127.4566 | 10-Aug-07 | PU_CLERK | 2500 | – | 114 | 30 |
| 120 | Matthew | Weiss | MWEISS | 650.123.1234 | 18-Jul-04 | ST_MAN | 8000 | – | 100 | 50 |
| 121 | Adam | Fripp | AFRIPP | 650.123.2234 | 10-Apr-05 | ST_MAN | 8200 | – | 100 | 50 |
| 122 | Payam | Kaufling | PKAUFLIN | 650.123.3234 | 1-May-03 | ST_MAN | 7900 | – | 100 | 50 |
| 123 | Shanta | Vollman | SVOLLMAN | 650.123.4234 | 10-Oct-05 | ST_MAN | 6500 | – | 100 | 50 |
| 124 | Kevin | Mourgos | KMOURGOS | 650.123.5234 | 16-Nov-07 | ST_MAN | 5800 | – | 100 | 50 |
| 125 | Julia | Nayer | JNAYER | 650.124.1214 | 16-Jul-05 | ST_CLERK | 3200 | – | 120 | 50 |
| 126 | Irene | Mikkilineni | IMIKKILI | 650.124.1224 | 28-Sep-06 | ST_CLERK | 2700 | – | 120 | 50 |
| 127 | James | Landry | JLANDRY | 650.124.1334 | 14-Jan-07 | ST_CLERK | 2400 | – | 120 | 50 |
| 128 | Steven | Markle | SMARKLE | 650.124.1434 | 8-Mar-08 | ST_CLERK | 2200 | – | 120 | 50 |
| 129 | Laura | Bissot | LBISSOT | 650.124.5234 | 20-Aug-05 | ST_CLERK | 3300 | – | 121 | 50 |
| 130 | Mozhe | Atkinson | MATKINSO | 650.124.6234 | 30-Oct-05 | ST_CLERK | 2800 | – | 121 | 50 |
| 131 | James | Marlow | JAMRLOW | 650.124.7234 | 16-Feb-05 | ST_CLERK | 2500 | – | 121 | 50 |
| 132 | TJ | Olson | TJOLSON | 650.124.8234 | 10-Apr-07 | ST_CLERK | 2100 | – | 121 | 50 |
| 133 | Jason | Mallin | JMALLIN | 650.127.1934 | 14-Jun-04 | ST_CLERK | 3300 | – | 122 | 50 |
| 134 | Michael | Rogers | MROGERS | 650.127.1834 | 26-Aug-06 | ST_CLERK | 2900 | – | 122 | 50 |
| 135 | Ki | Gee | KGEE | 650.127.1734 | 12-Dec-07 | ST_CLERK | 2400 | – | 122 | 50 |
| 136 | Hazel | Philtanker | HPHILTAN | 650.127.1634 | 6-Feb-08 | ST_CLERK | 2200 | – | 122 | 50 |
| 137 | Renske | Ladwig | RLADWIG | 650.121.1234 | 14-Jul-03 | ST_CLERK | 3600 | – | 123 | 50 |
| 138 | Stephen | Stiles | SSTILES | 650.121.2034 | 26-Oct-05 | ST_CLERK | 3200 | – | 123 | 50 |
| 139 | John | Seo | JSEO | 650.121.2019 | 12-Feb-06 | ST_CLERK | 2700 | – | 123 | 50 |
| 140 | Joshua | Patel | JPATEL | 650.121.1834 | 6-Apr-06 | ST_CLERK | 2500 | – | 123 | 50 |
| 141 | Trenna | Rajs | TRAJS | 650.121.8009 | 17-Oct-03 | ST_CLERK | 3500 | – | 124 | 50 |
| 142 | Curtis | Davies | CDAVIES | 650.121.2994 | 29-Jan-05 | ST_CLERK | 3100 | – | 124 | 50 |
| 143 | Randall | Matos | RMATOS | 650.121.2874 | 15-Mar-06 | ST_CLERK | 2600 | – | 124 | 50 |
| 144 | Peter | Vargas | PVARGAS | 650.121.2004 | 9-Jul-06 | ST_CLERK | 2500 | – | 124 | 50 |
| 145 | John | Russell | JRUSSEL | 011.44.1344.429268 | 1-Oct-04 | SA_MAN | 14000 | 0.4 | 100 | 80 |
| 146 | Karen | Partners | KPARTNER | 011.44.1344.467268 | 5-Jan-05 | SA_MAN | 13500 | 0.3 | 100 | 80 |
| 147 | Alberto | Errazuriz | AERRAZUR | 011.44.1344.429278 | 10-Mar-05 | SA_MAN | 12000 | 0.3 | 100 | 80 |
| 148 | Gerald | Cambrault | GCAMBRAU | 011.44.1344.619268 | 15-Oct-07 | SA_MAN | 11000 | 0.3 | 100 | 80 |
| 149 | Eleni | Zlotkey | EZLOTKEY | 011.44.1344.429018 | 29-Jan-08 | SA_MAN | 10500 | 0.2 | 100 | 80 |
| 150 | Peter | Tucker | PTUCKER | 011.44.1344.129268 | 30-Jan-05 | SA_REP | 10000 | 0.3 | 145 | 80 |
| 151 | David | Bernstein | DBERNSTE | 011.44.1344.345268 | 24-Mar-05 | SA_REP | 9500 | 0.25 | 145 | 80 |
| 152 | Peter | Hall | PHALL | 011.44.1344.478968 | 20-Aug-05 | SA_REP | 9000 | 0.25 | 145 | 80 |
| 153 | Christopher | Olsen | COLSEN | 011.44.1344.498718 | 30-Mar-06 | SA_REP | 8000 | 0.2 | 145 | 80 |
| 154 | Nanette | Cambrault | NCAMBRAU | 011.44.1344.987668 | 9-Dec-06 | SA_REP | 7500 | 0.2 | 145 | 80 |
| 155 | Oliver | Tuvault | OTUVAULT | 011.44.1344.486508 | 23-Nov-07 | SA_REP | 7000 | 0.15 | 145 | 80 |
| 156 | Janette | King | JKING | 011.44.1345.429268 | 30-Jan-04 | SA_REP | 10000 | 0.35 | 146 | 80 |
| 157 | Patrick | Sully | PSULLY | 011.44.1345.929268 | 4-Mar-04 | SA_REP | 9500 | 0.35 | 146 | 80 |
| 158 | Allan | McEwen | AMCEWEN | 011.44.1345.829268 | 1-Aug-04 | SA_REP | 9000 | 0.35 | 146 | 80 |
| 159 | Lindsey | Smith | LSMITH | 011.44.1345.729268 | 10-Mar-05 | SA_REP | 8000 | 0.3 | 146 | 80 |
| 160 | Louise | Doran | LDORAN | 011.44.1345.629268 | 15-Dec-05 | SA_REP | 7500 | 0.3 | 146 | 80 |
| 161 | Sarath | Sewall | SSEWALL | 011.44.1345.529268 | 3-Nov-06 | SA_REP | 7000 | 0.25 | 146 | 80 |
| 162 | Clara | Vishney | CVISHNEY | 011.44.1346.129268 | 11-Nov-05 | SA_REP | 10500 | 0.25 | 147 | 80 |
| 163 | Danielle | Greene | DGREENE | 011.44.1346.229268 | 19-Mar-07 | SA_REP | 9500 | 0.15 | 147 | 80 |
| 164 | Mattea | Marvins | MMARVINS | 011.44.1346.329268 | 24-Jan-08 | SA_REP | 7200 | 0.1 | 147 | 80 |
| 165 | David | Lee | DLEE | 011.44.1346.529268 | 23-Feb-08 | SA_REP | 6800 | 0.1 | 147 | 80 |
| 166 | Sundar | Ande | SANDE | 011.44.1346.629268 | 24-Mar-08 | SA_REP | 6400 | 0.1 | 147 | 80 |
| 167 | Amit | Banda | ABANDA | 011.44.1346.729268 | 21-Apr-08 | SA_REP | 6200 | 0.1 | 147 | 80 |
| 168 | Lisa | Ozer | LOZER | 011.44.1343.929268 | 11-Mar-05 | SA_REP | 11500 | 0.25 | 148 | 80 |
| 169 | Harrison | Bloom | HBLOOM | 011.44.1343.829268 | 23-Mar-06 | SA_REP | 10000 | 0.2 | 148 | 80 |
| 170 | Tayler | Fox | TFOX | 011.44.1343.729268 | 24-Jan-06 | SA_REP | 9600 | 0.2 | 148 | 80 |
| 171 | William | Smith | WSMITH | 011.44.1343.629268 | 23-Feb-07 | SA_REP | 7400 | 0.15 | 148 | 80 |
| 172 | Elizabeth | Bates | EBATES | 011.44.1343.529268 | 24-Mar-07 | SA_REP | 7300 | 0.15 | 148 | 80 |
| 173 | Sundita | Kumar | SKUMAR | 011.44.1343.329268 | 21-Apr-08 | SA_REP | 6100 | 0.1 | 148 | 80 |
| 174 | Ellen | Abel | EABEL | 011.44.1644.429267 | 11-May-04 | SA_REP | 11000 | 0.3 | 149 | 80 |
| 175 | Alyssa | Hutton | AHUTTON | 011.44.1644.429266 | 19-Mar-05 | SA_REP | 8800 | 0.25 | 149 | 80 |
| 176 | Jonathon | Taylor | JTAYLOR | 011.44.1644.429265 | 24-Mar-06 | SA_REP | 8600 | 0.2 | 149 | 80 |
| 177 | Jack | Livingston | JLIVINGS | 011.44.1644.429264 | 23-Apr-06 | SA_REP | 8400 | 0.2 | 149 | 80 |
| 178 | Kimberely | Grant | KGRANT | 011.44.1644.429263 | 24-May-07 | SA_REP | 7000 | 0.15 | 149 | – |
| 179 | Charles | Johnson | CJOHNSON | 011.44.1644.429262 | 4-Jan-08 | SA_REP | 6200 | 0.1 | 149 | 80 |
| 180 | Winston | Taylor | WTAYLOR | 650.507.9876 | 24-Jan-06 | SH_CLERK | 3200 | – | 120 | 50 |
| 181 | Jean | Fleaur | JFLEAUR | 650.507.9877 | 23-Feb-06 | SH_CLERK | 3100 | – | 120 | 50 |
| 182 | Martha | Sullivan | MSULLIVA | 650.507.9878 | 21-Jun-07 | SH_CLERK | 2500 | – | 120 | 50 |
| 183 | Girard | Geoni | GGEONI | 650.507.9879 | 3-Feb-08 | SH_CLERK | 2800 | – | 120 | 50 |
| 184 | Nandita | Sarchand | NSARCHAN | 650.509.1876 | 27-Jan-04 | SH_CLERK | 4200 | – | 121 | 50 |
| 185 | Alexis | Bull | ABULL | 650.509.2876 | 20-Feb-05 | SH_CLERK | 4100 | – | 121 | 50 |
| 186 | Julia | Dellinger | JDELLING | 650.509.3876 | 24-Jun-06 | SH_CLERK | 3400 | – | 121 | 50 |
| 187 | Anthony | Cabrio | ACABRIO | 650.509.4876 | 7-Feb-07 | SH_CLERK | 3000 | – | 121 | 50 |
| 188 | Kelly | Chung | KCHUNG | 650.505.1876 | 14-Jun-05 | SH_CLERK | 3800 | – | 122 | 50 |
| 189 | Jennifer | Dilly | JDILLY | 650.505.2876 | 13-Aug-05 | SH_CLERK | 3600 | – | 122 | 50 |
| 190 | Timothy | Gates | TGATES | 650.505.3876 | 11-Jul-06 | SH_CLERK | 2900 | – | 122 | 50 |
| 191 | Randall | Perkins | RPERKINS | 650.505.4876 | 19-Dec-07 | SH_CLERK | 2500 | – | 122 | 50 |
| 192 | Sarah | Bell | SBELL | 650.501.1876 | 4-Feb-04 | SH_CLERK | 4000 | – | 123 | 50 |
| 193 | Britney | Everett | BEVERETT | 650.501.2876 | 3-Mar-05 | SH_CLERK | 3900 | – | 123 | 50 |
| 194 | Samuel | McCain | SMCCAIN | 650.501.3876 | 1-Jul-06 | SH_CLERK | 3200 | – | 123 | 50 |
| 195 | Vance | Jones | VJONES | 650.501.4876 | 17-Mar-07 | SH_CLERK | 2800 | – | 123 | 50 |
| 196 | Alana | Walsh | AWALSH | 650.507.9811 | 24-Apr-06 | SH_CLERK | 3100 | – | 124 | 50 |
| 197 | Kevin | Feeney | KFEENEY | 650.507.9822 | 23-May-06 | SH_CLERK | 3000 | – | 124 | 50 |
| 198 | Donald | OConnell | DOCONNEL | 650.507.9833 | 21-Jun-07 | SH_CLERK | 2600 | – | 124 | 50 |
| 199 | Douglas | Grant | DGRANT | 650.507.9844 | 13-Jan-08 | SH_CLERK | 2600 | – | 124 | 50 |
| 200 | Jennifer | Whalen | JWHALEN | 515.123.4444 | 17-Sep-03 | AD_ASST | 4400 | – | 101 | 10 |
| 201 | Michael | Hartstein | MHARTSTE | 515.123.5555 | 17-Feb-04 | MK_MAN | 13000 | – | 100 | 20 |
| 202 | Pat | Fay | PFAY | 603.123.6666 | 17-Aug-05 | MK_REP | 6000 | – | 201 | 20 |
| 203 | Susan | Mavris | SMAVRIS | 515.123.7777 | 7-Jun-02 | HR_REP | 6500 | – | 101 | 40 |
| 204 | Hermann | Baer | HBAER | 515.123.8888 | 7-Jun-02 | PR_REP | 10000 | – | 101 | 70 |
| 205 | Shelley | Higgins | SHIGGINS | 515.123.8080 | 7-Jun-02 | AC_MGR | 12008 | – | 101 | 110 |
| 206 | William | Gietz | WGIETZ | 515.123.8181 | 7-Jun-02 | AC_ACCOUNT | 8300 | – | 205 | 110 |
oe.orders
For each order placed by a customer, various attributes are captured. There is one row per order.
| order_id | order_date | order_mode | customer_id | order_status | order_total | sales_rep_id | promotion_id |
|---|---|---|---|---|---|---|---|
| 2458 | 16-AUG-07 03.34.12.234359 PM | direct | 101 | 0 | 78279.6 | 153 | – |
| 2397 | 19-NOV-07 02.41.54.696211 PM | direct | 102 | 1 | 42283.2 | 154 | – |
| 2454 | 02-OCT-07 05.49.34.678340 PM | direct | 103 | 1 | 6653.4 | 154 | – |
| 2354 | 14-JUL-08 06.18.23.234567 PM | direct | 104 | 0 | 46257 | 155 | – |
| 2358 | 08-JAN-08 05.03.12.654278 PM | direct | 105 | 2 | 7826 | 155 | – |
| 2381 | 14-MAY-08 08.59.08.843679 PM | direct | 106 | 3 | 23034.6 | 156 | – |
| 2440 | 31-AUG-07 09.53.06.008765 PM | direct | 107 | 3 | 70576.9 | 156 | – |
| 2357 | 08-JAN-06 08.19.44.123456 PM | direct | 108 | 5 | 59872.4 | 158 | – |
| 2394 | 10-FEB-08 09.22.35.564789 PM | direct | 109 | 5 | 21863 | 158 | – |
| 2435 | 02-SEP-07 11.22.53.134567 PM | direct | 144 | 6 | 62303 | 159 | – |
| 2455 | 20-SEP-07 11.34.11.456789 AM | direct | 145 | 7 | 14087.5 | 160 | – |
| 2379 | 16-MAY-07 02.22.24.234567 AM | direct | 146 | 8 | 17848.2 | 161 | – |
| 2396 | 02-FEB-06 01.34.56.345678 AM | direct | 147 | 8 | 34930 | 161 | – |
| 2406 | 29-JUN-07 04.41.20.098765 AM | direct | 148 | 8 | 2854.2 | 161 | – |
| 2434 | 13-SEP-07 05.49.30.647893 AM | direct | 149 | 8 | 268651.8 | 161 | – |
| 2436 | 02-SEP-07 06.18.04.378034 AM | direct | 116 | 8 | 6394.8 | 161 | – |
| 2446 | 27-JUL-07 07.03.08.302945 AM | direct | 117 | 8 | 103679.3 | 161 | – |
| 2447 | 27-JUL-08 08.59.10.223344 AM | direct | 101 | 8 | 33893.6 | 161 | – |
| 2432 | 14-SEP-07 09.53.40.223345 AM | direct | 102 | 10 | 10523 | 163 | – |
| 2433 | 13-SEP-07 10.19.00.654279 AM | direct | 103 | 10 | 78 | 163 | – |
| 2355 | 26-JAN-06 09.22.51.962632 AM | online | 104 | 8 | 94513.5 | – | – |
| 2356 | 26-JAN-08 09.22.41.934562 AM | online | 105 | 5 | 29473.8 | – | – |
| 2359 | 08-JAN-06 09.34.13.112233 PM | online | 106 | 9 | 5543.1 | – | – |
| 2360 | 14-NOV-07 12.22.31.223344 PM | online | 107 | 4 | 990.4 | – | – |
| 2361 | 13-NOV-07 01.34.21.986210 PM | online | 108 | 8 | 120131.3 | – | – |
| 2362 | 13-NOV-07 02.41.10.619477 PM | online | 109 | 4 | 92829.4 | – | – |
| 2363 | 23-OCT-07 05.49.56.346122 PM | online | 144 | 0 | 10082.3 | – | – |
| 2364 | 28-AUG-07 06.18.45.942399 PM | online | 145 | 4 | 9500 | – | – |
| 2365 | 28-AUG-07 07.03.34.003399 PM | online | 146 | 9 | 27455.3 | – | – |
| 2366 | 28-AUG-07 08.59.23.144778 PM | online | 147 | 5 | 37319.4 | – | – |
| 2367 | 27-JUN-08 09.53.32.335522 PM | online | 148 | 10 | 144054.8 | – | – |
| 2368 | 26-JUN-08 10.19.43.190089 PM | online | 149 | 10 | 60065 | – | – |
| 2369 | 26-JUN-07 11.22.54.009932 PM | online | 116 | 0 | 11097.4 | – | – |
| 2370 | 27-JUN-08 12.22.11.647398 AM | online | 117 | 4 | 126 | – | – |
| 2371 | 16-MAY-07 01.34.56.113356 AM | online | 118 | 6 | 79405.6 | – | – |
| 2372 | 27-FEB-07 12.22.33.356789 AM | online | 119 | 9 | 16447.2 | – | – |
| 2373 | 27-FEB-08 01.34.51.220065 AM | online | 120 | 4 | 416 | – | – |
| 2374 | 27-FEB-08 02.41.45.109654 AM | online | 121 | 0 | 4797 | – | – |
| 2375 | 26-FEB-07 03.49.50.459233 AM | online | 122 | 2 | 103834.4 | – | – |
| 2376 | 07-JUN-07 06.18.08.883310 AM | online | 123 | 6 | 11006.2 | – | – |
| 2377 | 07-JUN-07 07.03.01.001100 AM | online | 141 | 5 | 38017.8 | – | – |
| 2378 | 24-MAY-07 08.59.10.010101 AM | online | 142 | 5 | 25691.3 | – | – |
| 2380 | 16-MAY-07 09.53.02.909090 AM | online | 143 | 3 | 27132.6 | – | – |
| 2382 | 14-MAY-08 10.19.03.828321 AM | online | 144 | 8 | 71173 | – | – |
| 2383 | 12-MAY-08 11.22.30.545103 AM | online | 145 | 8 | 36374.7 | – | – |
| 2384 | 12-MAY-08 12.22.34.525972 PM | online | 146 | 3 | 29249.1 | – | – |
| 2385 | 08-DEC-07 11.34.11.331392 AM | online | 147 | 4 | 295892 | – | – |
| 2386 | 06-DEC-07 12.22.34.225609 PM | online | 148 | 10 | 21116.9 | – | – |
| 2387 | 11-MAR-07 03.34.56.536966 PM | online | 149 | 5 | 52758.9 | – | – |
| 2388 | 04-JUN-07 04.41.12.554435 PM | online | 150 | 4 | 282694.3 | – | – |
| 2389 | 04-JUN-08 05.49.43.546954 PM | online | 151 | 4 | 17620 | – | – |
| 2390 | 18-NOV-07 04.18.50.546851 PM | online | 152 | 9 | 7616.8 | – | – |
| 2391 | 27-FEB-06 05.03.03.828330 PM | direct | 153 | 2 | 48070.6 | 156 | – |
| 2392 | 21-JUL-07 08.59.57.571057 PM | direct | 154 | 9 | 26632 | 161 | – |
| 2393 | 10-FEB-08 07.53.19.528202 PM | direct | 155 | 4 | 23431.9 | 161 | – |
| 2395 | 02-FEB-06 08.19.11.227550 PM | direct | 156 | 3 | 68501 | 163 | – |
| 2398 | 19-NOV-07 09.22.53.224175 PM | direct | 157 | 9 | 7110.3 | 163 | – |
| 2399 | 19-NOV-07 10.22.38.340990 PM | direct | 158 | 0 | 25270.3 | 161 | – |
| 2400 | 10-JUL-07 01.34.29.559387 AM | direct | 159 | 2 | 69286.4 | 161 | – |
| 2401 | 10-JUL-07 02.22.53.554822 AM | direct | 160 | 3 | 969.2 | 163 | – |
| 2402 | 02-JUL-07 03.34.44.665170 AM | direct | 161 | 8 | 600 | 154 | – |
| 2403 | 01-JUL-07 04.49.13.615512 PM | direct | 162 | 0 | 220 | 154 | – |
| 2404 | 01-JUL-07 04.49.13.664085 PM | direct | 163 | 6 | 510 | 158 | – |
| 2405 | 01-JUL-07 04.49.13.678123 PM | direct | 164 | 5 | 1233 | 159 | – |
| 2407 | 29-JUN-07 07.03.21.526005 AM | direct | 165 | 9 | 2519 | 155 | – |
| 2408 | 29-JUN-07 08.59.31.333617 AM | direct | 166 | 1 | 309 | 158 | – |
| 2409 | 29-JUN-07 09.53.41.984501 AM | direct | 167 | 2 | 48 | 154 | – |
| 2410 | 24-MAY-08 10.19.51.985501 AM | direct | 168 | 6 | 45175 | 156 | – |
| 2411 | 24-MAY-07 11.22.10.548639 AM | direct | 169 | 8 | 15760.5 | 156 | – |
| 2412 | 29-MAR-06 10.22.09.509801 AM | direct | 170 | 9 | 66816 | 158 | – |
| 2413 | 29-MAR-08 01.34.04.525934 PM | direct | 101 | 5 | 48552 | 161 | – |
| 2414 | 29-MAR-07 02.22.40.536996 PM | direct | 102 | 8 | 10794.6 | 153 | – |
| 2415 | 29-MAR-06 01.34.50.545196 PM | direct | 103 | 6 | 310 | 161 | – |
| 2416 | 29-MAR-07 04.41.20.945676 PM | direct | 104 | 6 | 384 | 160 | – |
| 2417 | 20-MAR-07 05.49.10.974352 PM | direct | 105 | 5 | 1926.6 | 163 | – |
| 2418 | 20-MAR-04 04.18.21.862632 PM | direct | 106 | 4 | 5546.6 | 163 | – |
| 2419 | 20-MAR-07 07.03.32.764632 PM | direct | 107 | 3 | 31574 | 160 | – |
| 2420 | 13-MAR-07 08.59.43.666320 PM | direct | 108 | 2 | 29750 | 160 | – |
| 2421 | 12-MAR-07 09.53.54.562432 PM | direct | 109 | 1 | 72836 | – | – |
| 2422 | 16-DEC-07 08.19.55.462332 PM | direct | 144 | 2 | 11188.5 | 153 | – |
| 2423 | 21-NOV-07 10.22.33.362632 AM | direct | 145 | 3 | 10367.7 | 160 | – |
| 2424 | 21-NOV-07 10.22.33.263332 AM | direct | 146 | 4 | 13824 | 153 | – |
| 2425 | 16-NOV-06 11.34.22.162552 PM | direct | 147 | 5 | 1500.8 | 163 | – |
| 2426 | 17-NOV-06 12.22.11.262552 AM | direct | 148 | 6 | 7200 | – | – |
| 2427 | 10-NOV-07 01.34.22.362124 AM | direct | 149 | 7 | 9055 | 163 | – |
| 2428 | 10-NOV-07 02.41.34.463567 AM | direct | 116 | 8 | 14685.8 | – | – |
| 2429 | 10-NOV-07 03.49.25.526321 AM | direct | 117 | 9 | 50125 | 154 | – |
| 2430 | 02-OCT-07 06.18.36.663332 AM | direct | 101 | 8 | 29669.9 | 159 | – |
| 2431 | 14-SEP-06 07.03.04.763452 AM | direct | 102 | 1 | 5610.6 | 163 | – |
| 2437 | 01-SEP-06 08.59.15.826132 AM | direct | 103 | 4 | 13550 | 163 | – |
| 2438 | 01-SEP-07 09.53.26.934626 AM | direct | 104 | 0 | 5451 | 154 | – |
| 2439 | 31-AUG-07 10.19.37.811132 AM | direct | 105 | 1 | 22150.1 | 159 | – |
| 2441 | 01-AUG-08 11.22.48.734526 AM | direct | 106 | 5 | 2075.2 | 160 | – |
| 2442 | 27-JUL-06 12.22.59.662632 PM | direct | 107 | 9 | 52471.9 | 154 | – |
| 2443 | 27-JUL-06 01.34.16.562632 PM | direct | 108 | 0 | 3646 | 154 | – |
| 2444 | 27-JUL-07 02.22.27.462632 PM | direct | 109 | 1 | 77727.2 | 155 | – |
| 2445 | 27-JUL-06 03.34.38.362632 PM | direct | 144 | 8 | 5537.8 | 158 | – |
| 2448 | 18-JUN-07 04.41.49.262632 PM | direct | 145 | 5 | 1388 | 158 | – |
| 2449 | 13-JUN-07 05.49.07.162632 PM | direct | 146 | 6 | 86 | 155 | – |
| 2450 | 11-APR-07 06.18.10.362632 PM | direct | 147 | 3 | 1636 | 159 | – |
| 2451 | 17-DEC-07 05.03.52.562632 PM | direct | 148 | 7 | 10474.6 | 154 | – |
| 2452 | 06-OCT-07 08.59.43.462632 PM | direct | 149 | 5 | 12589 | 159 | – |
| 2453 | 04-OCT-07 09.53.34.362632 PM | direct | 116 | 0 | 129 | 153 | – |
| 2456 | 07-NOV-06 07.53.25.989889 PM | direct | 117 | 0 | 3878.4 | 163 | – |
| 2457 | 31-OCT-07 11.22.16.162632 PM | direct | 118 | 5 | 21586.2 | 159 | – |
Solving Real-world Problems
Using the employees and orders tables above, the GROUP BY clause can be used to answer important questions, such as the following: For each job, what do employee salaries look like? Do employee salaries vary within jobs but across managers, and if so, then how much? What do our customers look like when their order histories are summarized? What do the performances of each sales representative look like? Let us explore each of these questions in turn.
For each job, what do employee salaries look like?
To answer this question, we need to be able to compute various measures for the groups of employees holding each job. The total, average, minimum, and maximum salary paid out for each job is likely a good start. Additionally, the number of employees holding a specific job and the job’s salary range is useful in instances where the average salary is skewed or may be misleading without the associated number of employee salaries it was computed across. The code below can be used to generate an output containing these measures for each job. Take a moment to review the code and corresponding output. An explanation on how the code works follows the output.
SELECT
hr.employees.job_id,
SUM(hr.employees.salary) AS total_salary,
COUNT(hr.employees.employee_id) AS employee_count,
AVG(hr.employees.salary) AS average_salary,
MAX(hr.employees.salary) AS maximum_salary,
MIN(hr.employees.salary) AS minimum_salary,
MAX(hr.employees.salary) - MIN(hr.employees.salary) AS salary_range
FROM
hr.employees
GROUP BY
hr.employees.job_id;
| job_id | total_salary | employee_count | average_salary | maximum_salary | minimum_salary | salary_range |
|---|---|---|---|---|---|---|
| AD_VP | 34000 | 2 | 17000 | 17000 | 17000 | 0 |
| FI_ACCOUNT | 39600 | 5 | 7920 | 9000 | 6900 | 2100 |
| PU_CLERK | 13900 | 5 | 2780 | 3100 | 2500 | 600 |
| SH_CLERK | 64300 | 20 | 3215 | 4200 | 2500 | 1700 |
| HR_REP | 6500 | 1 | 6500 | 6500 | 6500 | 0 |
| PU_MAN | 11000 | 1 | 11000 | 11000 | 11000 | 0 |
| AC_MGR | 12008 | 1 | 12008 | 12008 | 12008 | 0 |
| ST_CLERK | 55700 | 20 | 2785 | 3600 | 2100 | 1500 |
| AD_ASST | 4400 | 1 | 4400 | 4400 | 4400 | 0 |
| IT_PROG | 28800 | 5 | 5760 | 9000 | 4200 | 4800 |
| SA_MAN | 61000 | 5 | 12200 | 14000 | 10500 | 3500 |
| AC_ACCOUNT | 8300 | 1 | 8300 | 8300 | 8300 | 0 |
| FI_MGR | 12008 | 1 | 12008 | 12008 | 12008 | 0 |
| ST_MAN | 36400 | 5 | 7280 | 8200 | 5800 | 2400 |
| AD_PRES | 24000 | 1 | 24000 | 24000 | 24000 | 0 |
| MK_MAN | 13000 | 1 | 13000 | 13000 | 13000 | 0 |
| SA_REP | 250500 | 30 | 8350 | 11500 | 6100 | 5400 |
| MK_REP | 6000 | 1 | 6000 | 6000 | 6000 | 0 |
| PR_REP | 10000 | 1 | 10000 | 10000 | 10000 | 0 |
Let us break down the steps involved in generating the output above by starting with a few facts regarding the employees table. The employees table contains 107 rows with each row representing a single employee. Across the 107 employees, nineteen unique jobs are held. The query above groups the employees based on their respective job ID values. The measures are subsequently computed within each group. The values returned for each group are represented within a single row along with the job ID value defining the group. To reinforce this idea, let us walk through the process manually for the group of IT programmers.
Based on the employees table shown at the beginning of this page, there are five employees with a job ID value of “IT PROG.” The ID values of those employees are 103, 104, 105, 106, and 107 and the salary values are $9,000, $6,000, $4,800, $4,800, and $4,200, respectively. Computing the total salary amount paid out within this group of employees results in a total salary value of $28,880. As expected, this value matches the value observed in the output for the total salary measure. If we computed each measure for each group of employees holding the same job, then we would arrive at the same results shown in the output above.
Before moving onto the next example, consider jotting down a few comments addressing the question, “What do employee salaries look like?” A few of my own can be found in the comments section below and will be included in each of the example problems that follow. Click the icon to see my comments.
Comments
– Ten of the nineteen jobs are occupied by only one employee, resulting in less than enlightening measure values.
– The largest salary range exists among sales representatives while the smallest exists among purchasing clerks.
– The highest paid employee, the president, earns greater than five times the salary of the lowest paid employee working as an administrative assistant.
Do employee salaries vary within jobs but across managers, and if so, then how much?
To tackle this question, a similar approach to that seen above can be taken. The same measures require computing; however, a different level of detail is required. We need to compute the values for each job ID and manager ID combination. Each of those combinations are the groups and those groups are made of the employees holding a specific job and reporting to a specific manager. For example, the financial accountants with employee ID values of 109, 110, 111, 112, and 113 reporting to the manager with an ID value of 108 are a single group. Each of the measures are computed using these five employees’ salaries. Their values are subsequently represented within a single row along with the job ID value and manager ID value defining the group (FI_ACCOUNT and 108, respectively).
SELECT
hr.employees.job_id,
hr.employees.manager_id,
SUM(hr.employees.salary) AS total_salary,
COUNT(hr.employees.employee_id) AS employee_count,
AVG(hr.employees.salary) AS average_salary,
MAX(hr.employees.salary) AS maximum_salary,
MIN(hr.employees.salary) AS minimum_salary,
MAX(hr.employees.salary) - MIN(hr.employees.salary) AS salary_range
FROM
hr.employees
GROUP BY
hr.employees.job_id,
hr.employees.manager_id;
| job_id | manager_id | total_salary | employee_count | average_salary | maximum_salary | minimum_salary | salary_range |
|---|---|---|---|---|---|---|---|
| IT_PROG | 103 | 19800 | 4 | 4950 | 6000 | 4200 | 1800 |
| IT_PROG | 102 | 9000 | 1 | 9000 | 9000 | 9000 | 0 |
| SA_REP | 149 | 50000 | 6 | 8333.333333333333333333333333333333333333 | 11000 | 6200 | 4800 |
| SA_REP | 147 | 46600 | 6 | 7766.666666666666666666666666666666666667 | 10500 | 6200 | 4300 |
| SA_REP | 148 | 51900 | 6 | 8650 | 11500 | 6100 | 5400 |
| SA_REP | 145 | 51000 | 6 | 8500 | 10000 | 7000 | 3000 |
| SA_REP | 146 | 51000 | 6 | 8500 | 10000 | 7000 | 3000 |
| SH_CLERK | 121 | 14700 | 4 | 3675 | 4200 | 3000 | 1200 |
| SH_CLERK | 122 | 12800 | 4 | 3200 | 3800 | 2500 | 1300 |
| SH_CLERK | 120 | 11600 | 4 | 2900 | 3200 | 2500 | 700 |
| SH_CLERK | 123 | 13900 | 4 | 3475 | 4000 | 2800 | 1200 |
| SH_CLERK | 124 | 11300 | 4 | 2825 | 3100 | 2600 | 500 |
| ST_CLERK | 123 | 12000 | 4 | 3000 | 3600 | 2500 | 1100 |
| ST_CLERK | 124 | 11700 | 4 | 2925 | 3500 | 2500 | 1000 |
| ST_CLERK | 120 | 10500 | 4 | 2625 | 3200 | 2200 | 1000 |
| ST_CLERK | 122 | 10800 | 4 | 2700 | 3300 | 2200 | 1100 |
| ST_CLERK | 121 | 10700 | 4 | 2675 | 3300 | 2100 | 1200 |
| AC_ACCOUNT | 205 | 8300 | 1 | 8300 | 8300 | 8300 | 0 |
| AC_MGR | 101 | 12008 | 1 | 12008 | 12008 | 12008 | 0 |
| AD_ASST | 101 | 4400 | 1 | 4400 | 4400 | 4400 | 0 |
| AD_PRES | – | 24000 | 1 | 24000 | 24000 | 24000 | 0 |
| AD_VP | 100 | 34000 | 2 | 17000 | 17000 | 17000 | 0 |
| FI_ACCOUNT | 108 | 39600 | 5 | 7920 | 9000 | 6900 | 2100 |
| FI_MGR | 101 | 12008 | 1 | 12008 | 12008 | 12008 | 0 |
| HR_REP | 101 | 6500 | 1 | 6500 | 6500 | 6500 | 0 |
| MK_MAN | 100 | 13000 | 1 | 13000 | 13000 | 13000 | 0 |
| MK_REP | 201 | 6000 | 1 | 6000 | 6000 | 6000 | 0 |
| PR_REP | 101 | 10000 | 1 | 10000 | 10000 | 10000 | 0 |
| PU_CLERK | 114 | 13900 | 5 | 2780 | 3100 | 2500 | 600 |
| PU_MAN | 100 | 11000 | 1 | 11000 | 11000 | 11000 | 0 |
| SA_MAN | 100 | 61000 | 5 | 12200 | 14000 | 10500 | 3500 |
| ST_MAN | 100 | 36400 | 5 | 7280 | 8200 | 5800 | 2400 |
Comments
– For most jobs, all employees holding that job report to the same manager.
– Of the four jobs where all employees holding the specific job do not report to the same manager, perhaps the most notable variation exists for programmers. The single programmer working for the manager with an ID value of 103 earns a salary close to two times the average programmer working for the manager with an ID value of 102 and 50 percent ($3000) more than the next highest paid programmer.
What do our customers look like when their order histories are summarized?
Understanding past customer purchasing behavior is beneficial to being able to potentially predict the purchasing behavior of customers in the future. Describing each customer’s behavior, at a high-level, can be a good start. The code below allows us to do that by answering the following key questions:
– How much has a customer spent in total across all their orders?
– How many orders has the customer placed?
– On average, how much did a customer spend per order?
– What was the customer’s largest and smallest amount spent in a single order?
– What is the difference, in terms of amount spent, between the customer’s largest and smallest order?
– When did the customer place their first order?
– When did the customer place their most recent order?
SELECT
oe.orders.customer_id,
SUM(oe.orders.order_total) AS total_order_total,
COUNT(oe.orders.order_id) AS order_count,
AVG(oe.orders.order_total) AS average_order_total,
MAX(oe.orders.order_total) AS maximum_order_total,
MIN(oe.orders.order_total) AS minimum_order_total,
MAX(oe.orders.order_total) - MIN(oe.orders.order_total) AS order_total_range,
MAX(oe.orders.order_date) AS latest_order_date,
MIN(oe.orders.order_date) AS first_order_date
FROM
oe.orders
GROUP BY
oe.orders.customer_id;
| customer_id | total_order_total | order_count | average_order_total | maximum_order_total | minimum_order_total | order_total_range | first_order_date | latest_order_date |
|---|---|---|---|---|---|---|---|---|
| 107 | 155613.2 | 4 | 38903.3 | 70576.9 | 990.4 | 69586.5 | 27-JUL-06 12.22.59.662632 PM | 14-NOV-07 12.22.31.223344 PM |
| 108 | 213399.7 | 4 | 53349.925 | 120131.3 | 3646 | 116485.3 | 08-JAN-06 08.19.44.123456 PM | 13-NOV-07 01.34.21.986210 PM |
| 158 | 25270.3 | 1 | 25270.3 | 25270.3 | 25270.3 | 0 | 19-NOV-07 10.22.38.340990 PM | 19-NOV-07 10.22.38.340990 PM |
| 161 | 600 | 1 | 600 | 600 | 600 | 0 | 02-JUL-07 03.34.44.665170 AM | 02-JUL-07 03.34.44.665170 AM |
| 166 | 309 | 1 | 309 | 309 | 309 | 0 | 29-JUN-07 08.59.31.333617 AM | 29-JUN-07 08.59.31.333617 AM |
| 105 | 61376.5 | 4 | 15344.125 | 29473.8 | 1926.6 | 27547.2 | 20-MAR-07 05.49.10.974352 PM | 26-JAN-08 09.22.41.934562 AM |
| 109 | 265255.6 | 4 | 66313.9 | 92829.4 | 21863 | 70966.4 | 12-MAR-07 09.53.54.562432 PM | 10-FEB-08 09.22.35.564789 PM |
| 143 | 27132.6 | 1 | 27132.6 | 27132.6 | 27132.6 | 0 | 16-MAY-07 09.53.02.909090 AM | 16-MAY-07 09.53.02.909090 AM |
| 159 | 69286.4 | 1 | 69286.4 | 69286.4 | 69286.4 | 0 | 10-JUL-07 01.34.29.559387 AM | 10-JUL-07 01.34.29.559387 AM |
| 162 | 220 | 1 | 220 | 220 | 220 | 0 | 01-JUL-07 04.49.13.615512 PM | 01-JUL-07 04.49.13.615512 PM |
| 163 | 510 | 1 | 510 | 510 | 510 | 0 | 01-JUL-07 04.49.13.664085 PM | 01-JUL-07 04.49.13.664085 PM |
| 168 | 45175 | 1 | 45175 | 45175 | 45175 | 0 | 24-MAY-08 10.19.51.985501 AM | 24-MAY-08 10.19.51.985501 AM |
| 104 | 146605.5 | 4 | 36651.375 | 94513.5 | 384 | 94129.5 | 26-JAN-06 09.22.51.962632 AM | 14-JUL-08 06.18.23.234567 PM |
| 118 | 100991.8 | 2 | 50495.9 | 79405.6 | 21586.2 | 57819.4 | 16-MAY-07 01.34.56.113356 AM | 31-OCT-07 11.22.16.162632 PM |
| 145 | 71717.9 | 5 | 14343.58 | 36374.7 | 1388 | 34986.7 | 18-JUN-07 04.41.49.262632 PM | 12-MAY-08 11.22.30.545103 AM |
| 121 | 4797 | 1 | 4797 | 4797 | 4797 | 0 | 27-FEB-08 02.41.45.109654 AM | 27-FEB-08 02.41.45.109654 AM |
| 141 | 38017.8 | 1 | 38017.8 | 38017.8 | 38017.8 | 0 | 07-JUN-07 07.03.01.001100 AM | 07-JUN-07 07.03.01.001100 AM |
| 155 | 23431.9 | 1 | 23431.9 | 23431.9 | 23431.9 | 0 | 10-FEB-08 07.53.19.528202 PM | 10-FEB-08 07.53.19.528202 PM |
| 101 | 190395.1 | 4 | 47598.775 | 78279.6 | 29669.9 | 48609.7 | 16-AUG-07 03.34.12.234359 PM | 27-JUL-08 08.59.10.223344 AM |
| 103 | 20591.4 | 4 | 5147.85 | 13550 | 78 | 13472 | 29-MAR-06 01.34.50.545196 PM | 02-OCT-07 05.49.34.678340 PM |
| 116 | 32307 | 4 | 8076.75 | 14685.8 | 129 | 14556.8 | 26-JUN-07 11.22.54.009932 PM | 10-NOV-07 02.41.34.463567 AM |
| 120 | 416 | 1 | 416 | 416 | 416 | 0 | 27-FEB-08 01.34.51.220065 AM | 27-FEB-08 01.34.51.220065 AM |
| 142 | 25691.3 | 1 | 25691.3 | 25691.3 | 25691.3 | 0 | 24-MAY-07 08.59.10.010101 AM | 24-MAY-07 08.59.10.010101 AM |
| 151 | 17620 | 1 | 17620 | 17620 | 17620 | 0 | 04-JUN-08 05.49.43.546954 PM | 04-JUN-08 05.49.43.546954 PM |
| 156 | 68501 | 1 | 68501 | 68501 | 68501 | 0 | 02-FEB-06 08.19.11.227550 PM | 02-FEB-06 08.19.11.227550 PM |
| 157 | 7110.3 | 1 | 7110.3 | 7110.3 | 7110.3 | 0 | 19-NOV-07 09.22.53.224175 PM | 19-NOV-07 09.22.53.224175 PM |
| 169 | 15760.5 | 1 | 15760.5 | 15760.5 | 15760.5 | 0 | 24-MAY-07 11.22.10.548639 AM | 24-MAY-07 11.22.10.548639 AM |
| 146 | 88462.6 | 5 | 17692.52 | 29249.1 | 86 | 29163.1 | 16-MAY-07 02.22.24.234567 AM | 12-MAY-08 12.22.34.525972 PM |
| 148 | 185700.5 | 5 | 37140.1 | 144054.8 | 2854.2 | 141200.6 | 17-NOV-06 12.22.11.262552 AM | 27-JUN-08 09.53.32.335522 PM |
| 149 | 403119.7 | 5 | 80623.94 | 268651.8 | 9055 | 259596.8 | 11-MAR-07 03.34.56.536966 PM | 26-JUN-08 10.19.43.190089 PM |
| 122 | 103834.4 | 1 | 103834.4 | 103834.4 | 103834.4 | 0 | 26-FEB-07 03.49.50.459233 AM | 26-FEB-07 03.49.50.459233 AM |
| 152 | 7616.8 | 1 | 7616.8 | 7616.8 | 7616.8 | 0 | 18-NOV-07 04.18.50.546851 PM | 18-NOV-07 04.18.50.546851 PM |
| 170 | 66816 | 1 | 66816 | 66816 | 66816 | 0 | 29-MAR-06 10.22.09.509801 AM | 29-MAR-06 10.22.09.509801 AM |
| 144 | 160284.6 | 5 | 32056.92 | 71173 | 5537.8 | 65635.2 | 27-JUL-06 03.34.38.362632 PM | 14-MAY-08 10.19.03.828321 AM |
| 119 | 16447.2 | 1 | 16447.2 | 16447.2 | 16447.2 | 0 | 27-FEB-07 12.22.33.356789 AM | 27-FEB-07 12.22.33.356789 AM |
| 153 | 48070.6 | 1 | 48070.6 | 48070.6 | 48070.6 | 0 | 27-FEB-06 05.03.03.828330 PM | 27-FEB-06 05.03.03.828330 PM |
| 164 | 1233 | 1 | 1233 | 1233 | 1233 | 0 | 01-JUL-07 04.49.13.678123 PM | 01-JUL-07 04.49.13.678123 PM |
| 167 | 48 | 1 | 48 | 48 | 48 | 0 | 29-JUN-07 09.53.41.984501 AM | 29-JUN-07 09.53.41.984501 AM |
| 102 | 69211.4 | 4 | 17302.85 | 42283.2 | 5610.6 | 36672.6 | 14-SEP-06 07.03.04.763452 AM | 19-NOV-07 02.41.54.696211 PM |
| 106 | 36199.5 | 4 | 9049.875 | 23034.6 | 2075.2 | 20959.4 | 20-MAR-04 04.18.21.862632 PM | 01-AUG-08 11.22.48.734526 AM |
| 147 | 371278.2 | 5 | 74255.64 | 295892 | 1500.8 | 294391.2 | 02-FEB-06 01.34.56.345678 AM | 08-DEC-07 11.34.11.331392 AM |
| 117 | 157808.7 | 4 | 39452.175 | 103679.3 | 126 | 103553.3 | 07-NOV-06 07.53.25.989889 PM | 27-JUN-08 12.22.11.647398 AM |
| 123 | 11006.2 | 1 | 11006.2 | 11006.2 | 11006.2 | 0 | 07-JUN-07 06.18.08.883310 AM | 07-JUN-07 06.18.08.883310 AM |
| 150 | 282694.3 | 1 | 282694.3 | 282694.3 | 282694.3 | 0 | 04-JUN-07 04.41.12.554435 PM | 04-JUN-07 04.41.12.554435 PM |
| 154 | 26632 | 1 | 26632 | 26632 | 26632 | 0 | 21-JUL-07 08.59.57.571057 PM | 21-JUL-07 08.59.57.571057 PM |
| 160 | 969.2 | 1 | 969.2 | 969.2 | 969.2 | 0 | 10-JUL-07 02.22.53.554822 AM | 10-JUL-07 02.22.53.554822 AM |
| 165 | 2519 | 1 | 2519 | 2519 | 2519 | 0 | 29-JUN-07 07.03.21.526005 AM | 29-JUN-07 07.03.21.526005 AM |
In the output above, there is one row per customer. In each of those rows, the results of aggregating the customer’s orders is populated. How did this come to be? In the same manner as previous examples, the GROUP BY clause grouped the rows based on the specified attribute(s). In this case, the clause grouped rows based on the customer ID value. Subsequently, several aggregate functions were computed within each group (i.e., customer) and across the respective group’s orders.
Comments
– About sixty-two percent (29/47) of customers have only placed a single order. The other eighteen customers’ order counts range from two to five, with the majority having placed at least 4 orders. If this data was not fictional, it may be beneficial to survey those customers having placed only one order to better understand their experiences and reasons behind not placing a subsequent order.
– The range of total spend among customers is quite large ($403,071.70). The least amount spent by a customer having placed at least one order is $48. The largest amount spent by a single customer is $403,119.70.
– For those customers having placed more than one order, the range of order total values for those customers’ orders exceeds $10,000. For example, the smallest and largest dollar amounts associated with the customer with an ID value of 103 are $13,550 and $78, respectively.
What do the performances of each sales representative look like?
For sales representatives and those managing sales representatives, job performance is often a key factor in determining total compensation. Typically, a sales commission is awarded based on the number of sales a representative has made. Other factors may also contribute to the amount awarded. Using the code below, we can quickly summarize the performances of each sales representative. Notice, the code is largely the same as that used in the previous example. There are two differences. The first difference is the sales representative ID attribute has replaced the customer ID attribute in the SELECT and GROUP BY clauses. This is required to summarize each sales representative’s data. The second difference is the use of a WHERE clause. The WHERE clause ensures only those orders associated with a sales representative are included. This filtering operation takes place prior to the grouping operation. For clarity, see the code’s corresponding output.
SELECT
oe.orders.sales_rep_id,
SUM(oe.orders.order_total) AS total_order_total,
COUNT(oe.orders.order_id) AS order_count,
AVG(oe.orders.order_total) AS average_order_total,
MAX(oe.orders.order_total) AS maximum_order_total,
MIN(oe.orders.order_total) AS minimum_order_total,
MAX(oe.orders.order_total) - MIN(oe.orders.order_total) AS order_total_range,
MAX(oe.orders.order_date) AS latest_order_date,
MIN(oe.orders.order_date) AS first_order_date
FROM
oe.orders
WHERE
oe.orders.sales_rep_id IS NOT NULL
GROUP BY
oe.orders.sales_rep_id;
| sales_rep_id | total_order_total | order_count | average_order_total | maximum_order_total | minimum_order_total | order_total_range | first_order_date | latest_order_date |
|---|---|---|---|---|---|---|---|---|
| 158 | 156296.2 | 7 | 22328.0285714285714285714285714285714286 | 66816 | 309 | 66507 | 10-FEB-08 09.22.35.564789 PM | 08-JAN-06 08.19.44.123456 PM |
| 161 | 661734.5 | 13 | 50902.6538461538461538461538461538461538 | 268651.8 | 310 | 268341.8 | 27-JUL-08 08.59.10.223344 AM | 02-FEB-06 01.34.56.345678 AM |
| 159 | 151167.2 | 7 | 21595.3142857142857142857142857142857143 | 62303 | 1233 | 61070 | 31-OCT-07 11.22.16.162632 PM | 11-APR-07 06.18.10.362632 PM |
| 163 | 128249.5 | 12 | 10687.4583333333333333333333333333333333 | 68501 | 78 | 68423 | 19-NOV-07 09.22.53.224175 PM | 20-MAR-04 04.18.21.862632 PM |
| 155 | 134415.2 | 5 | 26883.04 | 77727.2 | 86 | 77641.2 | 14-JUL-08 06.18.23.234567 PM | 13-JUN-07 05.49.07.162632 PM |
| 156 | 202617.6 | 5 | 40523.52 | 70576.9 | 15760.5 | 54816.4 | 24-MAY-08 10.19.51.985501 AM | 27-FEB-06 05.03.03.828330 PM |
| 153 | 114215.7 | 5 | 22843.14 | 78279.6 | 129 | 78150.6 | 16-DEC-07 08.19.55.462332 PM | 29-MAR-07 02.22.40.536996 PM |
| 154 | 171973.1 | 10 | 17197.31 | 52471.9 | 48 | 52423.9 | 17-DEC-07 05.03.52.562632 PM | 27-JUL-06 12.22.59.662632 PM |
| 160 | 88238.4 | 6 | 14706.4 | 31574 | 384 | 31190 | 01-AUG-08 11.22.48.734526 AM | 13-MAR-07 08.59.43.666320 PM |
I won’t reiterate how the GROUP BY clause works again; however, I do want to touch on how a few of the measures computed and displayed in the output above can be useful. As previously mentioned, the sales commission awarded to each sales representative will likely be based on the number of sales made and perhaps even the total dollar amount associated with the orders. What may not be immediately apparent is the usefulness of the latest order date and first order date values. With a little extra work, the values derived can be used to compute the number of days elapsed between the beginning of a period (e.g., sales year, sales month) and the first sale made by each sales representative. Short durations could indicate a strong start while long durations could indicate a slow start. Likewise, using the latest order date values can tell us whether or not the sales representative has recently made a sale or is struggling to do so. Keep in mind, when using a non-fictional data set a WHERE clause should be used here to filter to the date period of interest unless it is desirable to look at all-time sales performance.
Missing Values
Aggregate functions, with the exception of COUNT(*), ignore NULL values. For example, a SUM computed across the values, 100, 200, 200, NULL, and 300 returns 800. Likewise, an average computed across the same same values will return 200. If rows containing NULLs for the attribute being aggregated are required to be accounted for when computing an average, then consider using the COALESCE function within the average function to replace the missing values with zero prior to the average being computed. For more information on handling missing values, see my page dedicated to this topic, here.
COUNT(*) vs. COUNT(attribute)
When using the COUNT function, it is important to understand the difference in the function’s behavior when the argument passed to the function is an attribute instead of the “*” character. When an attribute is passed to the function, the count returned represents the number of rows with non-null values for the specified attribute while COUNT(*) simply represents the number of rows returned. For clarity, see the example code and corresponding output below.
SELECT
COUNT(*) AS order_count,
COUNT(oe.orders.sales_rep_id) AS assisted_order_count
FROM
oe.orders;
| order_count | assisted_order_count |
|---|---|
| 105 | 70 |
In the output above, COUNT(*) returns the number of rows (i.e., orders) within the orders table. When the sales representative ID attribute is passed into the COUNT function, the number of rows with a sales representative ID value populated is returned (70). This value tells us seventy orders were placed with the assistance of a sales representative.
Common Errors
For users new to SQL, the GROUP BY clause can cause a large amount of frustration. Most of this frustration is rooted in how the SELECT and GROUP BY clauses work together to produce a desired output. When the GROUP BY clause is used, any attributes not treated as constants or are non-aggregates in the SELECT clause must also be specified in the GROUP BY clause. When this is not the case, Oracle database will return the error, ORA-00979: not a GROUP BY expression. To help reinforce why Oracle and other relational database management system products return an error when the SELECT and GROUP BY clauses are not in agreement, let us discuss the code below.
-- Returns ORA-00979: not a GROUP BY expression.
SELECT
hr.employees.job_id,
hr.employees.manager_id,
COUNT(hr.employees.employee_id) AS employee_count
FROM
hr.employees
GROUP BY
hr.employees.job_id;
The GROUP BY clause specified states to group the rows based on the unique job ID values. To reiterate, this specification states to return one row per job ID value. This is problematic because two attributes, job ID and manager ID, are specified in the SELECT clause. The SELECT clause is processed after the GROUP BY clause, so by the time the SELECT clause is being processed for this query, the rows have already been grouped by job ID values. As a result, the level of detail in SELECT clause is not aligned with the GROUP BY clause. This is easy to see for any job ID value associated with more than one manager. With only one row returned for each job ID value, there are not enough rows to also account for each manager ID and job ID combination. The employee count computation also presents an issue in that it is computed at the job ID level. This does not make sense given the attributes in the SELECT clause.
If you are learning the clauses in the typical order and as displayed in the menus, then your next stop is the HAVING clause page. This clause allows users to perform filtering operations on grouped rows. For more information, see the HAVING clause page.
Knowledge Check
If you’d like to confirm your understanding on the GROUP BY clause and other fundamental concepts, consider solving a handful of my practice problems here.