now you understand how to fix the error – but do you understand why it is a problem in the first place? Well, you should – because that is the most important thing to understand! So, let’s explain some more about why SQL gives that error shown above .
First off, let’s talk a little bit more about aggregate functions. You probably know what aggregate functions in SQL are – we used one in the example above. In case you forgot, aggregate functions are used to perform a mathematical function on the values inside a given column, which is passed into the aggregate function. Here are some of the commonly used aggregate functions:
AVG() - Returns the average value COUNT() - Returns the number of rows FIRST() - Returns the first value LAST() - Returns the last value MAX() - Returns the largest value MIN() - Returns the smallest value SUM() - Returns the sum
To illustrate why the SQL standard says that a selected column has to be in the group by clause or part of an aggregate function, let’s use another example. Suppose we have some tables called Starbucks_Stores and Starbucks_Employees. In case you don’t already know, Starbucks is a popular coffee shop/cafe in the USA:
Starbucks_Employees | Starbucks_Stores | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
Now, given the tables above let’s say that we write some SQL like this:
SELECT count(*) as num_employees, HourlyRate FROM Starbucks_Employees JOIN Starbucks_Stores ON Starbucks_Employees.StoreID = Starbucks_Stores.store_id GROUP BY city
It looks like the SQL above would just return the number of Starbucks employees in each city, along with the HourlyRate – because it will group the employees based on whatever city they work in (thanks to the “group by city” statement).