Foooking can't grok GroupBy in Postgres

avatar
(Edited)

My kneejerk reaction is that GroupBy in Postgres is just broken. This is obviously not the case, instead I just don't understand the problem it is solving.

In the particular case I was trying to solve I had in mind that I want to fetch all rows and just group the content on a said key... Very object centric I guess.

Although I have not found a solution in SQL that particularly matches what I wanted to achieve , and I do think it will just be a subquery where I probably join the same table data after having done the grouping... Not sure really.

I do know exactly what I want it to look like when I get the results

result = {
     'key1': {
       'property1': 'value'
       'property2': 'value'
       'property3': 'value'
     }
     'key2': {
       'property1': 'value'
       'property2': 'value'
       'property3': 'value'
     }
     'key3': {
       'property1': 'value'
       'property2': 'value'
       'property3': 'value'
     }
}

So after looking at what I supposedly wanted to be returned I realise that it is much more something in the domain of an API or possibly NoSQL.

Which fair enough if that is the case.

Actually seeing the result in that form is what made me realise I just don't understand what groupBy is doing and I definitely did not do the appropriate context switch to know that given how Sql will return my data does not align with my case at all.

To try and explain that, this is what Postgres is going to want to return to me no matter what:

id:key1, property1:value, property2:value, property3:value;
id:key2, property1:value, property2:value, property3:value;
id:key3, property1:value, property2:value, property3:value;

So given no matter what fancy aggregates I try etc, it will always return them on a per row basis. So yes I literally asked a stupid question because I wanted it to do the "logic" for me.

For the most part I was hung-up on the fact that I am losing columns when using GroupBy. For example if I want to group on property1 then I can't group on it unless I include the other columns also since I am using a SELECT *

And obviously even selecting specific columns those still need to be in the GroupBy... So nothing gets grouped since they are all in the GroupBy and would just return the data as though I did not group anything.

Obviously y'all are smarter and must be crying in pain at this problem you clearly do not see as a problem, it is just me being stupid.

Which is what I said each time a tutorial added an aggregate to their GroupBy query and either used that as the GroupBy or a specific column but still leaving behind the rest of the info.

So I had to ask: "WTF am I trying to do and why am I obsessed with maintaining the extra data?"

Well, I wanted a tally of all the columns based on property1 , and did the rest matter?

NO

It did not matter, and so I went and I GroupedBy the property1 and added the total field for the count and boom I got what I wanted.

Now sure maybe the other data does matter but frankly I can just pull the keys I grouped on and rather do a second call which is simple and straight forward then merge my results accordingly in the code.

Or I guess some form of a join could be achieved, I am not to sure how heavy a query like that is.

With varying complexity of course depending on the size.

So all in all what I learned from that is that GroupBy still sucks but it mainly sucks because I was thinking of my result set in a different context.



0
0
0.000
5 comments
avatar

You don't use any ORM?

0
0
0.000
avatar

Yup, Eloquent from Laravel which I think is the part that kinda made me insist it can be done the way I want. Since there is the groupBy method for the ORM but also groupBy for result object. The groupBy on the result object does exactly what I showed and keys the results accordingly, where as the groupBy on the ORM or query builder is an interface for the SQL engines methods and in POSTGRES you have to have all columns from the select available in the groupBy + then any aggregates you use can be excluded so counting etc works but if I wanted all the data the best is to use the result and instead groupBy in my code to build the structure I want.

0
0
0.000
avatar

I think I understand, it just seems ugly to re-execute one query after another to recover the missing data

0
0
0.000
avatar

Yes which is why I was pretty determined to get it in one go but for this case specifically I did not really need the extra data. If I did I think I would query the rows as normal and instead of having sql do the computing just run a loop to structure it as I want it , which is actually what I usually do. Guess it depends what I am querying, I will learn better ways to do it as I go I am sure :) !PIZZA

0
0
0.000