There Has to Be a Better Way: Sorting Hive-Engine Balances

Hey everyone,

I've been working on improving my Market-Viewr tool, specifically trying to find a more natural way to get a token's richlist. Right now, the process is pretty inefficient.

The core issue is twofold. First, the token balances are returned sorted alphabetically by account name, not by the balance amount. Second, the API has a hard limit of 10,000 results per query. For tokens with a huge number of holders, like SWAP.HIVE or PIZZA, this means you can't simply paginate through the entire list to find the top holders; you're cut off after the first 10k accounts. This is why I had to create a "hacky" workaround that makes at least 26 separate API calls, at least one for each letter of the alphabet (sometimes more for letters like 'S'), just to try and piece the full picture together.


unnamed.png

"There Has to Be a Better Way"

I thought to myself, "There has to be a better way." So, I started digging into the Hive-Engine node codebase to see if I could just add a sort parameter to the API call. The idea was simple, but I quickly ran into the core problem: the token balances are stored in the database as text strings, not numbers. This is done to preserve the decimal precision so they can be used with big math libraries, but it also means a standard database sort won't work correctly (for example, "100" would come before "2").

After reading through the documentation for MongoDB, which is what Hive-Engine uses, I found the perfect solution: numeric collation.

Bingo!

This feature tells the database to treat a specific text field as a number for the purpose of a single query, without needing to change the data or add new indexes.


The Solution in Action

A quick patch later, and I can now make a single API call to get a perfectly sorted richlist. Even if it adds a tiny bit of overhead for Mongo to perform the query, it has to be better than making 26+ API calls.

Here’s an example of the new API call using the sort parameter:

screenshot-20250925-063127.png

And here is the result: The top 10 PIZZA holders, sorted correctly by their balance, all from a single query (edited for display, ... is where stake, pendingUnstake, delegationsIn, delegationsOut, pendingUndelegations would be).

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": [
    { "_id": 241391, "account": "null", "balance": "1232816.82", ... },
    { "_id": 1739899, "account": "moon.deposit", "balance": "116074.02", ... },
    { "_id": 525742, "account": "tipcc", "balance": "50073.47", ... },
    { "_id": 208477, "account": "ecoinstant", "balance": "18328.40", ... },
    { "_id": 186743, "account": "huzzah", "balance": "17589.11", ... },
    { "_id": 1775991, "account": "moon.deposit2", "balance": "16042.86", ... },
    { "_id": 186723, "account": "thebeardflex", "balance": "15167.60", ... },
    { "_id": 208474, "account": "a1-shroom-spores", "balance": "10497.16", ... },
    { "_id": 1301276, "account": "muterra-cards", "balance": "10080.00", ... },
    { "_id": 634678, "account": "wrestorgonline", "balance": "9880.34", ... }
  ]
}

Under the Hood

The change itself was pretty small. I added logic to the JsonRPCServer.js file to handle the new sort parameter, which accepts a field name and an order (asc or desc). Then, in Database.js, I added the collation option to the find query, enabling numericOrdering: true whenever the sort index is balance or stake.

Right now, this is running on my personal node, and I've submitted the pull request to the main Hive-Engine repository QA branch. I'm hoping others can make use of it as well.

As always,
Michael Garcia a.k.a. TheCrazyGM

0.39455117 BEE
3 comments

Excellent! This will make the data way more coherent, readable, and useful! I look forward to seeing when it goes live. πŸ˜πŸ™πŸ’šβœ¨πŸ€™

0.00020528 BEE

Thank you for letting us know.

0.00019568 BEE
(edited)

I ran into this same problem with Hive Engine Tools and now Hive Analytics.
The easy solution is fetch by ID then use last id rather than offset pagination.

Right now Hive Analytics just uses offset, but when I refactor it, it will pull entire list then use server side sorting from cache.

-0.21880433 BEE

Thanks, I'll probably implement that until if/when that pr goes through.

0.00000000 BEE

Right now Hive Analytics just does offset and gets the top 500 by balance, but if you sort by staked, it will only sort among those 500 user which are not the top 500 staked. This will be resolved later as the main point was for balance rich list.

-0.22553143 BEE

Thanks for mentioning this method. It sped up the smaller tokens tremendously and moderately improved the mid-sized queries, the large token counts still take a long time, but it is substantially better.

0.00337157 BEE