Hey everyone,
I've been working on improving my Market-Viewr tool, specifically trying to find a more natural way to get a token's richlist. Right now, the process is pretty inefficient.
The core issue is twofold. First, the token balances are returned sorted alphabetically by account name, not by the balance amount. Second, the API has a hard limit of 10,000 results per query. For tokens with a huge number of holders, like SWAP.HIVE or PIZZA, this means you can't simply paginate through the entire list to find the top holders; you're cut off after the first 10k accounts. This is why I had to create a "hacky" workaround that makes at least 26 separate API calls, at least one for each letter of the alphabet (sometimes more for letters like 'S'), just to try and piece the full picture together.
I thought to myself, "There has to be a better way." So, I started digging into the Hive-Engine node codebase to see if I could just add a sort
parameter to the API call. The idea was simple, but I quickly ran into the core problem: the token balances are stored in the database as text strings, not numbers. This is done to preserve the decimal precision so they can be used with big math libraries, but it also means a standard database sort won't work correctly (for example, "100" would come before "2").
After reading through the documentation for MongoDB, which is what Hive-Engine uses, I found the perfect solution: numeric collation.
Bingo!
This feature tells the database to treat a specific text field as a number for the purpose of a single query, without needing to change the data or add new indexes.
A quick patch later, and I can now make a single API call to get a perfectly sorted richlist. Even if it adds a tiny bit of overhead for Mongo to perform the query, it has to be better than making 26+ API calls.
Hereβs an example of the new API call using the sort
parameter:
And here is the result: The top 10 PIZZA holders, sorted correctly by their balance, all from a single query (edited for display, ...
is where stake, pendingUnstake, delegationsIn, delegationsOut, pendingUndelegations
would be).
{
"jsonrpc": "2.0",
"id": 1,
"result": [
{ "_id": 241391, "account": "null", "balance": "1232816.82", ... },
{ "_id": 1739899, "account": "moon.deposit", "balance": "116074.02", ... },
{ "_id": 525742, "account": "tipcc", "balance": "50073.47", ... },
{ "_id": 208477, "account": "ecoinstant", "balance": "18328.40", ... },
{ "_id": 186743, "account": "huzzah", "balance": "17589.11", ... },
{ "_id": 1775991, "account": "moon.deposit2", "balance": "16042.86", ... },
{ "_id": 186723, "account": "thebeardflex", "balance": "15167.60", ... },
{ "_id": 208474, "account": "a1-shroom-spores", "balance": "10497.16", ... },
{ "_id": 1301276, "account": "muterra-cards", "balance": "10080.00", ... },
{ "_id": 634678, "account": "wrestorgonline", "balance": "9880.34", ... }
]
}
The change itself was pretty small. I added logic to the JsonRPCServer.js
file to handle the new sort
parameter, which accepts a field name and an order (asc
or desc
). Then, in Database.js
, I added the collation
option to the find
query, enabling numericOrdering: true
whenever the sort index is balance
or stake
.
Right now, this is running on my personal node, and I've submitted the pull request to the main Hive-Engine repository QA branch. I'm hoping others can make use of it as well.
As always,
Michael Garcia a.k.a. TheCrazyGM
Excellent! This will make the data way more coherent, readable, and useful! I look forward to seeing when it goes live. πππβ¨π€
Thank you for letting us know.
I ran into this same problem with Hive Engine Tools and now Hive Analytics.
The easy solution is fetch by ID then use last id rather than offset pagination.
Right now Hive Analytics just uses offset, but when I refactor it, it will pull entire list then use server side sorting from cache.
Thanks, I'll probably implement that until if/when that pr goes through.
Right now Hive Analytics just does offset and gets the top 500 by balance, but if you sort by staked, it will only sort among those 500 user which are not the top 500 staked. This will be resolved later as the main point was for balance rich list.
View more
Thanks for mentioning this method. It sped up the smaller tokens tremendously and moderately improved the mid-sized queries, the large token counts still take a long time, but it is substantially better.