HiveTools Post Curation Tool Now Filters By Language

Hey everyone,

This is one of those updates where the feature itself is useful, but the story behind it is slightly ridiculous.

The Post Curation Tool in HiveTools now has a language filter.

That means you can take the existing curation filters (author, tags, title, community, source app, word count, and read time) and now also narrow the results by detected post language.

Want to find Korean posts from the last week?

Pick Korean from the dropdown.

[Image: lang1.png]

Want to look for Spanish photography posts, Japanese art posts, or English long-form posts from a specific community?

Same tool. Same search flow. One more filter.

The Funny Part

The funny part is that the backend data for this has been there for months.

The block indexing script that feeds the database has been detecting language since last Halloween.

Last Halloween.

As in, pumpkins, candy, costumes, the whole thing.

And here we are heading into summer before I finally remembered to expose it in the front end.

That is not an architecture problem.

That is just me being a buffoon.

What Changed

The curation sidebar now includes a Language dropdown.

Right now it supports the languages I wanted available directly in the UI:

  • English
  • Spanish
  • French
  • German
  • Portuguese
  • Italian
  • Korean
  • Japanese
  • Chinese

When you choose one, the Post Curation Tool adds that language to the database query and only returns matching posts from the recent indexed set.
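As a rough illustration of adding a language to the query, here is a minimal sketch using SQLite. The post does not say which database or schema HiveTools uses, so the `posts` table, its columns, the `find_posts` helper, and the sample rows are all made up for the example.

```python
import sqlite3


# Hypothetical schema: the real HiveTools table and column names are not
# given in the post, so `posts`, `lang`, and `created` are assumptions.
def find_posts(conn, lang=None, since=None, limit=50):
    """Return recent posts, optionally narrowed to one detected language."""
    clauses, params = [], []
    if lang:  # dropdown left on "any" means no language clause at all
        clauses.append("lang = ?")
        params.append(lang.lower())
    if since:
        clauses.append("created >= ?")
        params.append(since)
    where = f"WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        f"SELECT author, permlink, lang FROM posts {where} "
        "ORDER BY created DESC LIMIT ?"
    )
    # Values always travel as bound parameters, never as string-formatted SQL.
    return conn.execute(sql, params + [limit]).fetchall()


# Made-up sample data, only here so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author TEXT, permlink TEXT, lang TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        ("alice", "hello-hive", "en", "2025-05-01"),
        ("bob", "annyeong", "ko", "2025-05-02"),
        ("carol", "hola", "es", "2025-05-03"),
    ],
)
korean = find_posts(conn, lang="ko")
```

Because the language is just one more optional clause, it stacks with whatever other filters are already in play instead of needing its own search path.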

The results also show the detected language code on the post cards, so you can see what the system matched.

For example, filtering by Korean gives me a clean list of posts marked KO, instead of making me brute-force it by tag, community, or guesswork.

[Image: lang2.png]

That is the whole point of this tool: less guessing, more useful discovery.

How The Language Data Gets There

HiveTools is not asking Hive posts to self-identify their language.

The database is fed by my block-processing script, which runs on a schedule and stores useful post metadata locally.

As it processes top-level posts, it pulls the body text, strips out the obvious noise (Markdown, HTML, links, code blocks, mentions, hashtags, and similar clutter), then runs language detection on the remaining plain text.
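The cleanup step could look something like the sketch below. This is not the actual HiveTools code: the `clean_body` name and the specific regular expressions are assumptions, chosen only to show the idea of reducing a post body to plain prose before handing it to a detector.

```python
import re

# Order matters: code goes first so its contents are never treated as prose.
# All patterns are illustrative guesses, not the real script's rules.
_PATTERNS = [
    (re.compile(r"```.*?```", re.DOTALL), " "),     # fenced code blocks
    (re.compile(r"`[^`\n]*`"), " "),                # inline code
    (re.compile(r"!\[[^\]]*\]\([^)]*\)"), " "),     # Markdown images
    (re.compile(r"\[([^\]]*)\]\([^)]*\)"), r"\1"),  # Markdown links, keep the label
    (re.compile(r"<[^>]+>"), " "),                  # HTML tags
    (re.compile(r"https?://\S+"), " "),             # bare URLs
    (re.compile(r"(?<!\w)[@#][\w.-]+"), " "),       # @mentions and #hashtags
    (re.compile(r"[*_~>#|]+"), ""),                 # leftover Markdown markers
]


def clean_body(body: str) -> str:
    """Strip markup and clutter, then collapse whitespace."""
    text = body
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "# Title\n\n"
    "Hello **world**, see [my blog](https://example.com) and @alice #photography\n\n"
    "```python\nprint('x')\n```\n"
    "<b>Bye</b> https://hive.blog/x"
)
cleaned = clean_body(sample)
```

Link labels are kept because they are usually written in the post's language, while URLs, handles, and tags are dropped because they tend to look like English regardless of what the author wrote in.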

For that detection step I am using lingua-language-detector.

The script stores:

  • the detected language code
  • the confidence value when available
  • the top confidence candidates when available

The curation tool itself mostly cares about the language code. If the detector says a post is Korean, that post gets stored with lang: "ko", and the front end can now query against it.

Short posts and posts without enough usable text can still end up as unknown or undetected, which is fine. I would rather avoid pretending a tiny bit of text is more certain than it is.
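To make the stored fields and the "unknown" case concrete, here is a hedged sketch of turning detector output into a record. Only the `lang` field name comes from the post. The other field names, the `build_lang_record` helper, and both cutoffs are assumptions, and the confidence scores are passed in as a plain dict so the example does not depend on any particular detector API.

```python
MIN_CHARS = 40        # assumed cutoff; the post does not state one
MIN_CONFIDENCE = 0.5  # assumed cutoff; the post does not state one


def build_lang_record(text: str, confidences: dict, top_n: int = 3) -> dict:
    """Turn per-language confidence scores into the fields to store."""
    # Highest-scoring candidates first, trimmed to the top few.
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    record = {"lang": None, "lang_confidence": None, "lang_candidates": ranked}
    if len(text) < MIN_CHARS or not ranked:
        return record  # too little usable text: leave the post undetected
    code, score = ranked[0]
    if score >= MIN_CONFIDENCE:
        record["lang"] = code
        record["lang_confidence"] = score
    return record


long_text = "x" * 100  # stand-in for a cleaned post body
mixed = build_lang_record(long_text, {"en": 0.3, "es": 0.6})
tiny = build_lang_record("gm", {"en": 0.9})
```

In this sketch a mixed English and Spanish post is stored as `es`, with both candidates kept alongside it, while a two-character post stays undetected even though the detector was nominally confident.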

Why This Matters For Curation

Hive is not an English-only chain.

That is obvious if you spend any real time looking around, but a lot of discovery tools still make non-English curation more awkward than it needs to be.

Tags help, but tags are inconsistent.

Communities help, but communities are not always language-specific.

Apps help, but apps do not tell you what language the author wrote in.

Language filtering gives curators another clean dimension to search by.

You can combine it with the rest of the tool:

  • Korean posts in the last week
  • Spanish posts with photography tags
  • Portuguese posts over 500 words
  • Japanese posts in a specific community
  • English posts excluding a noisy author or tag

That makes the Post Curation Tool better at what I actually built it for: finding posts worth looking at without having to manually dig through the whole firehose.

The Short Version

HiveTools now has language filtering in the Post Curation Tool.

The backend has apparently had the data since Halloween.

I forgot to add the front-end control until now.

So the good news is: the feature works.

The bad news is: I had the ingredients sitting in the fridge for months and only just remembered to cook dinner.

EDIT: I have a really, really bad habit of forgetting to share the tool: https://tools.crypto-dreamr.com/pct

As always,
Michael Garcia a.k.a. TheCrazyGM

6 comments

That's a very useful (finally-implemented) filtering mechanism for a curation tool indeed. I know it's probably on your list, but there are a fair number of languages that are quite commonly used on Hive that are not yet in your current list, like Polish, Dutch, Malay/Indonesian, Tagalog/Filipino, Arabic, and even Turkish. 😁🙏💚✨🤙

Consider it done!

But we can definitely say that your cooking is worth waiting for!!

The language filter is one of those features that sounds small until you actually need it. Korean curation without it means scrolling past everything else and guessing — not ideal.

Been watching HiveTools evolve. The backend data existing since Halloween and the frontend being the bottleneck is the kind of dev gap that's easy to laugh at but every solo builder recognizes.

One question: does the language detection handle multilingual posts gracefully, or does it pick the dominant language and tag accordingly?

It picks the dominant language. The confidence scores for each candidate are saved as well (e.g. en: 0.3, es: 0.6), which might be useful for multi-language posts.

Language filtering is one of those features that sounds simple but is genuinely hard to implement well — especially on Hive where post metadata isn't always reliable. If the detection is accurate, this solves a real pain point for curators who want to support content they can actually evaluate. Does it use NLP detection or metadata tags?

Nice addition 😍

I was wondering whether any of these tools could be embedded in a post and, if they could, how updating them with any changes would work?

I'm interested in the Post Search (backlinks) tool which I'd like to include in a custom search page with a tag table/cloud.
