The Hive-Engine Failover PR is Live After a Week of Successful Testing

Hey everyone,

In my previous post, I talked about why Hive-Engine nodes sometimes "limp along" instead of failing over cleanly when a primary RPC goes bad. I pitched two paths: a minimal operational fix (Option A) and a broader architecture redesign (Option B).

The consensus (and my own gut feeling for a first step) was clear: make it work first.

I’ve spent the last week running that "Option A" fix on a live node, and I’m happy to report that the PR to the QA branch is now open.

Pull Request #134

The PR is here:
https://github.com/hive-engine/hivesmartcontracts/pull/134

What’s in the PR?

I didn't just dump a theory into a pull request. I wanted a result that an operator could actually rely on. The final implementation includes:

  1. Request-level failover: Block reads now treat your streamNodes list as a proper failover chain. If a fetch from one node fails, the read moves on to the next node immediately instead of hanging.
  2. Scheduler-level demotion: If a node fails repeatedly, the scheduler "cools it down" and gives other nodes a shot for a short window. This was the key to making the rollover feel decisive.
  3. Shutdown & Reliability: I also pulled in fixes for graceful shutdown (signal propagation and increased timeouts) and an npm audit cleanup to keep the branch clean.
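
To make items 1 and 2 concrete, here is a minimal Python sketch of the general pattern: try each node in order, and put a node that keeps failing on a short cooldown. It is an illustration only, not the code from the PR. The class name FailoverChain, the max_failures and cooldown_seconds parameters, their default values, and the backup.example URL are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Optional


class FailoverChain:
    """Try nodes in order and temporarily demote nodes that keep failing."""

    def __init__(
        self,
        nodes: List[str],
        max_failures: int = 3,           # hypothetical threshold
        cooldown_seconds: float = 30.0,  # hypothetical cooldown window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.nodes = list(nodes)
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures: Dict[str, int] = {n: 0 for n in self.nodes}
        self.demoted_until: Dict[str, float] = {n: 0.0 for n in self.nodes}

    def candidates(self) -> List[str]:
        """Healthy nodes first, in configured order; cooling nodes last."""
        now = self.clock()
        healthy = [n for n in self.nodes if self.demoted_until[n] <= now]
        cooling = [n for n in self.nodes if self.demoted_until[n] > now]
        return healthy + cooling

    def fetch(self, fetch_block: Callable[[str, int], Any], block_num: int) -> Any:
        """Return the first successful result, moving on as soon as a node fails."""
        last_error: Optional[Exception] = None
        for node in self.candidates():
            try:
                result = fetch_block(node, block_num)
            except Exception as err:
                last_error = err
                self.failures[node] += 1
                if self.failures[node] >= self.max_failures:
                    # Scheduler-level demotion: skip this node for a while.
                    self.demoted_until[node] = self.clock() + self.cooldown_seconds
                    self.failures[node] = 0
                continue
            self.failures[node] = 0  # a success clears the failure streak
            return result
        raise RuntimeError(f"all nodes failed for block {block_num}") from last_error


def fake_fetch(node: str, block_num: int) -> dict:
    """Stand-in for a real RPC call; one node is treated as unreachable."""
    if node == "https://api.hive.blog":
        raise ConnectionError("simulated outage")
    return {"node": node, "block": block_num}


chain = FailoverChain(["https://api.hive.blog", "https://backup.example"])
for num in range(1, 6):
    print(num, chain.fetch(fake_fetch, num)["node"])
```

In this sketch a demoted node is moved to the back of the line rather than dropped, so it is still tried as a last resort if every other node is down. Whether the PR makes the same choice is not something this example claims.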

The Result: Real-World Stress Testing

This wasn't just a "looks good on my machine" test. I ran this on a production node for over a week and intentionally simulated failures:

  • Firewall blocking: I blocked api.hive.blog at the OS level while the node was running.
  • The outcome: The node stayed fully caught up. The logs showed the rollover happening in real time: the bad node was demoted, and the healthy alternates took over the load without a hitch.

Next Steps

This PR is a practical, short-term fix for the immediate "limping node" problem. It doesn't preclude a larger redesign later, but it stops the bleeding now.

If you run a node or care about the stability of the sidechain, I’d love for you to take a look at the code and the testing results.

Review the PR on GitHub: https://github.com/hive-engine/hivesmartcontracts/pull/134

As always,
Michael Garcia a.k.a. TheCrazyGM

2 comments

Excellent work once again! I think this will be truly helpful for our witness nodes :)
