Six months after my previous post on this topic, I finally found some time during the holidays to give it another try. On the last attempt I discovered that the streamer fetches blocks synchronously and runs very slowly. I wasted some time trying to improve it and even developed a simple caching proxy in Python to pre-fetch the blocks needed by the streamer. This did speed things up, but then I noticed that I was using the wrong Git repository for the Hive Engine source code. The one in the FAQ is wrong and abandoned; the working repo is https://github.com/hive-engine/steemsmartcontracts.git. It even has an open issue describing how to set up block pre-fetching by modifying a single constant in the source. So I took advantage of this wisdom, and now my node is catching up at 5-10 blocks per second. It will probably take more than a month to complete, but at least there is some hope.
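For anyone curious what the pre-fetching idea looks like, here is a minimal sketch in Python. It is not the proxy I actually wrote and not how the Hive Engine node does it internally; it only illustrates the technique of requesting the next few blocks in the background while the current one is being processed. The class name, the `lookahead` parameter and the `fetch_block` callable are all made up for this example, and `fake_fetch` stands in for a real call to a Hive API node.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class PrefetchingBlockCache:
    """Serve blocks by number while fetching the next few in the background.

    Hypothetical helper for illustration; intended for a single consumer
    thread that asks for blocks in increasing order.
    """

    def __init__(self, fetch_block: Callable[[int], dict],
                 lookahead: int = 10, workers: int = 4):
        self._fetch_block = fetch_block  # assumed: returns one block as a dict
        self._lookahead = lookahead
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending: Dict[int, Future] = {}

    def _schedule(self, number: int) -> Future:
        # Start a background fetch only if this block is not already queued.
        if number not in self._pending:
            self._pending[number] = self._pool.submit(self._fetch_block, number)
        return self._pending[number]

    def get(self, number: int) -> dict:
        # Queue the requested block plus the next `lookahead` blocks.
        for n in range(number, number + self._lookahead + 1):
            self._schedule(n)
        # Forget blocks the consumer has already moved past.
        for old in [n for n in self._pending if n < number]:
            del self._pending[old]
        # Block only until the requested block has arrived.
        return self._pending[number].result()

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def fake_fetch(number: int) -> dict:
    """Stand-in for a network request to an API node (assumption)."""
    time.sleep(0.01)
    return {"block_num": number}


if __name__ == "__main__":
    cache = PrefetchingBlockCache(fake_fetch, lookahead=5)
    for n in range(1, 21):
        block = cache.get(n)
        print(block["block_num"])
    cache.close()
```

With a synchronous streamer every block costs a full network round trip. With the look-ahead, most calls to `get` return a block that has already arrived. The sketch does not retry failed fetches, so a real version would need error handling around `result()`.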
Again, I'm running everything in Docker containers, and I'm not sure how this affects performance. I'm a bit worried about the Mongo container, as it is running on an OrangePi board with 1 GB of RAM and a USB stick for storage. Currently it's working fine, but I will certainly have to migrate it to more capable hardware in the future.