I thought this would be easy.
I’m upgrading my Hive Engine node from "Lite" to "Full." I have the hardware to make this trivial: a Dual Xeon Gold server boasting 80 logical cores and 64GB of RAM, backed by fast SSD storage.
I had a 250GB .archive snapshot ready to go. With that much horsepower, I figured I’d run a standard parallel mongorestore command, grab lunch, and be done by the afternoon.
Instead, I spent the last 24 hours fighting hidden bottlenecks, single-threaded legacy code, and MongoDB’s internal panic triggers.
Here is the autopsy of a restore gone wrong, the "Triple Pincer" method that finally broke the logjam, and the custom tracking script I wrote to keep my sanity.
I started with what I thought was an aggressive command. I told mongorestore to use 16 parallel streams and handle decompression on the fly.
# The naive approach
mongorestore \
  -j=16 \
  --numInsertionWorkersPerCollection=16 \
  --bypassDocumentValidation \
  --drop \
  --gzip \
  --archive=hsc_snapshot.archive
I fired it up and watched the logs. It was moving... but barely.
The Diagnosis: A look at htop revealed the problem immediately. One single CPU core was pegged at 100%, while the other 79 were asleep.
The built-in --gzip flag in MongoDB tools is single-threaded. I had a Ferrari engine, and I was feeding it fuel through a coffee stirrer. It was crunching about 2GB per hour. At that rate, I'd be done next Tuesday.
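If you want to confirm that kind of single-threaded choke point on your own box, per-thread CPU stats make it obvious. A quick check, assuming the sysstat package (for pidstat) is installed:
# One thread pegged near 100% while the rest idle = single-threaded decompression
pidstat -t -p "$(pgrep -x mongorestore)" 2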
The pigz Pipe (CPU Unleashed): If the built-in tool is the bottleneck, bypass it. I aborted the restore and switched to pigz (Parallel Implementation of GZip), which uses every available core to decompress the stream and pipes the decompressed archive straight into mongorestore's stdin.
# The "Nuclear" Option
pigz -dc hsc_snapshot.archive | mongorestore \
  --archive \
  -j=16 \
  --numInsertionWorkersPerCollection=10 \
  --bypassDocumentValidation \
  --drop
CPU usage skyrocketed across all 80 cores. The intake pipe was finally wide open. Data started flying into the database.
Until it didn't.
After about 20 minutes of high speed, the restore started stuttering. It would run fast for 10 seconds, then completely stall for 30 seconds. It was faster than Attempt 1, but painfully inconsistent.
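If you want to rule out the decompression side entirely, you can run the same pipe with no database attached and watch the raw throughput. A quick sanity check, assuming pv is installed:
# Raw pigz decompression speed, with MongoDB out of the picture
pigz -dc hsc_snapshot.archive | pv > /dev/null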
Why was my powerful server stuttering? It wasn't CPU anymore. I ran mongostat 1 to look under the hood of the database engine.
The "smoking gun" was in the dirty column. It was flatlining at 20%.
Here is what that means: MongoDB’s storage engine, WiredTiger, keeps freshly written data in RAM ("dirty" cache) before flushing it to disk, and it has safety triggers: by default, background eviction starts working once dirty data passes roughly 5% of the cache, and when it hits the 20% eviction trigger, WiredTiger starts stalling the application threads themselves, drafting them into eviction work instead of inserting.
My 80 cores were decompressing data faster than the SSD could swallow it. WiredTiger was throttling my CPU to protect the disk.
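If you'd rather watch the raw number than mongostat's column, the same figure is buried in serverStatus. A minimal sketch, assuming mongosh is installed on the box and the node doesn't require auth:
# Poll WiredTiger's dirty-cache percentage once a second
while sleep 1; do
  mongosh --quiet --eval '
    const c = db.serverStatus().wiredTiger.cache;
    const pct = 100 * c["tracked dirty bytes in the cache"] / c["maximum bytes configured"];
    print(new Date().toISOString() + "  dirty: " + pct.toFixed(1) + "%");
  '
done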
I tried to tune this live using db.adminCommand, raising the panic trigger to 30%, but it didn't help much. I was stuck.
db.adminCommand({
  "setParameter": 1,
  "wiredTigerEngineRuntimeConfig": "eviction_dirty_target=20,eviction_dirty_trigger=30"
})
If I couldn't tune the engine to accept one massive stream, I decided to overwhelm it with three smaller ones.
The Hive Engine database is dominated by two massive collections: hsc.chain and hsc.transactions. When restoring linearly, you hit lock contention as dozens of threads fight over the same collection lock while simultaneously fighting eviction threads.
I aborted everything and launched three simultaneous restore processes in separate terminals.

Terminal 1 (The Chain):
pigz -dc hsc_snapshot.archive | mongorestore --archive --nsInclude="hsc.chain" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation
Terminal 2 (The Transactions):
pigz -dc hsc_snapshot.archive | mongorestore --archive --nsInclude="hsc.transactions" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation
Terminal 3 (Everything Else):
pigz -dc hsc_snapshot.archive | mongorestore --archive --nsExclude="hsc.chain" --nsExclude="hsc.transactions" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation
Why this works:
Yes, this reads the 250GB archive from disk three times simultaneously. But sequential reads are cheap; the SSD had plenty of read headroom for this workload, so the read side was never the bottleneck.
By splitting the job, I broke the collection-level locks. The process restoring chain doesn't care if the transactions process is paused for cache eviction. It smoothed out the I/O pattern.
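For what it's worth, the same three-way split can be driven from a single script with background jobs instead of three terminals. A minimal sketch, using the same flags as above:
#!/bin/bash
# Triple pincer from one script: launch all three restores, then wait for all of them
ARCHIVE=hsc_snapshot.archive
pigz -dc "$ARCHIVE" | mongorestore --archive --nsInclude="hsc.chain" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation &
pigz -dc "$ARCHIVE" | mongorestore --archive --nsInclude="hsc.transactions" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation &
pigz -dc "$ARCHIVE" | mongorestore --archive --nsExclude="hsc.chain" --nsExclude="hsc.transactions" --drop --numInsertionWorkersPerCollection=10 --bypassDocumentValidation &
wait  # returns once all three background restores have finished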
The Proof:
Looking at my mongostat now (top right pane in the screenshot), the insert rate is holding steady in the thousands, and the dirty column is hovering around 15%: close to the limit, but safely under the 20% panic trigger, so the writers keep moving.
There was one final problem. Because I was piping data through pigz, mongorestore had no idea how big the file was (not that it shows a progress bar even when it does know), so I had zero progress indication. Restoring Hive Engine nodes is a slog, and there is nothing to tell you where you are...
But everything is a file. The kernel knows exactly where in the archive pigz is currently reading, and the lsof tool can show you that offset; stat gives you the total size from the filesystem. With those two numbers, it's just a little bit of math.
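You can do that spot-check by hand before scripting it; something like this, assuming pigz is the process holding the archive open:
# Current read offset inside the archive (the OFFSET column), plus the total size in bytes
lsof -o -p "$(pgrep -x pigz | head -n 1)" 2>/dev/null | grep '\.archive'
stat -c%s hsc_snapshot.archive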

So, I wrote track_restore.sh. The script auto-detects the pigz process (falling back to mongorestore), finds the open file descriptor using lsof, reads the byte offset from the kernel, and calculates real-time progress. It works with the plain mongorestore method as well, and will probably be helpful to other Hive Engine node operators (even light nodes).
You can see it running, keeping me sane while the gigabytes churn.

#!/bin/bash
# Configuration
INTERVAL=5

# AUTO-DETECT: check for pigz first (fast mode), then mongorestore (slow mode)
PID=$(pgrep -x "pigz" | head -n 1)
PROC_NAME="pigz"
if [ -z "$PID" ]; then
    PID=$(pgrep -x "mongorestore" | head -n 1)
    PROC_NAME="mongorestore"
fi
if [ -z "$PID" ]; then
    echo "Error: Neither pigz nor mongorestore process found."
    exit 1
fi

echo "--- Restore Progress Tracker (V3) ---"
echo "Monitoring Process: $PROC_NAME (PID: $PID)"

# Find the open .archive file and its total size
ARCHIVE_PATH=$(lsof -p "$PID" -F n | grep ".archive$" | head -n 1 | cut -c 2-)
if [ -z "$ARCHIVE_PATH" ]; then
    echo "Could not auto-detect .archive file. Is the restore running?"
    exit 1
else
    TOTAL_SIZE=$(stat -c%s "$ARCHIVE_PATH")
    echo "Tracking File: $ARCHIVE_PATH"
fi
TOTAL_GB=$(echo "scale=2; $TOTAL_SIZE / 1024 / 1024 / 1024" | bc)
echo "Total Archive Size: $TOTAL_GB GB"
echo "----------------------------------------"

while true; do
    # 1. Get the current read offset (2>/dev/null suppresses "lsof: WARNING" noise)
    RAW_OFFSET=$(lsof -o -p "$PID" 2>/dev/null | grep ".archive" | awk '{print $7}')

    # 2. Safety check: if empty, the process has closed the file (finished or aborted)
    if [ -z "$RAW_OFFSET" ]; then
        echo -e "\n\nRestore finished! (Process closed file)"
        break
    fi

    # 3. Clean the offset: lsof reports "0t<decimal>" or "0x<hex>" depending on the tool
    if [[ "$RAW_OFFSET" == 0x* ]]; then
        # Hex offset (mongorestore style): let bash arithmetic convert it
        CURRENT_BYTES=$((RAW_OFFSET))
    else
        # Decimal offset ("0t" prefix, pigz style) or a raw number: strip the 0t
        CURRENT_BYTES=$(echo "$RAW_OFFSET" | sed 's/^0t//')
    fi

    # 4. Math safety check
    if [ -z "$CURRENT_BYTES" ]; then
        sleep "$INTERVAL"
        continue
    fi

    # 5. Calculate progress
    PERCENT=$(echo "scale=4; ($CURRENT_BYTES / $TOTAL_SIZE) * 100" | bc)
    CURRENT_GB=$(echo "scale=2; $CURRENT_BYTES / 1024 / 1024 / 1024" | bc)

    # 6. Draw the bar
    BAR_WIDTH=50
    # Use 0 if PERCENT is empty or non-numeric to avoid a crash
    INT_PERCENT=$(echo "${PERCENT:-0}" | cut -d'.' -f1)
    if ! [[ "$INT_PERCENT" =~ ^[0-9]+$ ]]; then INT_PERCENT=0; fi
    FILLED=$((INT_PERCENT * BAR_WIDTH / 100))
    EMPTY=$((BAR_WIDTH - FILLED))
    BAR=""
    SPACE=""
    [ "$FILLED" -gt 0 ] && BAR=$(printf "%0.s#" $(seq 1 "$FILLED"))
    [ "$EMPTY" -gt 0 ] && SPACE=$(printf "%0.s-" $(seq 1 "$EMPTY"))
    printf "\rProgress: [%s%s] %s%% (%s GB / %s GB)" "$BAR" "$SPACE" "$PERCENT" "$CURRENT_GB" "$TOTAL_GB"

    sleep "$INTERVAL"
done
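To use it, drop the script anywhere, make it executable, and run it in a spare terminal while the restore churns; it exits on its own once the restore closes the archive:
chmod +x track_restore.sh
./track_restore.sh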
When you throw enterprise-grade hardware at standard-grade tools, things break in weird ways.
Don't trust defaults. Monitor your bottlenecks. And if the database engine tries to throttle you, sometimes the only answer is to hit it from three directions at once.
The node should finally be synced by mid-day.
As always,
Michael Garcia a.k.a. TheCrazyGM
Amazing! Mongorestore has been the bane of my existence! On NVMEs a full restore was under 20 hours with some tweaks, but your approach should make that even faster!
My bottleneck is still the SSD. I probably should invest in an NVMe at some point. But I'm poor folk. 😅
Aren't consumer NVME and SSDs almost identical in price with SSDs being slightly cheaper?
Now I know I need to meet you in real life. Enough proof from your posts. 😉🙃
I am enjoying following you along as you are going through these processes from the beginning, so cool that before even sync'ing you have an incredibly useful new gist to add to our project builder!
👀 What FS for the mongoDB please :D
I'm using BTRFS with CoW turned off for the mongod dir, but I wanted the ease of snapshots for backup purposes, so I never have to do this restore ever again...
Interesting... haven't done that for a while. Have you tried in-memory with ZFS delayed writes?
You're a masterful coding wizard, my friend, and I very much appreciate reading your adventures. Oh, and no, I never trust defaults. 😁🙏💚✨🤙