Multi-Rig Masterclass: Orchestrating a Recording Fleet
Table of Contents
- 1. Fleet Architecture: The Hub & Spoke Model
- Why not just build one giant machine?
- 2. Shared Storage & 10GbE Networking
- The Storage Flow Diagram
- Recommended Setup:
- 3. Worker Node Configuration
- config.yaml Node Mapping
- Syncing Hooks across the Fleet
- 4. Remote Monitoring & Management
- Monitoring Checklist:
- 5. Common Fleet Failures (and Mitigation)
- Summary: The 3-Rig Test Case
A single powerhouse rig is great, but eventually you’ll hit a hardware wall. Whether that wall is VRAM limits on your GPU, PCIe lane starvation, or thermal throttling, professional archivists ultimately move to a Multi-Rig Fleet.
This masterclass outlines the architecture, networking, and configuration required to manage a 3+ rig recording cluster.
1. Fleet Architecture: The Hub & Spoke Model
For a reliable fleet, we recommend the Hub & Spoke model. This separates the “Brain” from the “Muscle”:
- The Monitor Node (The Hub): A low-power machine (even a NUC or Laptop) running the Multi-Rig Monitor. It polls the workers and displays the unified HUD.
- The Worker Nodes (The Spokes): High-performance machines with dedicated GPUs that handle the actual FFmpeg ingest and disk writes.
- The Storage Node (The Sink): A dedicated NAS or ZFS server where all recordings are saved via 10GbE.
Why not just build one giant machine?
Redundancy. If one rig in a 3-rig fleet fails, you only lose 33% of your capture capacity. If your “mega-rig” motherboard dies, you are 100% dark.
2. Shared Storage & 10GbE Networking
In a multi-rig setup, Network I/O is the primary bottleneck: every worker funnels its writes into the Storage Node’s uplink. 1GbE networking is insufficient once you are pushing 10+ concurrent 4K streams to a NAS.
The Storage Flow Diagram
[Worker 1] --(10GbE)--> [Core Switch] <--(10GbE)-- [Storage Node (ZFS)]
[Worker 2] --(10GbE)--> [Core Switch]
Recommended Setup:
- Networking: Intel X540-T2 or ConnectX-3 10GbE NICs in every worker.
- Protocol: NFS (Linux) or SMB Direct with RDMA (Windows).
- ZFS Configuration: Use a 1MB recordsize on your archive pool to match CaptureGem’s large sequential write pattern.
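A minimal storage-side sketch under those recommendations, assuming a ZFS pool named tank, Linux workers, and a placeholder 10.0.0.0/24 worker subnet; adjust names and addresses to your environment:

# On the Storage Node: create the archive dataset with a 1M recordsize
zfs create -o recordsize=1M tank/archive
# Export it over NFS to the worker subnet (placeholder address range)
zfs set sharenfs="rw=@10.0.0.0/24,no_root_squash" tank/archive

# On each Worker: mount the share at the path referenced in config.yaml
mkdir -p /mnt/archive
mount -t nfs -o rsize=1048576,wsize=1048576,hard,tcp storage-node:/tank/archive /mnt/archive

The hard mount option makes workers retry rather than return I/O errors if the Storage Node briefly drops off the network mid-recording.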
3. Worker Node Configuration
Each worker needs a unique identity but shared configuration logic.
config.yaml Node Mapping
# Worker Node 01
node_id: "rig-alpha-01"
api:
  port: 8080
  allow_remote: true
storage:
  path: "/mnt/archive/node-01/"        # Map to NAS share
  temp_path: "/local/nvme/staging/"    # Record locally, move on finish
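The temp_path pattern keeps the hot write on local NVMe and only touches the network once per finished file. Below is a minimal on-finish hook sketch for that move; it assumes the hook is invoked with the finished file’s path as its first argument, which is an illustrative assumption rather than a documented interface:

#!/usr/bin/env bash
# Hypothetical on-finish hook: move a completed recording from NVMe staging to the NAS share.
# Assumption: the hook receives the finished file's path as $1.
set -euo pipefail

SRC="$1"
DEST="/mnt/archive/node-01/"   # Must match storage.path in config.yaml

# rsync with --remove-source-files frees the staging SSD once the copy completes
rsync -a --remove-source-files "$SRC" "$DEST"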
Syncing Hooks across the Fleet
Don’t manually copy your scripts. Use a simple rsync or Git-based approach to ensure all workers are running the latest Automation Library scripts.
# Example: Syncing hooks from your Monitor Node
rsync -avz /home/user/master_hooks/ worker-01:/home/user/capturegem/hooks/
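To push the same hook set to every worker in one pass, wrap that command in a loop; the hostnames below are placeholders:

# Push the master hooks to every worker (hostnames are placeholders)
for host in worker-01 worker-02 worker-03; do
  rsync -avz --delete /home/user/master_hooks/ "${host}:/home/user/capturegem/hooks/"
done

The --delete flag removes stale scripts on the workers, so a renamed hook can’t keep running as an orphaned old copy.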
4. Remote Monitoring & Management
Enable the Monitor API on every node. Then, add them to your Unified HUD.
Monitoring Checklist:
- Heartbeat: Does every node report “LIVE”?
- IO Wait: Use iostat to ensure the NAS isn’t stalling (see the sketch after this list).
- Entropy: Ensure your Proxy rotation isn’t using the same IP across multiple workers (this triggers site-level bans).
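The first two checks are easy to script from the Monitor Node. The sketch below assumes each worker exposes its Monitor API on port 8080 and that a /api/status endpoint returns a LIVE/OFFLINE string; the endpoint path and hostnames are placeholders, so substitute whatever your Monitor API actually serves:

#!/usr/bin/env bash
# Fleet health sweep from the Monitor Node (endpoint path and hostnames are placeholders).
WORKERS="worker-01 worker-02 worker-03"

for host in $WORKERS; do
  # Heartbeat: query the worker's Monitor API
  status=$(curl -s --max-time 5 "http://${host}:8080/api/status" || echo "UNREACHABLE")
  # IO Wait: sample %iowait; sustained values above ~20% suggest the NAS is stalling
  iowait=$(ssh "$host" "iostat -c 1 2 | awk '/avg-cpu/ {getline; v=\$4} END {print v}'")
  echo "${host}: heartbeat=${status} iowait=${iowait}%"
done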
5. Common Fleet Failures (and Mitigation)
| Failure | Impact | Mitigation |
|---|---|---|
| Split-Brain | Two nodes trying to record the same model. | Centralize your watchlist management using the CaptureGem API. |
| Storage Lock | NAS pool hits 100% utilization. | Implement Automated Library Pruning on the Storage Node. |
| PCIe Lane Saturation | Worker drops frames despite low CPU. | Ensure NIC and GPU are on separate CPU-direct PCIe lanes. |
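For the Storage Lock row, the cleanest fix is CaptureGem’s Automated Library Pruning, but a standalone safety net on the Storage Node is easy to cron. This is a hedged sketch: the mount path, file extension, and 90% threshold are placeholders, and it simply deletes the oldest recordings until utilization falls back under the line:

#!/usr/bin/env bash
# Emergency prune on the Storage Node (path, extension, and threshold are placeholders).
set -eu

POOL_MOUNT="/tank/archive"
THRESHOLD=90   # begin pruning above 90% utilization

usage_pct() { df --output=pcent "$POOL_MOUNT" | tail -n 1 | tr -dc '0-9'; }

while [ "$(usage_pct)" -gt "$THRESHOLD" ]; do
  # Find the single oldest recording, delete it, then re-check utilization
  oldest=$(find "$POOL_MOUNT" -type f -name '*.mp4' -printf '%T@ %p\n' | sort -n | head -n 1 | cut -d' ' -f2-)
  [ -z "$oldest" ] && break   # nothing left to prune
  echo "Pruning: $oldest"
  rm -f -- "$oldest"
done

Run it from cron every few minutes so the pool is checked before it locks up the whole fleet’s writes.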
Summary: The 3-Rig Test Case
- Rig A (Ingest): 1080p Standard resolution captures (40+ streams).
- Rig B (High-Fidelity): 4K/8K High-bitrate captures (10+ streams).
- Rig C (Transcode): Handles post-processing and Cloud Archival.
By specializing your hardware, you create a workflow that is impossible to achieve on a single workstation.