
Multi-Rig Masterclass: Orchestrating a Recording Fleet



A single powerhouse rig is great, but eventually you’ll hit a hardware wall: VRAM limits on your GPU, PCIe lane starvation, or thermal throttling. That ceiling is why professional archivists eventually move to a Multi-Rig Fleet.

This masterclass outlines the architecture, networking, and configuration required to manage a 3+ rig recording cluster.

1. Fleet Architecture: The Hub & Spoke Model

For a reliable fleet, we recommend the Hub & Spoke model. This separates the “Brain” from the “Muscle”:

  1. The Monitor Node (The Hub): A low-power machine (even a NUC or Laptop) running the Multi-Rig Monitor. It polls the workers and displays the unified HUD.
  2. The Worker Nodes (The Spokes): High-performance machines with dedicated GPUs that handle the actual FFmpeg ingest and disk writes.
  3. The Storage Node (The Sink): A dedicated NAS or ZFS server where all recordings are saved via 10GbE.

Why not just build one giant machine?

Redundancy. If one rig in a 3-rig fleet fails, you only lose 33% of your capture capacity. If your “mega-rig” motherboard dies, you are 100% dark.

2. Shared Storage & 10GbE Networking

In a multi-rig setup, Network I/O is the primary bottleneck. 1GbE networking is insufficient for writing 10+ concurrent 4K streams to a NAS.

The Storage Flow Diagram

[Worker 1] --(10GbE)--> [Core Switch] <--(10GbE)-- [Storage Node (ZFS)]
[Worker 2] --(10GbE)--> [Core Switch]

  • Networking: Intel X540-T2 or ConnectX-3 10GbE NICs in every worker.
  • Protocol: NFS (Linux) or SMB Direct with RDMA (Windows).
  • ZFS Configuration: Use a 1MB recordsize on your archive pool to match CaptureGem’s large sequential write pattern.
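As a rough sketch of that storage-node setup, the commands below create and export an archive dataset. The pool name tank, the dataset name, and the 10.0.0.0/24 subnet are hypothetical; substitute your own layout.

```shell
# On the Storage Node (hypothetical pool/dataset names):
zfs create tank/archive
zfs set recordsize=1M tank/archive   # match large sequential capture writes
zfs set compression=lz4 tank/archive # cheap win; video barely compresses, but metadata does
zfs set atime=off tank/archive       # skip access-time writes on every read
zfs set sharenfs="rw=@10.0.0.0/24" tank/archive

# On each Linux worker (nconnect needs kernel 5.3+):
# mount -t nfs -o vers=4.2,nconnect=8 storage:/tank/archive /mnt/archive
```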

3. Worker Node Configuration

Each worker needs a unique identity but a shared configuration logic.

config.yaml Node Mapping

# Worker Node 01
node_id: "rig-alpha-01"
api:
  port: 8080
  allow_remote: true
storage:
  path: "/mnt/archive/node-01/" # Map to NAS share
  temp_path: "/local/nvme/staging/" # Record locally, move on finish
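One way to keep identities unique but logic shared is to stamp every worker’s config.yaml out of a single template. This is a minimal sketch following the rig-alpha naming and paths from the example above; adjust the loop to your fleet size.

```shell
# Generate config-rig-alpha-01.yaml .. 03.yaml from one shared template.
# Only node_id and the per-node archive path vary between workers.
for id in 01 02 03; do
  cat > "config-rig-alpha-${id}.yaml" <<EOF
node_id: "rig-alpha-${id}"
api:
  port: 8080
  allow_remote: true
storage:
  path: "/mnt/archive/node-${id}/"
  temp_path: "/local/nvme/staging/"
EOF
done
```

Regenerating from the template (instead of hand-editing three files) means a config change only has to be made once.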

Syncing Hooks across the Fleet

Don’t manually copy your scripts. Use a simple rsync or Git-based approach to ensure all workers are running the latest Automation Library scripts.

# Example: Syncing hooks from your Monitor Node
rsync -avz /home/user/master_hooks/ worker-01:/home/user/capturegem/hooks/

4. Remote Monitoring & Management

Enable the Monitor API on every node. Then, add them to your Unified HUD.

Monitoring Checklist:

  • Heartbeat: Does every node report “LIVE”?
  • IO Wait: Use iostat to ensure the NAS isn’t stalling.
  • Entropy: Ensure your Proxy rotation isn’t using the same IP across multiple workers (this triggers site-level bans).
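A heartbeat check like the first item can be scripted from the Hub. The /status path, port 8080 (taken from the worker config above), and the hostnames are assumptions about your fleet’s Monitor API; substitute your real endpoints.

```shell
# Poll each worker and report LIVE/DOWN; --max-time bounds a stalled node.
FLEET="worker-01 worker-02 worker-03"  # hypothetical hostnames
for host in $FLEET; do
  if curl -fsS --max-time 2 "http://$host:8080/status" >/dev/null 2>&1; then
    state="LIVE"
  else
    state="DOWN"
  fi
  echo "$host: $state"
done
```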

5. Common Fleet Failures (and Mitigation)

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Split-Brain | Two nodes trying to record the same model | Centralize your watchlist management using the CaptureGem API |
| Storage Lock | NAS pool hits 100% utilization | Implement Automated Library Pruning on the Storage Node |
| PCIe Lane Saturation | Worker drops frames despite low CPU | Ensure the NIC and GPU are on separate CPU-direct PCIe lanes |
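For the Storage Lock row, a minimal age-based pruning sketch is shown below. The prune_old helper, the .mp4 filter, and the 90-day default are illustrative assumptions, not a CaptureGem feature; if the app ships its own pruning, prefer that.

```shell
# prune_old ROOT [DAYS]: delete recordings under ROOT older than DAYS
# (default 90), printing each path as it goes.
prune_old() {
  root="$1"
  days="${2:-90}"
  find "$root" -type f -name '*.mp4' -mtime "+$days" -print -delete
}

# Example (run on the Storage Node from cron):
# prune_old /mnt/archive 90
```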

Summary: The 3-Rig Test Case

  1. Rig A (Ingest): 1080p Standard resolution captures (40+ streams).
  2. Rig B (High-Fidelity): 4K/8K High-bitrate captures (10+ streams).
  3. Rig C (Transcode): Handles post-processing and Cloud Archival.

By specializing your hardware, you create a workflow that is impractical to match on a single workstation.
