Active-active. No master, no failover.
PMG, Hornetsecurity, Mimecast, Proofpoint — they all have either master/replica architectures with failover drama or central cloud back-ends. NetCell MailGuard is the only SMTP gateway that genuinely clusters active-active without a master.
How does it work?
Every node in the cluster is an equal peer. Configuration changes and detection state are synchronised between all nodes in the background over encrypted connections: the operator makes a change in the web UI, and every node picks it up automatically. Inbound mail is distributed across nodes via the MX records in DNS; if one node fails, the others take over without intervention.
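To make the idea of symmetric synchronisation concrete, here is a minimal sketch of how equal peers could converge on the same configuration without a master. It is an illustration only: the `Node` class, the per-setting version counter, and the tie-break on node name are assumptions made for this example, not a description of MailGuard's actual sync protocol, and encryption is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical peer holding versioned settings (not MailGuard's real data model)."""
    name: str
    # setting key -> (version, origin node name, value)
    config: dict = field(default_factory=dict)

    def set(self, key, value):
        """Apply a local change, as if an operator edited it in this node's web UI."""
        version = self.config.get(key, (0, "", None))[0] + 1
        self.config[key] = (version, self.name, value)

    def merge(self, other):
        """Adopt every entry from a peer that is newer than the local one."""
        for key, entry in other.config.items():
            # Compare (version, origin) so that ties resolve identically on all nodes.
            if key not in self.config or entry[:2] > self.config[key][:2]:
                self.config[key] = entry


def sync(nodes):
    """One background round: every node merges from every other node."""
    for a in nodes:
        for b in nodes:
            if a is not b:
                a.merge(b)


nodes = [Node("node1"), Node("node2"), Node("node3")]
nodes[0].set("spam_threshold", 5)      # change made on node1
nodes[2].set("quarantine_days", 30)    # unrelated change made on node3
sync(nodes)
```

Because the merge rule is the same on every node and does not depend on who asks whom, no node needs a special role: after the round, all three hold both settings.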
Why not master/replica?
Classical master/replica architectures have three structural problems:
- Promotion complexity. When the master fails, a replica must be actively promoted to master, and every configuration touchpoint has to follow. That doesn't always go smoothly.
- Split-brain risk. Under a network partition, two nodes can each believe they are the only active master. Both accept mail, and reconciliation afterwards is manual work.
- Quorum overhead. Avoiding split-brain typically requires a majority vote and therefore an odd number of nodes; the extra node costs money without adding throughput.
MailGuard solves this through radical symmetry: every node is a master, so there is no single "correct" node you could lose.
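The quorum overhead in classical designs comes down to simple arithmetic. The sketch below uses the standard majority rule (more than half of the nodes must agree); the function names are made up for this example and do not refer to any product's API.

```python
def quorum(n):
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n):
    """Nodes that can fail while a majority is still reachable."""
    return n - quorum(n)


# Print node count, required majority, and tolerated failures.
for n in range(2, 6):
    print(n, quorum(n), tolerated_failures(n))
```

A two-node quorum cluster tolerates no failure at all, and four nodes tolerate no more failures than three. That is why majority-based designs tend to end up with an odd node count, and why the extra node is there for voting rather than for throughput.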
What does this mean operationally?
- Hardware failure → no action needed. The remaining nodes keep serving, routing automatically skips the failed one.
- Add a node → one command. The new node pulls the configuration from the existing cluster on first start and is productive within a minute.
- Maintenance reboot → no failover plan. Drain a node from routing, reboot, put it back in. The others process mail in the meantime.
- Cluster-wide quarantine view. The web UI shows all quarantined mail in a single view, regardless of which node a message physically resides on.
┌──────────────┐    encrypted    ┌──────────────┐
│    Node 1    │ ◀────sync─────▶ │    Node 2    │
│  Detection   │                 │  Detection   │
│    stack     │ ◀─────────────▶ │    stack     │
│  + sandbox   │                 │  + sandbox   │
└──────┬───────┘                 └──────┬───────┘
       │                                │
       │        ┌──────────────┐        │
       ◀────    │    Node 3    │ ◀──────┤
       │        │     ...      │        │
       │        └──────────────┘        │
       │                                │
       └──────── MX round-robin ────────┘
                       │
             MX 10 mx1.example.com
             MX 10 mx2.example.com
             MX 10 mx3.example.com
                       │
                       ▼
              Existing mail server
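The equal-preference MX records in the diagram are what spread the load and route around a failed node. The following sketch models that from the sending server's point of view: hosts with the same preference are tried in random order, and an unreachable host is skipped in favour of the next one. The `delivery_order` and `deliver` helpers and the `is_up` callback are hypothetical names for this illustration; a real sending server also queues and retries mail later, which is not modelled here.

```python
import random


def delivery_order(mx_records, rng=random):
    """Order MX hosts the way a sending server tries them:
    lowest preference first, equal preferences in random order."""
    shuffled = list(mx_records)
    rng.shuffle(shuffled)  # randomise ties; the stable sort below keeps that order
    return [host for _, host in sorted(shuffled, key=lambda r: r[0])]


def deliver(mx_records, is_up, rng=random):
    """Return the first host that accepts the message, or None if all are down."""
    for host in delivery_order(mx_records, rng):
        if is_up(host):
            return host
    return None


MX = [(10, "mx1.example.com"), (10, "mx2.example.com"), (10, "mx3.example.com")]
down = {"mx2.example.com"}  # simulate a failed node
chosen = deliver(MX, lambda host: host not in down)
```

With `mx2` marked as down, `chosen` is always one of the two remaining hosts, which is the "others take over" behaviour seen from outside the cluster.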
Limit?
None. In practice, customers run clusters of 2 to 10 nodes. Replication load scales linearly with the node count; at a typical configuration-change frequency of a handful of updates per hour, it is negligible. Very large setups are segmented geographically; please contact us about enterprise deployments.
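A rough back-of-the-envelope calculation shows why the replication load stays small. It assumes that each change is pushed once to every other peer, and it takes five updates per hour as a stand-in for "a handful"; both are assumptions for this sketch, not measured figures.

```python
def sync_messages_per_hour(nodes, updates_per_hour):
    """Messages exchanged per hour if each change is pushed once
    to every other peer (an assumed full-mesh fan-out)."""
    return updates_per_hour * (nodes - 1)


# Print node count and the resulting messages per hour.
for n in (2, 5, 10):
    print(n, sync_messages_per_hour(n, updates_per_hour=5))
```

Under these assumptions even a ten-node cluster exchanges only a few dozen small sync messages per hour, which is tiny next to the mail traffic itself.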
Want to try a cluster?
Two VMs, two one-liners, cluster running. No quorum, no promote script, no magic.
Start a test