certurk23 Quant Lab — Technical Documentation
HFT Infrastructure Architecture
A detailed breakdown of the hardware, software, and network stack powering sub-microsecond order flow analysis at Equinix NY4/NY5 co-location facilities.

Co-location & Data Center Topology

All production systems operate within Equinix NY4 (Secaucus, NJ) and NY5 (Parsippany, NJ), the primary co-location facilities providing low-latency connectivity to the NASDAQ (Carteret) and NYSE (Mahwah) matching engines respectively. Physical proximity to the exchange matching engines is the foundational latency advantage.

[Topology diagram: NASDAQ Carteret ↔ NY4 / Equinix (Secaucus, NJ) · NYSE Mahwah ↔ NY5 / Equinix (Parsippany, NJ)]

Cross-venue wire latency between Carteret and Mahwah is approximately 120 µs. This inter-venue delta is the primary exploitable window in Cross-Venue Latency Arbitrage strategies. Our infrastructure maintains dedicated cross-connects within Equinix to minimize intra-facility hops.

| Parameter | Description | Value |
|---|---|---|
| Primary Venue | NASDAQ Matching Engine Distance | < 1 rack unit |
| Secondary Venue | NYSE Arca Cross-Connect | NY4 → NY5 |
| Cross-venue RTT | Carteret ↔ Mahwah Wire | ~120 µs |
| Internal Hop | Intra-facility cross-connect | < 300 ns |
| Power Redundancy | UPS + Generator | 2N |

NIC & Kernel Bypass (OpenOnload)

Standard Linux kernel network stacks introduce 5–50 µs of latency due to context switches, interrupt handling, and system call overhead. We bypass this entirely using Solarflare XtremeScale SFN8522 NICs running the OpenOnload user-space network stack.

Kernel Bypass Architecture

OpenOnload intercepts socket calls at the application layer via LD_PRELOAD injection, redirecting traffic directly to NIC hardware queues via RDMA-style zero-copy DMA. The kernel is bypassed entirely for data-path operations — interrupts are replaced by polling loops pinned to isolated CPU cores.
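In deployment, the LD_PRELOAD interception is typically driven through the onload launcher. The invocation below is an illustrative config fragment, not our production launch script: --profile=latency and EF_POLL_USEC are standard OpenOnload knobs, while the binary name is a placeholder.

```shell
# Launch a feed handler under OpenOnload's user-space stack.
# --profile=latency selects busy-wait-oriented defaults;
# EF_POLL_USEC keeps the stack spin-polling instead of sleeping.
EF_POLL_USEC=100000 onload --profile=latency ./feed_handler
```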

nic_init.cpp C++17
// Solarflare OpenOnload / ef_vi — zero-copy RX path initialization
#include <etherfabric/base.h>
#include <etherfabric/pd.h>
#include <etherfabric/memreg.h>

int init_onload_stack(OnloadStack* stack) {
    // Open a driver handle for the ef_vi layer
    int rc = ef_driver_open(&stack->dh);
    if (rc < 0) return rc;

    // Allocate a protection domain on the target interface
    rc = ef_pd_alloc(&stack->pd, stack->dh,
                     stack->ifindex, EF_PD_DEFAULT);
    if (rc < 0) return rc;

    // Register hugepage-backed RX/TX ring buffers for DMA
    rc = ef_memreg_alloc(&stack->mr, stack->dh,
                         &stack->pd, stack->dh,
                         stack->buf, BUF_SIZE);
    if (rc < 0) return rc;

    // Pin polling thread to isolated core (no OS scheduling)
    pin_thread_to_core(POLL_CORE_ID);
    return 0;
}
| Component | Specification | Latency Impact |
|---|---|---|
| NIC Model | Solarflare XtremeScale SFN8522 | Baseline |
| Network Stack | OpenOnload 7.x (user-space) | −5 to −50 µs |
| DMA Mode | Zero-copy RDMA | −800 ns |
| Interrupt Model | Busy-poll (no IRQ) | −2 µs |
| CPU Affinity | Isolated core pinning | −1.2 µs |
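The core-pinning helper referenced in nic_init.cpp (pin_thread_to_core) is not shown in the source; a minimal Linux sketch using pthread_setaffinity_np might look like this (the function name matches the call above, the body is an assumption):

```cpp
#include <pthread.h>
#include <sched.h>

// Pin the calling thread to a single CPU core (Linux-specific).
// Combined with isolcpus/nohz_full kernel parameters, this keeps the
// busy-poll loop free of OS scheduling interference.
bool pin_thread_to_core(int core_id) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    // Returns 0 on success; fails if core_id is outside the
    // process's allowed affinity mask.
    return pthread_setaffinity_np(pthread_self(),
                                  sizeof(set), &set) == 0;
}
```

Note that pinning alone does not isolate the core; the core must also be removed from the general scheduler via boot parameters for true busy-poll isolation.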

Memory Architecture — Hugepages & TLB

Standard 4KB memory pages cause frequent Translation Lookaside Buffer (TLB) misses during high-throughput tick processing. Each TLB miss triggers a costly page-table walk (~100 ns penalty). We eliminate this by allocating all critical data structures on 2MB Hugepages via MAP_HUGETLB | MAP_LOCKED.

TLB Entry Reduction Factor
$$\frac{N_{\text{pages}}^{\text{2MB}}}{N_{\text{pages}}^{\text{4KB}}} = \frac{4\text{ KB}}{2\text{ MB}} = \frac{4096}{2 \cdot 1024 \cdot 1024} = \frac{1}{512}$$

By switching from 4KB to 2MB pages, the number of TLB entries required for a given working set shrinks by a factor of 512×, effectively eliminating TLB pressure for tick data buffers up to 4GB.
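The arithmetic behind the 512× factor and the 4GB figure can be checked directly at compile time:

```cpp
#include <cstdint>

// Pages required to map a 4 GB working set at each page size.
constexpr std::uint64_t kWorkingSet = 4ULL << 30;               // 4 GB
constexpr std::uint64_t kPages4K = kWorkingSet / (4ULL << 10);  // 1,048,576
constexpr std::uint64_t kPages2M = kWorkingSet / (2ULL << 20);  // 2,048

static_assert(kPages4K / kPages2M == 512, "entry count shrinks 512x");
static_assert(kPages2M == 2048, "4 GB maps in 2048 two-megabyte TLB entries");
```

A 4GB buffer thus needs only 2,048 2MB entries, which is within the second-level TLB capacity of modern server CPUs, versus over a million 4KB entries.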

memory_alloc.cpp C++17
// Hugepage-backed ring buffer for tick data
#include <sys/mman.h>

// size must be a multiple of the 2MB hugepage size
void* alloc_hugepage_ring(size_t size) {
    void* mem = mmap(nullptr, size,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS |
                     MAP_HUGETLB | MAP_LOCKED,   // 2MB pages, no swap
                     -1, 0);

    if (mem == MAP_FAILED)
        return fallback_standard_alloc(size);    // hugepage pool exhausted

    // MAP_LOCKED already faults in and pins the pages; the explicit
    // mlock() is a belt-and-braces guard against partial locking
    mlock(mem, size);
    return mem;
}

NUMA Topology

All NIC queues, tick buffers, and order-state data structures are allocated on NUMA node 0 — the same node as the Solarflare NIC PCIe slot. Cross-NUMA memory access introduces ~80 ns additional latency per cache miss, which is unacceptable at the nanosecond scale.
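The NUMA node of a PCIe device is exposed by Linux in sysfs (e.g. /sys/class/net/eth0/device/numa_node; the interface name here is illustrative). A small helper to read it, as a sketch of how allocation placement can be validated at startup:

```cpp
#include <fstream>
#include <string>

// Read a sysfs numa_node file; returns the node id, or -1 if the
// file is missing or unparsable (also what the kernel reports for
// non-NUMA systems).
int read_numa_node(const std::string& sysfs_path) {
    std::ifstream f(sysfs_path);
    int node = -1;
    if (!(f >> node))
        return -1;
    return node;
}
```

The returned node can then be passed to a node-bound allocator (for example libnuma's numa_alloc_onnode, assuming libnuma is available) so tick buffers land on the same node as the NIC.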

End-to-End Latency Budget

The total wire-to-order latency target is < 800 ns. This budget is decomposed across each pipeline stage, with hard ceilings enforced via hardware timestamps on every ingress packet.

Wire Ingress: ~0 ns (NIC RX timestamp)
DMA Transfer: ~120 ns (PCIe Gen4 DMA)
ITCH Parse: ~180 ns (zero-alloc parser)
VPIN Update: ~240 ns (cache-local kernel)
Signal / Order: < 800 ns (FIX TX egress)
Total Latency Budget Decomposition
$$L_{\text{total}} = L_{\text{DMA}} + L_{\text{parse}} + L_{\text{state}} + L_{\text{VPIN}} + L_{\text{signal}} + L_{\text{TX}} < 800\text{ ns}$$
| Stage | Component | Budget |
|---|---|---|
| NIC → CPU | PCIe Gen4 DMA, zero-copy | 120 ns |
| Deserialize | NASDAQ ITCH 5.0 parser | 60 ns |
| State Update | Order book L2 update | 80 ns |
| VPIN Compute | Inline kernel, L1-cached | 160 ns |
| Decision | Signal evaluation | 180 ns |
| TX Egress | FIX/OUCH order submission | 120 ns |
| Total P50 | | 720 ns |
| Total P99 | | 740 ns |
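The stage budgets can be cross-checked against the P50 total and the 800 ns ceiling at compile time:

```cpp
// Per-stage budgets from the table above, in nanoseconds.
constexpr int kDMA = 120, kParse = 60, kState = 80,
              kVPIN = 160, kSignal = 180, kTx = 120;
constexpr int kTotal = kDMA + kParse + kState + kVPIN + kSignal + kTx;

static_assert(kTotal == 720, "stages sum to the Total P50 figure");
static_assert(kTotal < 800,  "P50 sits inside the wire-to-order budget");
```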

Security & Telemetry Encryption

All telemetry, metrics, and strategy parameters transmitted between co-location and monitoring endpoints are encrypted using AES-256-GCM with a rotating ECDH (Curve25519) key exchange. GCM mode provides authenticated encryption — both confidentiality and integrity are guaranteed for every telemetry frame.

telemetry_crypto.cpp C++17 / OpenSSL
// AES-256-GCM authenticated encryption for telemetry frames (OpenSSL EVP)
#include <openssl/evp.h>

bool encrypt_telemetry_frame(
    const uint8_t* plaintext, size_t pt_len,
    uint8_t* ciphertext, uint8_t* tag)
{
    EVP_CIPHER_CTX* ctx = EVP_CIPHER_CTX_new();
    int len = 0, ct_len = 0;

    // session_key (32 bytes, derived via ECDH Curve25519), nonce and
    // aad are session-scoped state maintained by the telemetry layer
    bool ok =
        EVP_EncryptInit_ex(ctx, EVP_aes_256_gcm(), nullptr, nullptr, nullptr) == 1
     && EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, NONCE_LEN, nullptr) == 1
     && EVP_EncryptInit_ex(ctx, nullptr, nullptr, session_key, nonce) == 1
     && EVP_EncryptUpdate(ctx, nullptr, &len, aad, aad_len) == 1       // AAD pass
     && EVP_EncryptUpdate(ctx, ciphertext, &ct_len, plaintext, (int)pt_len) == 1
     && EVP_EncryptFinal_ex(ctx, ciphertext + ct_len, &len) == 1
     && EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, TAG_LEN, tag) == 1;

    EVP_CIPHER_CTX_free(ctx);
    return ok;
}

Nonce values are derived from hardware timestamp counters (RDTSC) combined with a per-session counter, ensuring nonce uniqueness across all telemetry streams. Key rotation occurs every 3600 seconds via a dedicated ECDH re-handshake triggered by the monitoring daemon.
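The timestamp-plus-counter nonce layout described above can be sketched as follows. This uses std::chrono as a portable stand-in for the RDTSC read; the monotonically increasing per-session counter alone already guarantees uniqueness within a session, with the timestamp half guarding against counter reuse across restarts:

```cpp
#include <array>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <cstring>

// 96-bit GCM nonce = 64-bit timestamp || 32-bit per-session counter.
std::array<uint8_t, 12> next_nonce() {
    static std::atomic<uint32_t> counter{0};
    uint64_t ts = static_cast<uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
    uint32_t seq = counter.fetch_add(1, std::memory_order_relaxed);

    std::array<uint8_t, 12> nonce{};
    std::memcpy(nonce.data(), &ts, 8);       // timestamp half
    std::memcpy(nonce.data() + 8, &seq, 4);  // counter half
    return nonce;
}
```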