§ Technology CWS · Infrastructure · Resilience

A research engine
that writes itself.

CWS — Coding Workflow System — is our four-layer stack for taking a market intuition all the way to executed orders. Every layer is engineered to compound: research feeds production, production feeds research.

§ 01 — Thesis

The bottleneck of quant is no longer ideas.

The constraint isn't generating hypotheses: strong researchers and modern LLMs produce more than any desk can vet. The bottleneck is the loop every idea has to pass through: being cleanly tested, properly stressed, fairly sized, honestly monitored, and gracefully retired.

CWS treats that loop as the product. The system itself is what we engineer, not the individual strategies that live on top. Strategies come and go. The loop compounds.

§ 02 — CWS in three movements

Three engines. One feedback loop.

01 / 03

AI coding & strategy generation

Harness-engineered LLM agents read the latest literature, propose signals, write the implementation, run the unit tests and iterate. A quant year becomes a quant week.

02 / 03

ETL & elastic compute

Continuous extraction and normalisation across exchanges, vendors and alt-data feeds. GPU/CPU schedulers allocate capacity by deadline, not by team.
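One way to read "capacity by deadline, not by team" is earliest-deadline-first admission. The sketch below is illustrative only: the `Job` fields, the `schedule_by_deadline` name and the single shared slot pool are assumptions made for the example, not a description of the CWS schedulers.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    deadline: float                      # seconds from now; the only field used for ordering
    name: str = field(compare=False)
    team: str = field(compare=False)     # recorded, but deliberately ignored when ordering
    slots: int = field(compare=False)    # GPU/CPU slots requested (hypothetical unit)


def schedule_by_deadline(jobs, capacity):
    """Admit jobs earliest-deadline-first until the slot pool is exhausted.

    Returns (admitted, deferred) as lists of job names.
    """
    heap = list(jobs)
    heapq.heapify(heap)
    admitted, deferred = [], []
    free = capacity
    while heap:
        job = heapq.heappop(heap)
        if job.slots <= free:
            free -= job.slots
            admitted.append(job.name)
        else:
            deferred.append(job.name)
    return admitted, deferred


jobs = [
    Job(deadline=3600, name="nightly-replay", team="infra", slots=4),
    Job(deadline=600, name="pre-open-backtest", team="research", slots=6),
    Job(deadline=7200, name="feature-rebuild", team="data", slots=4),
]
admitted, deferred = schedule_by_deadline(jobs, capacity=10)
```

With ten slots, the two jobs with the nearest deadlines are admitted and the third is deferred, regardless of which team submitted it.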

03 / 03

Research pipeline & RL

Reinforcement-learning loops compound the returns of the meta-system itself by learning which signals to spend compute on, which markets deserve a re-look, and when to retire a strategy.

§ 03 — Research method

The pipeline learns to rewrite itself.

Our quants hand-build the canonical pipeline a strategy travels — ingest, clean, feature, backtest, size, execute. That pipeline is the syllabus. Every well-defined sub-problem inside it becomes a self-contained module with explicit inputs, outputs, and acceptance tests.
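A self-contained module with explicit inputs, outputs and acceptance tests might look roughly like the sketch below. The `Module` class, its field names and the `clean` example stage are hypothetical, chosen only to make the idea of an IO contract concrete.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Module:
    """A pipeline stage with an explicit IO contract and acceptance tests."""
    name: str
    inputs: Dict[str, type]              # required input keys and their types
    outputs: Dict[str, type]             # promised output keys and their types
    run: Callable[[Dict[str, Any]], Dict[str, Any]]
    acceptance_tests: List[Callable[[Dict[str, Any]], bool]] = field(default_factory=list)

    def __call__(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        _check(self.name, "input", payload, self.inputs)
        result = self.run(payload)
        _check(self.name, "output", result, self.outputs)
        failed = [t.__name__ for t in self.acceptance_tests if not t(result)]
        if failed:
            raise AssertionError(f"{self.name}: acceptance tests failed: {failed}")
        return result


def _check(module: str, side: str, data: Dict[str, Any], contract: Dict[str, type]) -> None:
    """Raise if a contracted key is missing or has the wrong type."""
    for key, expected in contract.items():
        if key not in data:
            raise KeyError(f"{module}: missing {side} '{key}'")
        if not isinstance(data[key], expected):
            raise TypeError(f"{module}: {side} '{key}' is not {expected.__name__}")


def no_missing_prices(result):
    return all(p is not None for p in result["prices"])


# Hypothetical "clean" stage: drops missing prices from the raw series.
clean = Module(
    name="clean",
    inputs={"raw_prices": list},
    outputs={"prices": list},
    run=lambda payload: {"prices": [p for p in payload["raw_prices"] if p is not None]},
    acceptance_tests=[no_missing_prices],
)

cleaned = clean({"raw_prices": [101.5, None, 102.0]})
```

Because the contract is checked on both sides of `run`, a stage can be swapped or patched in isolation and still fail loudly if it stops honouring its inputs or outputs.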

A supervisor AI watches every execution trace those modules generate in real work, mines the history for blind spots and missed shortcuts, and proposes patches to the offending module's settings and skill registry. The same task is then re-run as an ablation; the patch ships only if the measured improvement Δ clears the noise floor.

STEP 01

Expert pipeline

Quants hand-build the canonical journey a strategy takes from idea to live order. It encodes how we already ship at SigmaFi.

STEP 02

Modular decomposition

Each well-defined sub-problem becomes a self-contained module: explicit IO contract, acceptance tests, settings, skill registry. Independently iterable.

STEP 03

Supervisor AI

A meta-agent ingests every execution trace from real work, mines the traces for failure patterns, and writes patches to the offending module's settings and skills.

STEP 04

Ablation re-run

The patched module reruns the exact same task on the exact same data. We compare bit-exactly to the baseline. Ship only if Δ clears noise; reject otherwise.
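The ship-or-reject decision can be sketched as a paired comparison. The page does not say how the noise floor is estimated, so the choice below (Δ is the mean per-task improvement, the floor is `k` standard errors of that mean) is an assumption, as are the function name and the sample scores.

```python
import math
import statistics


def clears_noise_floor(baseline, patched, k=2.0):
    """Return (delta, floor, ship) for paired scores on the same tasks.

    delta is the mean per-task improvement; floor is k standard errors of
    that mean. The patch ships only if delta exceeds the floor.
    """
    if len(baseline) != len(patched) or len(baseline) < 2:
        raise ValueError("need at least two paired scores")
    diffs = [p - b for b, p in zip(baseline, patched)]
    delta = statistics.mean(diffs)
    floor = k * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return delta, floor, delta > floor


# Illustrative scores only, not measured results.
baseline_scores = [0.50, 0.52, 0.49, 0.51, 0.50]
good_patch = [0.56, 0.57, 0.55, 0.57, 0.55]
noisy_patch = [0.58, 0.44, 0.55, 0.46, 0.52]

_, _, ship_good = clears_noise_floor(baseline_scores, good_patch)
_, _, ship_noisy = clears_noise_floor(baseline_scores, noisy_patch)
```

The first patch improves every task by a similar margin and would ship; the second has a positive average but swings too widely between tasks to be distinguished from noise, so it would be rejected.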

Self-improving modules

~ 120

across the pipeline

Ablations / week

~ 800

supervisor-initiated

Patch acceptance

~ 31%

Δ clears noise floor

Idea → live

6 days

median, since adopting CWS

§ 04 — Architecture

Four layers, all bit-exactly replayable.

From a researcher typing a hypothesis to an order hitting an exchange's matching engine — four layers, owned by one team, replayed end-to-end every night against the day's traffic.

L4 LAYER

Research surface

A notebook-meets-IDE built around CWS. Researchers describe a hypothesis, the harness runs the literature scan, drafts the signal, and queues it for vector backtest.
- σ-shell
- Notebook
- Strategy DSL
- Harness agents

L3 LAYER

Compute & data plane

Petabyte-scale market and alt-data lake, GPU + CPU clusters, deterministic backtests, capacity-aware schedulers. The same store powers research and live trading.
- Tick archive
- GPU cluster
- Replay engine
- Feature store

L2 LAYER

Execution & risk

Kernel-bypass networking, custom exchange adapters, low-latency OMS, and an independent risk system with hard per-strategy limits and global kill-switches.
- Adapters
- OMS
- Smart order router
- Risk core
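Hard per-strategy limits combined with a global kill-switch can be pictured as a gate that every order must pass before it reaches the OMS. The sketch below is a simplified illustration: `RiskCore`, its position-only limit and the strategy name are hypothetical, and a real risk system would track far more than net position.

```python
from dataclasses import dataclass


@dataclass
class RiskCore:
    """Pre-trade gate: every order passes check() before it is sent."""
    max_position: dict                  # strategy -> maximum absolute position
    positions: dict                     # strategy -> current signed position
    killed: bool = False                # global kill-switch

    def kill(self):
        self.killed = True

    def check(self, strategy, signed_qty):
        """Return (accepted, reason); the position is updated only on accept."""
        if self.killed:
            return False, "kill-switch engaged"
        limit = self.max_position.get(strategy)
        if limit is None:
            return False, "no limit configured"   # fail closed for unknown strategies
        new_pos = self.positions.get(strategy, 0) + signed_qty
        if abs(new_pos) > limit:
            return False, "position limit breached"
        self.positions[strategy] = new_pos
        return True, "ok"


risk = RiskCore(max_position={"mean-rev-01": 100}, positions={})
first = risk.check("mean-rev-01", 80)     # within the limit
second = risk.check("mean-rev-01", 30)    # would take the position to 110
risk.kill()
third = risk.check("mean-rev-01", -10)    # rejected even though it reduces risk
```

Failing closed for strategies without a configured limit is a design choice made for this example; it keeps a misconfigured strategy from trading by default.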

L1 LAYER

Observability & replay

Every order ever sent is captured bit-exactly. Production traffic can be replayed against a candidate strategy in milliseconds, so we ship with confidence.
- Lineage
- Tracing
- Bit-exact replay
- Audit
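The capture-and-replay idea can be shown in miniature: record each external event with its timestamp, feed the log to a candidate strategy, and compare a digest of the resulting order stream between runs. Everything named below (`record`, `replay`, `digest`, the toy strategy) is an assumption for illustration and says nothing about the actual replay engine.

```python
import hashlib
import json


def record(log, ts_ns, payload):
    """Append one captured external event with its nanosecond timestamp."""
    log.append({"ts_ns": ts_ns, "payload": payload})


def replay(log, strategy):
    """Feed captured events to a strategy in timestamp order; return its orders."""
    orders = []
    for event in sorted(log, key=lambda e: e["ts_ns"]):
        order = strategy(event["payload"])
        if order is not None:
            orders.append({"ts_ns": event["ts_ns"], **order})
    return orders


def digest(orders):
    """Canonical SHA-256 of an order stream, for byte-level comparison of runs."""
    blob = json.dumps(orders, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def buy_below_100(tick):
    """Toy candidate strategy: buy one unit whenever the price is under 100."""
    return {"side": "buy", "qty": 1} if tick["px"] < 100 else None


log = []
record(log, 1_700_000_000_000_000_001, {"px": 101})
record(log, 1_700_000_000_000_000_002, {"px": 99})

first_run = digest(replay(log, buy_below_100))
second_run = digest(replay(log, buy_below_100))
```

If the strategy is deterministic, the two digests match; a mismatch between a baseline and a candidate build points at a behavioural change worth inspecting before shipping.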

§ 05 — Engineering

Latency is a design choice.

Our former Getco engineers designed the order path to be honest about every nanosecond. We make the trade-offs explicit: human-readable Rust for the slow path, hand-tuned C++ and FPGA offload for the hot loop.

Tick-to-trade

< 200 μs

internal hot path

OMS p99

< 1.2 ms

across regions

Replay throughput

> 1 Tb/h

bit-exact

Backtests / day

~ 4,000

vector + tick

Languages

Rust · C++ · Py

ad-hoc Mojo

Regions

HKG · TYO · NYC

active-active

§ 06 — Resilience

Boring infrastructure. Calm desks.

Deterministic replay

Every external interaction is captured with nanosecond timestamps. Yesterday's full trading day can be replayed against a code change in minutes.

Active-active regions

Hong Kong is the anchor; Tokyo and New York are mirror sites. Failover is exercised weekly during low-impact windows.

Independent risk

The risk core runs on its own infrastructure, with its own deploy cadence and its own on-call rotation. No strategy ever bypasses it.

Audit by default

Order lineage, model versions, data versions and parameter values are all written to an append-only ledger.
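One common way to make an append-only ledger tamper-evident is to chain entries by hash, so that each record commits to everything written before it. The page only says the ledger is append-only; the hash chaining, the `Ledger` class and the record fields below are assumptions for illustration.

```python
import hashlib
import json


class Ledger:
    """Append-only log in which each entry commits to the hash of the previous one."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        prev = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + body).encode()).hexdigest()
        self._entries.append({"prev": prev, "body": body, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute the chain; False means an entry was altered or reordered."""
        prev = "0" * 64
        for entry in self._entries:
            expected = hashlib.sha256((prev + entry["body"]).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"order_id": "A1", "model_version": "v12", "data_version": "d7", "params": {"window": 20}})
ledger.append({"order_id": "A2", "model_version": "v12", "data_version": "d7", "params": {"window": 20}})
intact = ledger.verify()

# Simulate someone rewriting history: edit the first entry in place.
ledger._entries[0]["body"] = ledger._entries[0]["body"].replace("v12", "v13")
tampered_ok = ledger.verify()
```

The untouched ledger verifies, while the edited one does not, because the stored hash of the first entry no longer matches its contents.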

Engineers built this. Engineers should run it.

Open roles
