AI coding & strategy generation
Harness-engineered LLM agents read the latest literature, propose signals, write the implementation, run the unit tests and iterate. A quant year becomes a quant week.
CWS — Coding Workflow System — is our four-layer stack for taking a market intuition all the way to executed orders. Every layer is engineered to compound: research feeds production, production feeds research.
§ 01 — Thesis
The constraint isn't generating hypotheses — strong researchers and modern LLMs produce more than any desk can vet. The bottleneck is the loop: cleanly tested, properly stressed, fairly sized, honestly monitored, and gracefully retired.
CWS treats that loop as the product. The system itself is what we engineer, not the individual strategies that live on top. Strategies come and go. The loop compounds.
§ 02 — CWS in three movements
Harness-engineered LLM agents read the latest literature, propose signals, write the implementation, run the unit tests and iterate. A quant year becomes a quant week.
Continuous extraction and normalisation across exchanges, vendors and alt-data feeds. GPU/CPU schedulers allocate capacity by deadline, not by team.
Reinforcement-learning loops compound returns of the meta-system itself: which signals to spend compute on, which markets deserve a re-look, when to retire a strategy.
§ 03 — Research method
Our quants hand-build the canonical pipeline a strategy travels — ingest, clean, feature, backtest, size, execute. That pipeline is the syllabus. Every well-defined sub-problem inside it becomes a self-contained module with explicit inputs, outputs, and acceptance tests.
A supervisor AI watches every execution trace those modules generate in real work, mines the history for blind spots and missed shortcuts, and proposes patches to the offending module's settings and skill registry. The same task is then re-run as an ablation; the patch ships only if Δ clears the noise floor.
Quants hand-build the canonical journey a strategy takes from idea to live order. It encodes how we already ship at SigmaFi.
Each well-defined sub-problem becomes a self-contained module: explicit IO contract, acceptance tests, settings, skill registry. Independently iterable.
A meta-agent ingests every execution trace from real work, mines them for failure patterns, and writes patches to the offending module's settings and skills.
The patched module reruns the exact same task on the exact same data. We compare bit-exactly to the baseline. Ship only if Δ clears noise; reject otherwise.
Self-improving modules
~ 120
across the pipeline
Ablations / week
~ 800
supervisor-initiated
Patch acceptance
~ 31%
Δ clears noise floor
Idea → live
6 days
median, since adopting CWS
§ 04 — Architecture
From a researcher typing a hypothesis to an order hitting an exchange's matching engine — four layers, owned by one team, replayed end-to-end every night against the day's traffic.
A notebook-meets-IDE built around CWS. Researchers describe a hypothesis, the harness runs the literature scan, drafts the signal, and queues it for vector backtest.
Petabyte-scale market and alt-data lake, GPU + CPU clusters, deterministic backtests, capacity-aware schedulers. The same store powers research and live trading.
Kernel-bypass networking, custom exchange adapters, low-latency OMS, and an independent risk system with hard per-strategy limits and global kill-switches.
Every order ever sent is captured bit-exactly. Production traffic can be replayed against a candidate strategy in milliseconds, so we ship with confidence.
§ 05 — Engineering
Our former Getco engineers designed the order path to be honest about every nanosecond. We make the trade-offs explicit: human-readable Rust for the slow path, hand-tuned C++ and FPGA offload for the hot loop.
Tick-to-trade
< 200 μs
internal hot path
OMS p99
< 1.2 ms
across regions
Replay throughput
> 1 Tb/h
bit-exact
Backtests / day
~ 4,000
vector + tick
Languages
Rust · C++ · Py
ad-hoc Mojo
Regions
HKG · TYO · NYC
active-active
§ 06 — Resilience
Every external interaction is captured with nanosecond timestamps. Yesterday's full trading day can be replayed against a code change in minutes.
Hong Kong is anchor, Tokyo and New York are mirror sites. Failover is exercised weekly during low-impact windows.
Risk core runs on its own infrastructure, its own deploy cadence, its own oncall. No strategy ever bypasses it.
Order lineage, model versions, data versions and parameter values are all written to an append-only ledger.