Head of Quants · AI-Native HFT · Institutional Trading Systems Architect
PAK MING CHEUNG
The rare engineer who builds institutional-grade HFT infrastructure and production LLM runtimes from the same keyboard.
MSc Computer Science, AI & Data Science (Merit) · Scaled capital 20x+ with ~100% annualised returns · Architect of Qorinix LLM, our in-house model delivering 5x to 60x faster TTFT than frontier LLMs.
- 10+ years quantitative trading across Forex · CFDs · Indices · Crypto
- Sub-100ms ML inference · multi-provider LLM routing · audit-grade decision loop
- SFC HK Papers 1/7/8/12 · Harvard CS50x · Google MLOps · Gemini Certified Educator
Professional Summary
Head of Quants at Quant Sigma (London), running institutional-grade HFT operations across Forex, CFDs, indices and cryptocurrency. MSc Computer Science (AI & Data Science, Merit) from the University of Wolverhampton, with a dissertation on LLM-Augmented High-Frequency Trading, fusing transformer-based market understanding with sub-millisecond execution.
Proven track record building and operating AI-native trading infrastructure: multi-provider LLM routing across OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, benchmarked with TTFT p50/p95 analytics for latency-critical alpha research; production RAG systems with policy guardrails and human-review escalation for regulated decision flows; and distributed ML inference pipelines serving millions of predictions daily at sub-100ms latency.
Led the design, training and deployment of our in-house Qorinix LLM, a domain-tuned language model engineered for trading-grade decision support. Qorinix delivers 5x to 60x faster TTFT and response latency than mainstream frontier LLMs on comparable workloads, enabling sub-millisecond alpha annotation, guardrailed reasoning and evidence-packed signal generation fully inside the firm's own runtime.
Over more than a decade of systematic trading, scaled capital 20x+ with ~100% annualised returns through disciplined volatility-based sizing and cross-asset arbitrage. Credentials include SFC HK Papers 1/7/8/12, Harvard CS50x, Google MLOps and Gemini Certified Educator.
Core Competencies
LLM & Generative AI Platforms
Qorinix in-house LLM, Multi-Provider Routing, RAG, Guardrails, Audit Ledger
AI Platform Architecture
Prompt Registry, Policy Engine, Human-Review Gate, TTFT p50/p95 Analytics
Quant & HFT Leadership
Team building, risk governance, volatility sizing, 20x+ capital growth
HFT Infrastructure
FIX/iLink, CME Aurora, Equinix LD4/LD6, kernel-bypass, sub-ms hot path
ML & Deep Learning
Transformer alpha, LSTM/GRU, XGBoost ensembles, RL execution, MLOps
Crypto & DeFi
Sharpe 2-6, cross-chain arbitrage, on-chain features, early BTC since 2013
Notable Achievements
Systematic Trading (2013 – Present)
Scaled trading capital 20x+ over a decade of disciplined systematic trading, with ~100% annualised returns and controlled drawdowns via volatility-based sizing. Markets: Forex, CFDs, indices, crypto and derivatives. Early Bitcoin investor since 2013, entering at $80.
Qorinix LLM, In-House Frontier Model
Led end-to-end design, training and deployment of Qorinix LLM, our domain-tuned trading-grade language model. Delivers 5x to 60x faster TTFT and response latency versus mainstream frontier LLMs on comparable workloads, running fully inside the firm's perimeter.
LLM Infrastructure & AI Platforms
Built LLM Arena, a multi-provider fast-inference benchmark with TTFT p50/p95 analytics across OpenAI, Anthropic, Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare. Shipped an audit-grade AI decision runtime with policy engine, guardrails and human-review gate.
Sub-100ms Distributed Inference
Architected distributed ML inference pipelines serving 5M+ daily predictions at <100ms latency. Shard-aware routing, warm-pool autoscaling, ONNX + fp16 quantisation, multi-region active-active with health-weighted failover.
Technical Leadership & Team Building
Led tech transformation from traditional to AI-augmented trading. Mentored 6+ engineers across multi-cloud (AWS · GCP · Cloudflare) with finance-compliance rigour. Established MLOps discipline and model-governance framework firm-wide.
Research, Publications & Credentials
MSc dissertation: LLM-Augmented High-Frequency Trading, 65% Sharpe-ratio improvement over baseline. SFC HK Papers 1/7/8/12. Harvard CS50x. Google MLOps Specialization. Gemini Certified Educator (2025-2028).
Risk Governance & Compliance
Drawdown kill-switches, volatility targeting, VaR/ES by strategy with margin projection. Tamper-evident audit trail with content-addressed evidence packs. Human-in-the-loop escalation for regulated decisions. Investment-grade reporting for family-office mandates.
Crypto & Web3 Pioneer
Early Bitcoin investor since 2013, entering at $80. Sharpe 2-6 in crypto markets. Experience spans DeFi protocols, yield farming, liquidity provision and cross-chain arbitrage, with deep expertise across CEX/DEX venues and on-chain analytics (Glassnode).
Professional Experience
Responsibilities:
- Lead quantitative research, AI-powered trading systems and execution infrastructure for institutional-grade HFT across traditional and crypto markets
- Multi-signal HFT models: trend following, mean reversion, volatility breakout, queue-based market making, XGBoost/LightGBM ensembles for microstructure signal blending
- Implemented transformer-based predictive models for market-direction forecasting, delivering material Sharpe-ratio uplift over baseline
- Reinforcement-learning execution for adaptive position sizing and inventory control; volatility-based sizing with regime switching and scenario stress tests
- Low-latency execution with millisecond performance; colocation at Equinix LD4/LD6 and AWS London region; real-time WebSocket market-data pipeline with deterministic ingestion
- Event-driven order-book reconstruction with lock-free ring buffers and kernel-bypass networking; sub-microsecond jitter budget on the hot path
- Built multi-provider LLM routing layer across OpenAI, Anthropic, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, with cost/latency/risk policy and automatic fail-over
- Designed prompt registry with version pinning, policy prompts, evidence-pack assembly, and a guardrail & rules engine with human-review escalation for regulated decisions
- Led end-to-end design, training and deployment of Qorinix LLM, in-house trading-tuned language model delivering 5x to 60x faster TTFT than frontier LLMs on comparable workloads
- Custom inference kernels, speculative decoding and trading-corpus fine-tuning; sub-millisecond alpha annotation and evidence-packed signal generation fully inside the firm's runtime
Key Achievements:
- ~100% annualised returns via volatility-based sizing and scenario stress tests
- Distributed ML inference pipeline delivering <100ms latency for real-time signal generation
- LLM Arena multi-provider fast-inference benchmark with TTFT p50/p95 analytics
- Audit ledger tying every AI call to workflow, policy, cost and support lineage
- NLP sentiment engine ingesting 500K+ articles, social and alt-data daily
- Qorinix LLM, 5x to 60x faster TTFT vs mainstream frontier LLMs on comparable trading workloads
- MLOps discipline: model registry, evaluation suites, canary deployment, drift monitoring, per-strategy cost ledger
Skills & Technologies:
Responsibilities:
- Dual-track: spearheaded AI transformation (Dec 2021 - Dec 2024) establishing the firm's AI/ML practice and intelligent systems processing millions of predictions daily
- Designed and deployed distributed ML inference system achieving <100ms latency, serving 5M+ requests/day
- Built comprehensive RAG system (LangChain + vector DB) reducing information retrieval time by 85% while maintaining 94% accuracy
- Developed end-to-end MLOps pipeline with automated retraining and A/B testing, improving model performance 40% Q-over-Q
- Implemented transformer-based sentiment analysis processing 500K+ documents daily for market intelligence
- Established AI/ML best practices, conducted architecture reviews and mentored a team of 6 engineers
- Ran a personal systematic-trading programme throughout the full tenure, generating institutional-grade returns
- Python-based predictive models and portfolio optimisation with rigorous volatility-based risk sizing
- Enterprise platforms 2015-2021: multi-tenant SaaS, 100+ clients, 1M+ daily API requests, 99.9% uptime
- Analytics Dashboard Platform: real-time visualisation, 10GB+ daily data, 60% reduction in report generation time
Key Achievements:
- Sharpe 1.5+ (Forex/CFDs) and 2-6 (crypto); 20x+ capital growth
- Pioneered blockchain/cryptocurrency expertise: DeFi protocols, yield farming, liquidity provision, cross-chain arbitrage
- Authored investment-grade quantitative performance reports for family-office investments
- Shipped multi-tenant SaaS serving 100+ clients at 99.9% uptime
- E-commerce Integration Suite: multi-platform API layer with 5+ payment-gateway integrations
Skills & Technologies:
Responsibilities:
- Led strategic account management and coordinated with international merchants for brand positioning
- Developed and maintained project plans, including cost estimation and resource allocation
- Oversaw inventory management, supply chain coordination, and pricing structure
- Conducted comprehensive market analysis to support business operations
Key Achievements:
- 35% revenue growth through strategic pricing and positioning
- 20% operational cost reduction via process optimisation
Skills & Technologies:
Responsibilities:
- Oversaw strategic and operational aspects within the telecommunications software sector
- Led a multi-disciplinary team and implemented development programmes
- Managed entire project lifecycle for custom software solutions
- Improved operational workflow and service quality by standardising procedures
Key Achievements:
- 25% annual growth rate through strategic leadership
- 30% reduction in customer churn
Skills & Technologies:
Flagship Projects & Research
A curated view of the AI-native trading systems, LLM infrastructure and quant research I architect at Quant Sigma.

Qorinix LLM, In-House Trading-Tuned Frontier Model
End-to-end design, training and deployment of Qorinix LLM: a multi-billion-token trading-intelligence corpus, mixture-of-experts routing with alpha-reasoning, risk-critique and compliance heads, custom inference kernels and speculative decoding. 5x to 60x faster TTFT than mainstream frontier LLMs on comparable workloads, running fully inside the firm's perimeter.

LLM-Augmented High-Frequency Trading System
MSc dissertation: novel architecture fusing LLM narrative understanding with real-time HFT across Forex, CFDs, indices and crypto. Custom transformer with attention over microstructure features (order-book depth, trade-flow imbalance, spread volatility), ensemble gating and latency-aware inference scheduling. 65% Sharpe-ratio uplift on 5-year out-of-sample evaluation.

LLM Arena, Multi-Provider Fast-Inference Benchmark
Side-by-side real-time benchmark across OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, streaming token-by-token with live comparison charts. Tracks TTFT, total time, tokens/sec with p50/p95 analytics, model-cost ledger and per-prompt leaderboard. Edge deployment on Cloudflare Pages; Workers-backed API proxy with per-user key isolation.
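As an illustration of the p50/p95 analytics described above, here is a minimal Python sketch. It is not the LLM Arena code: the function name `ttft_percentiles`, the provider labels and the sample values are all hypothetical, and the percentile method (linear interpolation, inclusive) is an assumption.

```python
import statistics


def ttft_percentiles(samples_ms):
    """Return (p50, p95) for a list of time-to-first-token samples in ms."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points p1..p99, so index 49 is p50
    # and index 94 is p95. "inclusive" treats the samples as the full population.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]


# Hypothetical TTFT samples per provider (made-up numbers, milliseconds).
samples = {
    "provider_a": [120, 135, 128, 410, 131, 126, 140, 133],
    "provider_b": [210, 205, 220, 215, 208, 212, 230, 219],
}

for name, values in samples.items():
    p50, p95 = ttft_percentiles(values)
    print(f"{name}: p50={p50:.1f} ms  p95={p95:.1f} ms")
```

Reporting p95 alongside p50 matters here because a single slow outlier (such as the 410 ms sample) barely moves the median but dominates the tail.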

Audit-Grade AI Decision Loop
Closed-loop flow: policy check → evidence-pack retrieval → prompt control → model router → action runtime → audit ledger → policy feedback. Every call versioned, costed, support-traceable and replayable. Human-in-the-loop escalation for regulated decisions, tamper-evident audit trail with content-addressed evidence packs, and per-strategy cost attribution.
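One common way to make an audit trail tamper-evident with content-addressed evidence packs is a hash chain. The sketch below shows that idea only; it is an assumption about the general technique, not the production ledger. The class `AuditLedger`, the helper `content_address` and all field names are hypothetical.

```python
import hashlib
import json


def content_address(payload: dict) -> str:
    """SHA-256 over canonical JSON; the digest doubles as the payload's address."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditLedger:
    """Append-only ledger in which each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, record: dict, evidence_pack: dict) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        body = {
            "record": record,
            "evidence_address": content_address(evidence_pack),
            "prev_hash": prev_hash,
        }
        entry = dict(body, entry_hash=content_address(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev_hash:
                return False
            if content_address(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True


# Hypothetical usage: two AI calls, then an attempt to rewrite history.
ledger = AuditLedger()
ledger.append({"prompt_version": "v1", "model": "model-a"}, {"docs": ["doc-1"]})
ledger.append({"prompt_version": "v1", "model": "model-b"}, {"docs": ["doc-2"]})
print(ledger.verify())   # chain intact
ledger.entries[0]["record"]["model"] = "model-x"
print(ledger.verify())   # tampering detected
```

Storing only the evidence pack's digest in the ledger keeps entries small while still letting a replay confirm that the retrieved evidence is byte-for-byte what the model saw.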

Sub-100ms Inference Mesh
Distributed ML inference pipeline serving millions of predictions/day at <100ms latency across signal generation, sentiment scoring and anomaly detection. Shard-aware request routing, warm-pool autoscaling, ONNX-compiled models with fp16 quantisation, cache-through Redis tiering. Multi-region active-active with health-weighted failover.
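Health-weighted failover can be read in several ways. One simple interpretation, sketched below under stated assumptions, sends each request to a region with probability proportional to its health score and drops regions below a floor. The function `pick_region`, the region names, the scores and the `min_health` threshold are hypothetical.

```python
import random


def pick_region(health, min_health=0.2, rng=random):
    """Pick a region with probability proportional to its health score.

    `health` maps region name -> score in [0, 1]. Regions scoring below
    `min_health` are excluded, which is the failover part of the policy.
    """
    eligible = {region: score for region, score in health.items() if score >= min_health}
    if not eligible:
        raise RuntimeError("no healthy region available")
    regions = list(eligible)
    weights = [eligible[region] for region in regions]
    return rng.choices(regions, weights=weights, k=1)[0]


# Hypothetical health scores, e.g. derived from recent error rate and latency.
health = {"region-1": 0.95, "region-2": 0.60, "region-3": 0.05}

counts = {region: 0 for region in health}
for _ in range(10_000):
    counts[pick_region(health)] += 1
print(counts)  # region-3 receives no traffic; region-1 receives the most
```

Weighting by health rather than switching regions on or off lets a degrading region shed load gradually before it is removed entirely.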

Real-Time Risk & P&L Dashboard
Live portfolio monitoring with volatility targeting, drawdown kill-switches, per-strategy P&L attribution and cost-ledger reconciliation. Tick-level exposure dashboard across venues, automated breach alerts, scenario stress tests and VaR/ES by strategy with margin projection. Heatmaps for correlation drift and regime-change signals.
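The two risk controls named above, volatility targeting and a drawdown kill-switch, can be sketched in a few lines of Python. This is an illustrative sketch only: the 10% volatility target, 3x leverage cap, 15% drawdown limit and 252-day annualisation are assumed values, and both function names are hypothetical.

```python
import math


def vol_target_leverage(daily_returns, target_annual_vol=0.10,
                        max_leverage=3.0, periods_per_year=252):
    """Scale exposure so realised volatility matches the target, up to a cap."""
    n = len(daily_returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    realised_annual_vol = math.sqrt(variance) * math.sqrt(periods_per_year)
    if realised_annual_vol == 0:
        return max_leverage
    return min(max_leverage, target_annual_vol / realised_annual_vol)


def drawdown_breached(equity_curve, max_drawdown=0.15):
    """Return True once the peak-to-trough loss reaches the kill-switch limit."""
    peak = equity_curve[0]
    for value in equity_curve:
        peak = max(peak, value)
        if (peak - value) / peak >= max_drawdown:
            return True
    return False


# Hypothetical inputs (made-up numbers).
returns = [0.01, -0.01, 0.01, -0.01]
print(f"leverage: {vol_target_leverage(returns):.2f}")
print(f"kill-switch: {drawdown_breached([100, 110, 90])}")
```

Sizing inversely to realised volatility cuts exposure automatically in turbulent regimes, while the kill-switch acts as a hard stop that does not depend on the volatility estimate being right.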

Unified LLM × HFT Tech Stack
Hybrid stack blending Python, C++, PyTorch, TensorFlow, FastAPI, PostgreSQL/TimescaleDB, Redis, Docker and AWS with QuantConnect, MT5, CCXT, FIX, Qorinix LLM, LangChain and OpenTelemetry. This is the engineering foundation behind every production trading and AI pipeline at Quant Sigma.

Achievements KPI Dashboard
Institutional-grade quant performance dashboard: 20x+ capital growth, ~100% annualised returns, 65% Sharpe-ratio uplift on LLM-HFT research, Sharpe 2-6 in crypto markets, 5M+ daily predictions at <100ms, and 500K+ documents/day processed in the sentiment engine.
Skills & Expertise
Programming & Frameworks
AI & Machine Learning
Trading & Finance
HFT & Execution
Backend & Databases
Tools & Infrastructure
Technical & Quant Skills
Blockchain & Crypto
Education & Certifications
Education
AI & Data Science for Quantitative Finance
University of Wolverhampton
Graduated with Merit. Research in LSTM neural networks for financial forecasting, NLP sentiment analysis, and machine learning applications in trading systems.
Marketing
The Hong Kong University of Science & Technology
Foundation in business principles and marketing strategies.
Hong Kong Education Bureau
Certifications & Qualifications
The Securities and Futures Commission of Hong Kong
Passed Papers 1, 7, 8, 12
Insurance Authority of Hong Kong
Qualified in Papers I, III, & IV
Harvard University
Introduction to Computer Science
Amazon Web Services
Frequently Asked Questions
Common questions about my experience and expertise
Resources & Publications
A research paper exploring the implementation and effectiveness of algorithmic trading strategies in cryptocurrency markets. This study examines various technical indicators, market microstructure, and execution algorithms.
A comprehensive study on applying various machine learning techniques to predict cryptocurrency market movements. This research examines the effectiveness of different ML models in capturing market patterns and generating actionable trading signals.
An in-depth analysis of multi-frequency trading strategies and their application in modern financial markets, focusing on the integration of different timeframes and market dynamics.
A detailed examination of trade execution challenges in DeFi markets and proposed solutions, including gas optimization, MEV protection, and cross-chain bridging strategies.
Contact Me
Use contact form only
Phone
07920800830
Location
Tilehurst, Reading, UK
