Head of Quants · AI-Native HFT · Institutional Trading Systems Architect

PAK MING CHEUNG

The rare engineer who builds institutional-grade HFT infrastructure and production LLM runtimes from the same keyboard.

MSc Computer Science, AI & Data Science (Merit) · Scaled capital 20x+ with ~100% annualised returns · Architect of Qorinix LLM, our in-house model delivering 5x to 60x faster TTFT than frontier LLMs.

  • 10+ years quantitative trading across Forex · CFDs · Indices · Crypto
  • Sub-100ms ML inference · multi-provider LLM routing · audit-grade decision loop
  • SFC HK Papers 1/7/8/12 · Harvard CS50x · Google MLOps · Gemini Certified Educator

Professional Summary

Head of Quants at Quant Sigma (London), running institutional-grade HFT operations across Forex, CFDs, indices and cryptocurrency. MSc Computer Science (AI & Data Science, Merit) from the University of Wolverhampton, with a dissertation on LLM-Augmented High-Frequency Trading, fusing transformer-based market understanding with sub-millisecond execution.

Proven track record building and operating AI-native trading infrastructure: multi-provider LLM routing across OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, benchmarked with TTFT p50/p95 analytics for latency-critical alpha research; production RAG systems with policy guardrails and human-review escalation for regulated decision flows; and distributed ML inference pipelines serving millions of predictions daily at sub-100ms latency.

Led the design, training and deployment of our in-house Qorinix LLM, a domain-tuned language model engineered for trading-grade decision support. Qorinix delivers 5x to 60x faster TTFT and response times than mainstream frontier LLMs on comparable workloads, enabling sub-millisecond alpha annotation, guardrailed reasoning and evidence-packed signal generation fully inside the firm's own runtime.

Over more than a decade of systematic trading, scaled capital 20x+ with ~100% annualised returns through disciplined volatility-based sizing and cross-asset arbitrage. Credentials: SFC HK Papers 1/7/8/12, Harvard CS50x, Google MLOps and Gemini Certified Educator.

Core Competencies

LLM & Generative AI Platforms

Qorinix in-house LLM, Multi-Provider Routing, RAG, Guardrails, Audit Ledger

AI Platform Architecture

Prompt Registry, Policy Engine, Human-Review Gate, TTFT p50/p95 Analytics

Quant & HFT Leadership

Team building, risk governance, volatility sizing, 20x+ capital growth

HFT Infrastructure

FIX/iLink, CME Aurora, Equinix LD4/LD6, kernel-bypass, sub-ms hot path

ML & Deep Learning

Transformer alpha, LSTM/GRU, XGBoost ensembles, RL execution, MLOps

Crypto & DeFi

Sharpe 2-6, cross-chain arbitrage, on-chain features, early BTC since 2013

Notable Achievements

Systematic Trading (2013 – Present)

Scaled trading capital 20x+ over a decade of disciplined systematic trading, with ~100% annualised returns and drawdowns controlled via volatility-based sizing. Markets: Forex, CFDs, indices, crypto and derivatives. Early Bitcoin investor since 2013, with entry at $80.

Qorinix LLM, In-House Frontier Model

Led end-to-end design, training and deployment of Qorinix LLM, our domain-tuned trading-grade language model. Delivers 5x to 60x faster TTFT and response times than mainstream frontier LLMs on comparable workloads, running fully inside the firm's perimeter.

LLM Infrastructure & AI Platforms

Built LLM Arena, a multi-provider fast-inference benchmark with TTFT p50/p95 analytics across OpenAI, Anthropic, Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare. Shipped an audit-grade AI decision runtime with policy engine, guardrails and human-review gate.

Sub-100ms Distributed Inference

Architected distributed ML inference pipelines serving 5M+ daily predictions at <100ms latency. Shard-aware routing, warm-pool autoscaling, ONNX + fp16 quantisation, multi-region active-active with health-weighted failover.

Technical Leadership & Team Building

Led tech transformation from traditional to AI-augmented trading. Mentored 6+ engineers across multi-cloud (AWS · GCP · Cloudflare) with finance-compliance rigour. Established MLOps discipline and model-governance framework firm-wide.

Research, Publications & Credentials

MSc dissertation: LLM-Augmented High-Frequency Trading, reporting a 65% Sharpe-ratio improvement over baseline. SFC HK Papers 1/7/8/12. Harvard CS50x. Google MLOps Specialization. Gemini Certified Educator (2025-2028).

Risk Governance & Compliance

Drawdown kill-switches, volatility targeting, VaR/ES by strategy with margin projection. Tamper-evident audit trail with content-addressed evidence packs. Human-in-the-loop escalation for regulated decisions. Investment-grade reporting for family-office mandates.

Crypto & Web3 Pioneer

Early Bitcoin investor since 2013, with entry at $80. Sharpe 2-6 in crypto markets. Experience across DeFi protocols, yield farming, liquidity provision and cross-chain arbitrage. Deep expertise across CEX/DEX venues and on-chain analytics (Glassnode).

Professional Experience

Head of Quants · AI-Native HFT & Systems Architect | Quant Sigma (London, UK)
Dec 2023 - Present

Responsibilities:

  • Lead quantitative research, AI-powered trading systems and execution infrastructure for institutional-grade HFT across traditional and crypto markets
  • Multi-signal HFT models: trend following, mean reversion, volatility breakout, queue-based market making, XGBoost/LightGBM ensembles for microstructure signal blending
  • Implemented transformer-based predictive models for market-direction forecasting, delivering material Sharpe-ratio uplift over baseline
  • Reinforcement-learning execution for adaptive position sizing and inventory control; volatility-based sizing with regime switching and scenario stress tests
  • Low-latency execution with millisecond performance; colocation at Equinix LD4/LD6 and AWS London region; real-time WebSocket market-data pipeline with deterministic ingestion
  • Event-driven order-book reconstruction with lock-free ring buffers and kernel-bypass networking; sub-microsecond jitter budget on the hot path
  • Built multi-provider LLM routing layer across OpenAI, Anthropic, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, with cost/latency/risk policy and automatic fail-over
  • Designed prompt registry with version pinning, policy prompts, evidence-pack assembly, and a guardrail & rules engine with human-review escalation for regulated decisions
  • Led end-to-end design, training and deployment of Qorinix LLM, an in-house trading-tuned language model delivering 5x to 60x faster TTFT than frontier LLMs on comparable workloads
  • Custom inference kernels, speculative decoding and trading-corpus fine-tuning; sub-millisecond alpha annotation and evidence-packed signal generation fully inside the firm's runtime
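
The multi-provider routing bullet above (cost/latency/risk policy with automatic fail-over) can be illustrated with a minimal Python sketch. This is not the production router: the `Provider` fields, the `route` function, the thresholds and the "fastest eligible first" ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider record; all fields are illustrative assumptions."""
    name: str
    cost_per_1k_tokens: float    # USD
    p95_ttft_ms: float           # measured p95 time-to-first-token
    risk_tier: int               # 0 = in-perimeter, higher = less trusted
    call: Callable[[str], str]   # sends the prompt, returns the completion


def route(prompt: str, providers: Sequence[Provider],
          max_cost: float, max_ttft_ms: float, max_risk_tier: int) -> Tuple[str, str]:
    """Filter providers by policy, then try them fastest-first until one succeeds."""
    eligible = [p for p in providers
                if p.cost_per_1k_tokens <= max_cost
                and p.p95_ttft_ms <= max_ttft_ms
                and p.risk_tier <= max_risk_tier]
    # Assumed preference order: lowest latency first, cost as the tie-breaker.
    eligible.sort(key=lambda p: (p.p95_ttft_ms, p.cost_per_1k_tokens))
    errors = []
    for provider in eligible:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # fail over to the next eligible provider
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("no provider succeeded: " + "; ".join(errors))
```

A caller would pass the policy limits per request, so a regulated workflow can set `max_risk_tier=0` and keep traffic inside the firm's perimeter while a research job allows external providers.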

Key Achievements:

  • ~100% annualised returns via volatility-based sizing and scenario stress tests
  • Distributed ML inference pipeline delivering <100ms latency for real-time signal generation
  • LLM Arena multi-provider fast-inference benchmark with TTFT p50/p95 analytics
  • Audit ledger tying every AI call to workflow, policy, cost and support lineage
  • NLP sentiment engine ingesting 500K+ articles, social and alt-data daily
  • Qorinix LLM, 5x to 60x faster TTFT vs mainstream frontier LLMs on comparable trading workloads
  • MLOps discipline: model registry, evaluation suites, canary deployment, drift monitoring, per-strategy cost ledger

Skills & Technologies:

HFT, Qorinix LLM, LLM Routing, RAG, Transformer Alpha, Python, C++, PyTorch, TensorFlow, FastAPI, PostgreSQL, TimescaleDB, Redis, Docker, AWS, QuantConnect, MT5, CCXT, FIX, iLink, CME Aurora, Equinix LD4/LD6, Colocation, Kernel Bypass, Reinforcement Learning, Backtesting, Risk Management, Volatility Sizing, OpenTelemetry, LangChain, Policy Engine, Guardrails, Audit Ledger

Lead AI Engineer & Technical Architect · Quant & Algorithmic Trader | Pacific Cloud Computing Ltd. (Hong Kong & Remote UK)
Jan 2015 - Dec 2024

Responsibilities:

  • Dual-track: spearheaded AI transformation (Dec 2021 - Dec 2024) establishing the firm's AI/ML practice and intelligent systems processing millions of predictions daily
  • Designed and deployed distributed ML inference system achieving <100ms latency, serving 5M+ requests/day
  • Built comprehensive RAG system (LangChain + vector DB) reducing information retrieval time by 85% while maintaining 94% accuracy
  • Developed end-to-end MLOps pipeline with automated retraining and A/B testing, improving model performance 40% Q-over-Q
  • Implemented transformer-based sentiment analysis processing 500K+ documents daily for market intelligence
  • Established AI/ML best practices, conducted architecture reviews and mentored a team of 6 engineers
  • Ran a personal systematic-trading programme throughout the full tenure, generating institutional-grade returns
  • Python-based predictive models and portfolio optimisation with rigorous volatility-based risk sizing
  • Enterprise platforms 2015-2021: multi-tenant SaaS, 100+ clients, 1M+ daily API requests, 99.9% uptime
  • Analytics Dashboard Platform: real-time visualisation, 10GB+ daily data, 60% reduction in report generation time

Key Achievements:

  • Sharpe 1.5+ (Forex/CFDs) and 2-6 (crypto); 20x+ capital growth
  • Pioneered blockchain/cryptocurrency expertise: DeFi protocols, yield farming, liquidity provision, cross-chain arbitrage
  • Authored investment-grade quantitative performance reports for family-office investments
  • Shipped multi-tenant SaaS serving 100+ clients at 99.9% uptime
  • E-commerce Integration Suite: multi-platform API layer with 5+ payment-gateway integrations

Skills & Technologies:

Python, PyTorch, LangChain, Pinecone, FastAPI, React, Node.js, MongoDB, PostgreSQL, Docker, AWS, MT4/MT5, Binance API, Kraken API, Glassnode, DeFi, NFTs, Web3, Yield Farming, MLOps, RAG, A/B Testing, Transformer Models

Senior Product Manager (Technical) / Web Manager | Groupon.com (Hong Kong)
Apr 2013 - Dec 2014

Responsibilities:

  • Led strategic account management and coordinated with international merchants for brand positioning
  • Developed and maintained project plans, including cost estimation and resource allocation
  • Oversaw inventory management, supply chain coordination, and pricing structure
  • Conducted comprehensive market analysis to support business operations

Key Achievements:

  • 35% revenue growth through strategic pricing and positioning
  • 20% operational cost reduction via process optimisation

Skills & Technologies:

Product Management, Strategic Planning, Market Analysis, Operations, E-commerce, International Business, Brand Positioning, Merchant Relations, Pricing Strategy

Senior Operations Manager | SoManyCall Telecom (Hong Kong)
Mar 2008 - Apr 2013

Responsibilities:

  • Oversaw strategic and operational aspects within the telecommunications software sector
  • Led multi-disciplinary team, implementing development programs
  • Managed entire project lifecycle for custom software solutions
  • Improved operational workflow and service quality by standardising procedures

Key Achievements:

  • 25% annual growth rate through strategic leadership
  • 30% reduction in customer churn

Skills & Technologies:

Operations Management, Team Leadership, Project Management, Client Relations, Telecommunications, Software Solutions

Quant Sigma · AI-Native HFT

Flagship Projects & Research

A curated view of the AI-native trading systems, LLM infrastructure and quant research I architect at Quant Sigma.

Qorinix LLM, In-House Trading-Tuned Frontier Model
LLM Infra

End-to-end design, training and deployment of Qorinix LLM: a multi-billion-token trading-intelligence corpus, mixture-of-experts routing with alpha-reasoning, risk-critique and compliance heads, custom inference kernels and speculative decoding. 5x to 60x faster TTFT than mainstream frontier LLMs on comparable workloads, running fully inside the firm's perimeter.

LLM-Augmented High-Frequency Trading System
MSc Research

MSc dissertation: novel architecture fusing LLM narrative understanding with real-time HFT across Forex, CFDs, indices and crypto. Custom transformer with attention over microstructure features (order-book depth, trade-flow imbalance, spread volatility), ensemble gating and latency-aware inference scheduling. 65% Sharpe-ratio uplift on 5-year out-of-sample evaluation.

LLM Arena, Multi-Provider Fast-Inference Benchmark
LLM Infra

Side-by-side real-time benchmark across OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Qwen, OpenRouter, NVIDIA and Cloudflare, streaming token-by-token with live comparison charts. Tracks TTFT, total time, tokens/sec with p50/p95 analytics, model-cost ledger and per-prompt leaderboard. Edge deployment on Cloudflare Pages; Workers-backed API proxy with per-user key isolation.
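
As an illustration of the TTFT p50/p95 analytics described above, here is a minimal Python sketch. It is not LLM Arena's code: the function names, the nearest-rank percentile method and the idea of timing a token iterator are assumptions chosen for clarity.

```python
import math
import time
from typing import Callable, Dict, Iterable, Iterator


def measure_ttft(stream_factory: Callable[[], Iterator[str]]) -> float:
    """Return seconds elapsed from request start until the first token arrives."""
    start = time.perf_counter()
    stream = stream_factory()   # hypothetical: opens a streaming completion
    next(stream)                # blocks until the first token is yielded
    return time.perf_counter() - start


def percentile(samples: Iterable[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of data at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise(ttfts_ms: Iterable[float]) -> Dict[str, float]:
    """Collapse a batch of TTFT measurements into the p50/p95 pair."""
    data = list(ttfts_ms)
    return {"p50": percentile(data, 50), "p95": percentile(data, 95), "n": len(data)}
```

Reporting p95 alongside p50 matters for latency-critical use because a provider with a good median can still have a long tail; a single slow outlier in a small batch shows up in p95 but not in p50.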

Audit-Grade AI Decision Loop
Governance

Closed-loop flow: policy check → evidence-pack retrieval → prompt control → model router → action runtime → audit ledger → policy feedback. Every call versioned, costed, support-traceable and replayable. Human-in-the-loop escalation for regulated decisions, tamper-evident audit trail with content-addressed evidence packs, and per-strategy cost attribution.
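
One common way to obtain a tamper-evident trail with content-addressed evidence packs is a hash chain over canonical JSON. The Python sketch below shows that general technique under stated assumptions; the `AuditLedger` class, its field names and the choice of SHA-256 are hypothetical and do not describe the actual ledger.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def content_address(payload: dict) -> str:
    """SHA-256 of canonical JSON; identical content always yields the same address."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


@dataclass
class AuditLedger:
    entries: list = field(default_factory=list)

    def append(self, record: dict, evidence_pack: dict) -> dict:
        """Store a record linked to the previous entry and to its evidence address."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "record": record,
            "evidence_address": content_address(evidence_pack),
            "prev_hash": prev_hash,
        }
        entry = {**body, "entry_hash": content_address(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: entry[k] for k in ("record", "evidence_address", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["entry_hash"] != content_address(body):
                return False
            prev_hash = entry["entry_hash"]
        return True
```

Because each entry hash covers the previous entry's hash, changing any historical record invalidates every later entry, which is what makes replay and after-the-fact review trustworthy.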

Sub-100ms Inference Mesh
ML Infra

Distributed ML inference pipeline serving millions of predictions/day at <100ms latency across signal generation, sentiment scoring and anomaly detection. Shard-aware request routing, warm-pool autoscaling, ONNX-compiled models with fp16 quantisation, cache-through Redis tiering. Multi-region active-active with health-weighted failover.
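
Health-weighted failover can be sketched in a few lines of Python. This is one plausible reading of the term, not the mesh's actual logic: the `Region` type, the 0-to-1 health score and the `min_health` cut-off are assumptions for illustration.

```python
import random
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Region:
    name: str
    health: float  # assumed score: 0.0 (down) to 1.0 (fully healthy)


def pick_region(regions: Sequence[Region], rng=random, min_health: float = 0.2) -> Region:
    """Pick a region with probability proportional to its health score.

    Regions below min_health receive no traffic at all, so a failing
    region drains gradually and is then removed from rotation.
    """
    candidates = [r for r in regions if r.health >= min_health]
    if not candidates:
        raise RuntimeError("no healthy region available")
    weights = [r.health for r in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Weighting by health rather than switching on a hard up/down flag keeps both regions active and shifts load smoothly as one degrades.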

Real-Time Risk & P&L Dashboard
Private

Live portfolio monitoring with volatility targeting, drawdown kill-switches, per-strategy P&L attribution and cost-ledger reconciliation. Tick-level exposure dashboard across venues, automated breach alerts, scenario stress tests and VaR/ES by strategy with margin projection. Heatmaps for correlation drift and regime-change signals.
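
For readers unfamiliar with the risk measures named above, the following Python sketch shows textbook historical-simulation VaR/ES and a simple volatility-targeting rule. It is a generic illustration, not the dashboard's implementation; the function names, the 252-period annualisation and the leverage cap are assumptions.

```python
import math
import statistics
from typing import Sequence, Tuple


def historical_var_es(returns: Sequence[float], confidence: float = 0.95) -> Tuple[float, float]:
    """Historical-simulation VaR and ES, both as positive loss fractions."""
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    if not losses:
        raise ValueError("no returns")
    # Number of observations in the worst (1 - confidence) tail, at least one.
    tail_size = max(1, math.floor(round(len(losses) * (1 - confidence), 9)))
    tail = losses[:tail_size]
    var = tail[-1]              # smallest loss still inside the tail
    es = sum(tail) / len(tail)  # average loss inside the tail
    return var, es


def vol_target_leverage(returns: Sequence[float], target_annual_vol: float = 0.10,
                        periods_per_year: int = 252, max_leverage: float = 3.0) -> float:
    """Scale exposure so that realised volatility matches the target, with a cap."""
    realised = statistics.stdev(returns) * math.sqrt(periods_per_year)
    if realised == 0:
        return max_leverage
    return min(max_leverage, target_annual_vol / realised)
```

ES is always at least as large as VaR at the same confidence level, since it averages the losses at and beyond the VaR threshold; volatility targeting then shrinks position size when realised volatility rises above the target.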

Unified LLM × HFT Tech Stack
Quant

Hybrid stack blending Python, C++, PyTorch, TensorFlow, FastAPI, PostgreSQL/TimescaleDB, Redis, Docker and AWS with QuantConnect, MT5, CCXT, FIX, Qorinix LLM, LangChain and OpenTelemetry. This is the engineering foundation behind every production trading and AI pipeline at Quant Sigma.

Achievements KPI Dashboard
Quant

Institutional-grade quant performance dashboard: 20x+ capital growth, ~100% annualised returns, 65% Sharpe-ratio uplift on LLM-HFT research, Sharpe 2-6 in crypto markets, 5M+ daily predictions at <100ms, and 500K+ documents/day processed in the sentiment engine.

Skills & Expertise

Programming & Frameworks

Python (NumPy, Pandas, scikit-learn) · 92%
TypeScript / JavaScript · 90%
React 18/19 / Next.js 15 · 90%
C++ (HFT optimisation) · 80%
C# / Unity 2022 · 78%
MQL5 (MT4/MT5) · 95%
HTML/CSS / Tailwind / shadcn/ui · 90%
SQL (PostgreSQL, TimescaleDB) · 85%

AI & Machine Learning

Qorinix LLM (In-House Frontier Model) · 95%
Multi-Provider LLM Routing & Fallback · 94%
LLM Integration (GPT-5, Claude, Gemini, DeepSeek, Qwen) · 92%
LangChain / RAG / Vector DBs (Pinecone, Weaviate, pgvector) · 90%
Deep Learning (LSTM, Transformers, PyTorch) · 88%
NLP & Sentiment Analysis (500K+ docs/day) · 88%
XGBoost / LightGBM / Random Forest · 88%
MLOps, Canary Deployment, Drift Monitoring · 85%
Reinforcement Learning (Execution Optimisation) · 82%

Trading & Finance

Algorithmic Trading & Strategy · 95%
Quantitative Analysis · 92%
Risk Management / Volatility Sizing · 90%
Portfolio Optimisation · 88%
Market Microstructure · 88%
Derivatives (Options, Futures, Perps) · 85%

HFT & Execution

Low-Latency Execution (sub-ms hot path) · 92%
FIX API / iLink Trading · 90%
Colocation (Equinix LD4/LD6, CME Aurora) · 88%
Kernel-Bypass Networking / Lock-Free Ring Buffers · 85%
Market Making / Queue-based · 85%
Statistical Arbitrage / Transformer Alpha · 88%
WebSocket / Event-driven Architecture · 90%

Backend & Databases

FastAPI / Django / Flask / Node.js · 90%
PostgreSQL / MongoDB / Redis · 88%
TimescaleDB / InfluxDB (Time Series) · 85%
Supabase / Firebase / Stripe · 85%
React Native / Expo SDK 52 · 82%
Recharts / D3.js / Data Viz · 80%

Tools & Infrastructure

AWS (EC2, Lambda, S3, RDS) · 85%
Docker / Kubernetes / CI/CD · 82%
Git / GitHub / pnpm · 92%
Vercel / Amplify / Netlify · 88%
Google Cloud / Azure · 78%
Jupyter Notebook / Google Colab · 88%

Technical & Quant Skills

Backtesting & Walk-forward Optimisation · 95%
Statistical Modelling / Monte Carlo · 92%
Time Series Analysis / Forecasting · 90%
Data Pipeline Engineering (Airflow, ETL) · 85%
QuantConnect / TradingView / CCXT · 92%
Glassnode / Tick Data Suite · 88%

Blockchain & Crypto

Bitcoin / Ethereum / DeFi Protocols · 88%
Cross-exchange Arbitrage / Funding Rate · 85%
On-Chain Analytics / Whale Tracking · 82%
Binance / Bybit / Kraken / Gate.io APIs · 88%
Solidity (Basic) / ERC-721 / ERC-1155 · 70%

Education & Certifications

Education

MSc in Artificial Intelligence (Merit)
2023 - 2025 · Completed

AI & Data Science for Quantitative Finance

University of Wolverhampton

Graduated with Merit. Research in LSTM neural networks for financial forecasting, NLP sentiment analysis, and machine learning applications in trading systems.

Bachelor of Business Administration (BBA)
1995 · Completed

Marketing

The Hong Kong University of Science & Technology

Foundation in business principles and marketing strategies.

Hong Kong Advanced Level Examination (HK A-Level)
1995 · Completed

Hong Kong Education Bureau

Hong Kong Certificate of Education Examination (HKCEE)
1993 · Completed

Hong Kong Education Bureau

Certifications & Qualifications

Licensing Examination for Securities and Futures Intermediaries (LE)
2021

The Securities and Futures Commission of Hong Kong

Passed Papers 1, 7, 8, 12

Insurance Intermediaries Qualifying Examination (IIQE)
2013

Insurance Authority of Hong Kong

Qualified in Papers I, III and IV

Harvard CS50x
2023

Harvard University

Introduction to Computer Science

AWS Certified Solutions Architect Associate
In Progress

Amazon Web Services

AWS Certified Developer Associate
In Progress

Amazon Web Services

Resources & Publications

Systematic Trading Approaches for Digital Asset Markets
Research Paper · 2023
Author: Pak Ming Cheung
Pacific Cloud Computing Limited

A research paper exploring the implementation and effectiveness of algorithmic trading strategies in cryptocurrency markets. This study examines various technical indicators, market microstructure, and execution algorithms.

Predictive Analytics and AI Methods for Digital Asset Price Forecasting
Research Paper · 2020
Author: Pak Ming Cheung
Pacific Cloud Computing Limited

A comprehensive study on applying various machine learning techniques to predict cryptocurrency market movements. This research examines the effectiveness of different ML models in capturing market patterns and generating actionable trading signals.

Time-Scale Integration in Modern Automated Trading Systems
Research Paper · 2022
Author: Pak Ming Cheung
Pacific Cloud Computing Limited

An in-depth analysis of multi-frequency trading strategies and their application in modern financial markets, focusing on the integration of different timeframes and market dynamics.

Transaction Efficiency Enhancement in Distributed Financial Networks
Research Paper · 2021
Author: Pak Ming Cheung
Pacific Cloud Computing Limited

A detailed examination of trade execution challenges in DeFi markets and proposed solutions, including gas optimization, MEV protection, and cross-chain bridging strategies.

Schedule an Interview
Book a time slot that works for you using my Calendly scheduling system.

Contact Me

Email

Use contact form only

Phone

07920800830

Location

Tilehurst, Reading, UK