Melbourne, Australia / Australian Permanent Resident
Liangjun (Lance) Song, PhD
Senior ML Engineer · Data Scientist · PhD in Search & Recommendation Systems. I build production AI across the stack: search ranking, retrieval pipelines, content understanding, LLM-powered workflows, RAG, and agentic automation.
6+ years
AI, ML, search, recommendation, and production data systems
10+ GitHub systems
Agent workbenches, contract AI, multimodal RAG, sourcing tools, and AI education projects
100M+ events
Designed ML infrastructure and feature pipelines at Redbubble
10% CTR lift
Search and recommendation improvements using vector search and MLOps
What I Build
My background spans classical IR and ML (PhD research on personalised ranking and search, then four years building search, recommendation, and content-understanding systems at Redbubble over 100M+ user events) through to the modern AI layer: LLM orchestration, RAG, agentic workflows, evaluation engineering, and CI/CD-backed deployment.
I’m strongest where retrieval quality, ML system design, and data engineering matter together: search that has to be measurable, recommendations that have to survive a real catalog, and agents that have to work reliably across messy enterprise data.
Search
Recommendation
Ranking
Vector Search
A/B Testing
RAG
LangGraph
LangChain
FastAPI
Python
Pandas · NumPy
Scikit-learn
PyTorch
PostgreSQL
Azure OpenAI
GCP Vertex AI
Docker
MLOps
Featured Systems
My GitHub projects span applied ML and AI products, search and retrieval systems, agentic developer tooling, data science workflows, and human-facing AI applications. I separate original builds, collaborations, and research forks clearly.
Open Projects Dashboard →
Healthcare AI · Contract Intelligence
Termly
Australian medical contract automation: scanned PDF extraction, OCR correction, structured clause/entity extraction, Medicare/PBS validation, and risk assessment.
- FastAPI · Pydantic · Claude Sonnet · Azure Document Intelligence · Tesseract OCR · Docker
Repository
Agentic Developer Tooling
Agentic Workbench
Browser-based remote orchestration for single coding agents (Remote Agent Workbench) plus a multi-agent parallel control plane across isolated Git worktrees (AgentForge), covering both single-agent and parallel multi-agent coding workflows.
- React · TypeScript · Express · Socket.IO · xterm.js · PTY · SQLite · PM2 · Git Worktrees
Workbench
AgentForge
AI Curriculum Design · Instructor
Dispatch.AI
Six-week progressive course project that I designed and taught to 13 students: they build a production AI booking assistant, starting from Pydantic state and Redis, moving through LangGraph agents, MCP tools, multi-agent routing, NeMo Guardrails, and RAGAS evaluation, and finishing with a Docker + Render capstone.
- FastAPI · LangGraph · Groq · MCP · NeMo Guardrails · RAGAS · Docker · Render
Repository
Multimodal RAG
Video-RAG
Visual retrieval pipeline: sample video frames → CLIP embeddings → ChromaDB → natural-language queries with timestamped evidence and multi-provider LLM answers.
Repository
Operator Automation
Toy Gifting System
Back-office sourcing tool: SQLite catalog ingestion, supplier/compliance metadata, operator feedback loop, bilingual product data, and Three.js 3D bin-packing visualisation.
Repository
Creative AI Workflow
BranchFlow
Nonlinear writing and prompt-orchestration workbench with branching story state, per-branch LLM prompt drawers, React Flow mind maps, and Obsidian Canvas export.
Repository
Work Experience
Apr 2025 - Present
WiseTech Global - AI Engineer, AI/ML Group
AI Engineer on CWBot and TriageAgent. Architect stateful multi-turn LLM agents with LangGraph and MCP, design RAG workflows over enterprise knowledge bases, implement guardrails and evaluation pipelines, and own CI/CD and deployment standards for production AI services.
Feb 2025 - Present
SGLang Framework - Open Source Contributor / Committer
Contribute runtime and serving optimizations for LLM backends used by agentic systems, including work on model execution reliability, prompt/runtime control, and scalable inference behavior.
Jan 2021 - Jan 2025
Redbubble - Data Scientist (Search & Recommendation · Content AI · Moderation)
Built and owned search ranking, recommendation, content-classification, and moderation ML systems. Delivered vector-search improvements with Marqo and GCP Vertex AI MLOps pipelines, lifting CTR by 10% and add-to-cart by 0.5%. Designed feature pipelines and ML infrastructure over 100M+ user events and led the GA4 analytics migration.
Jul 2012 - Jun 2013
Microsoft Research Asia - Research Intern
Worked on web search, data mining, and Autosub, a collaborative subtitle-generation system using speech-to-text APIs and automated processing pipelines.
Research, Education, Awards
Education
PhD in Computer Science — Search & Recommendation Systems, RMIT University. B.Sc. in Computer Science, Harbin Institute of Technology.
Publications
VLDB Journal, DASFAA, ADC. Research focus: personalised ranking, preference adjustment, and continuous summarisation.
Awards
ADC Best Student Paper Award, DASFAA Best Student Paper Runner-Up, Google Code Jam Top 1000, ACM-ICPC Asia Regional Silver Medal.
Certifications
Google Cloud Professional Data Engineer, Microsoft Azure AI Fundamentals, GCP data lake / warehouse modernization, deep learning and NLP specializations.
Hiring Signal
My strongest fit is where search, retrieval, and ML engineering meet production AI: meaningful evaluation metrics, large or noisy catalogs, data pipelines that need to be reliable, and systems that need to improve measurably over time — not just demo well once. PhD-trained in search and recommendation, hands-on in data science and MLOps, now building LLM-powered systems end-to-end.
Start a conversation