Liangjun (Lance) Song, PhD

Senior Data Scientist and ML Engineer. PhD in search and recommendation — production search ranking, experimentation, RAG and agentic systems.

Melbourne, Australia / Australian Permanent Resident

Senior Data Scientist & ML Engineer · PhD in Search & Recommendation Systems. I build production AI across the stack: search ranking, retrieval pipelines, experimentation and measurement, content understanding, RAG, and agentic automation — and I explain it to the people who have to use it.

6+ years: data science, ML, search and recommendation in production

100M+ events: feature pipelines and ML infrastructure designed at Redbubble

+10% CTR: search and recommendation lift from vector search and MLOps, measured by A/B test

18 systems: shipped projects and case studies across DS, AI systems, and client work

What I Build

My background starts in classical IR and ML — PhD-level research on personalised ranking and search, then four years as a Data Scientist at Redbubble building search, recommendation, and content-understanding systems over 100M+ user events. It runs through to the modern AI layer: LLM orchestration, RAG, agentic workflows, evaluation engineering, and CI/CD-grade deployment.

I'm strongest where retrieval quality, ML system design, and data engineering matter together: search that has to be measurable, recommendations that have to survive a real catalogue, and agents that have to work reliably across messy enterprise data. The other half of the job is communication: I've owned experiment readouts for product stakeholders, taught 12+ lectures to a bootcamp cohort, and run onsite discovery with clinicians and contract reviewers who don't speak engineering.

Search · Recommendation · Ranking · Vector Search · A/B Testing · Experiment Design · RAG · LangGraph · FastAPI · Python · SQL · Pandas · NumPy · Scikit-learn · PyTorch · BigQuery · Azure OpenAI · GCP Vertex AI · Docker · MLOps

Data Science Deep Dive

The work I go deepest on: search ranking and retrieval, the experimentation foundation that proves a model moved something, and content understanding at catalogue scale. These are employer systems rather than public repos — each case study opens with its system map, experiment design, and measured impact.

Data Science · Search & Recommendation

Search Ranking & Vector Retrieval

Redbubble · Mar 2023 – Jan 2025

+10% CTR · +0.5% add-to-cart rate · 100M+ user events modelled

Semantic retrieval and ranking for a marketplace with 100M+ user events

Marqo · GCP Vertex AI · BigQuery · Python · Vector Search · MLOps · Offline Evaluation · A/B Testing

Data Science · Experimentation

Experimentation & Metrics Foundation

Redbubble · Jan 2021 – Jan 2025

Making A/B results trustworthy — GA4 to BigQuery, offline eval, ship gates

A/B Testing · GA4 · BigQuery · Experiment Design · Metric Design · Offline Evaluation · Python · SQL

Data Science · Content AI

Content Understanding & Moderation

Redbubble · Jan 2021 – Mar 2023

Reduced IP moderation risk · Improved operational throughput

Language-image models for classification, moderation and duplicate detection

Language-Image Models · Image Embeddings · Classification · Anomaly Detection · Python · Taxonomy Design · SEO

Featured Systems

Applied AI products, data-science case studies, agentic developer tooling, and client engagements — original builds, collaborations, and research forks kept clearly apart.

Healthcare AI · Contract Intelligence

Termly

Document AI for Australian Healthcare Contracts

Python · FastAPI · Pydantic · Claude Sonnet · Azure Doc Intelligence · Tesseract OCR · Docker

Forward-Deployed · Client Engagement

Ultra OT

Occupational therapy practice · 2026

Operations system for an occupational therapy practice — built onsite

Next.js · TypeScript · Supabase · PostgreSQL · Row-Level Security · Google Routes API · Geospatial Clustering

Client engagement, private codebase

Voice AI · Collaboration

CallForMe

AI phone assistant — transcripts, summaries, risk analysis, hold-for-me

Next.js · TypeScript · WebSocket · Realtime Audio · LLM Summarisation · PostgreSQL

Collaboration with a partner developer, private codebase

Data Science · Search & Recommendation

Search Ranking & Vector Retrieval

Redbubble · Mar 2023 – Jan 2025

Semantic retrieval and ranking for a marketplace with 100M+ user events

Marqo · GCP Vertex AI · BigQuery · Python · Vector Search · MLOps · Offline Evaluation · A/B Testing

Internal production work at Redbubble, no public repo

Agentic Developer Tooling

Agentic Workbench

Remote orchestration + multi-agent control plane for AI coding agents

React · TypeScript · Express · Socket.IO · xterm.js · SQLite · PM2 · Git Worktrees · PTY

Remote Agent · AgentForge

Teaching · Curriculum Design · LangGraph

Dispatch.AI

Australia IT Group · Feb – May 2026

6-week AI engineering course project — curriculum design + hands-on instruction

Python · FastAPI · LangGraph · Groq · Pydantic · Redis · MCP · NeMo Guardrails · RAGAS · Docker · Render

Communication & Enablement

Most of what I build has to be handed to someone who didn't build it. That has meant teaching a 13-student cohort to ship a deployed multi-agent system, running 12+ hands-on lectures for engineers moving into AI, introducing agentic workflows to accounting professionals with no engineering background, and scoping two client systems from onsite interviews with clinicians and contract reviewers. Translating a technical system into the language of the people who own the problem is a skill I practise deliberately, not a side effect.

Teaching · Curriculum Design · LangGraph

Dispatch.AI

13 students · Feb – May 2026

6-week AI engineering course project — curriculum design + hands-on instruction

Audience — Career-change and early-career engineers with no prior agentic-AI experience

Teaching · Applied AI Engineering

AI Engineer Bootcamp

28+ contact hours · Feb – May 2026

Guest Instructor at Australia IT Group — RAG, Agents, Fine-tuning

Audience — Working engineers moving into AI engineering — mixed backgrounds, no shared ML baseline

Workshop · Non-technical Audience

ANZCC AI Accounting Workshop

3 · Jun 2026

Agentic AI for accounting professionals — a non-technical audience

Audience — Accounting professionals — domain experts with no engineering background

Discovery · Stakeholder Communication

Forward-Deployed Discovery

2 client engagements · 2026

How I scope a client system before writing any of it

Audience — Non-technical domain experts — clinicians, practice admin, contract reviewers

Work Experience

Apr 2025 - Jun 2026

WiseTech Global - AI Engineer, AI/ML Group

AI Engineer on CWBot and TriageAgent. Architected stateful multi-turn LLM agents with LangGraph and MCP, designed RAG workflows over enterprise knowledge bases, implemented guardrails and evaluation pipelines, instrumented request tracing and token/cost accounting, and owned CI/CD and deployment standards for production AI services.

2026

Ultra OT & Termly - Lead Engineer, onsite client engagements

Two forward-deployed engagements as sole engineer. Ran discovery with clinicians and admin staff before scoping an occupational therapy practice's operations system, including route-aware scheduling via geospatial clustering. Delivered document AI for Australian medical contract automation in a regulated healthcare setting, designed so output was checkable rather than trusted blindly.

Feb 2025 - Present

SGLang Framework - Open Source Contributor

Contribute runtime and serving optimisations to an LLM inference framework used by agentic systems, spanning the backend runtime, distributed serving, and prompt-language features, for models including DeepSeek R1, Llama 3 and Qwen.

Feb 2026 - Jun 2026

Australia IT Group & ANZCC - Guest Instructor, Workshop Facilitator

12+ bootcamp sessions on production RAG, vector search, multi-agent systems, QLoRA fine-tuning and RAGAS evaluation. Designed Dispatch.AI, a 6-week project taking 13 students to a deployed AI booking assistant. Delivered a hands-on agentic AI workshop for accounting professionals, adapting technical content for a non-technical business audience.

Mar 2023 - Jan 2025

Redbubble - Data Scientist, Search & Recommendation

Led search and recommendation enhancements with Marqo vector search and GCP Vertex AI MLOps pipelines, improving add-to-cart by 0.5% and CTR by 10%. Owned production ML workflows end to end — offline evaluation, experiment design, deployment coordination and post-launch metric review — communicating results and model logic to product stakeholders. Designed ML infrastructure over 100M+ user events and drove the GA4 to BigQuery analytics migration underpinning A/B testing.

Jan 2021 - Mar 2023

Redbubble - Data Scientist, Content & Discovery

Shipped production content-classification and moderation systems using language-image models, reducing IP moderation risk and improving operational throughput. Built image duplicate-detection pipelines and data-quality/anomaly-detection workflows, and ran tagging, taxonomy and SEO experiments to improve product discovery.

Jul 2012 - Jun 2013

Microsoft Research Asia - Research Intern

Web Search and Data Mining Group. Developed Autosub, a collaborative subtitle-generation system using speech-to-text APIs, and contributed to web page mining projects using data mining and time-series analysis.

Research, Education, Awards

Education

PhD in Computer Science — Search & Recommendation Systems, RMIT University. B.Sc. in Computer Science, Harbin Institute of Technology.

Publications

VLDB Journal, DASFAA, ADC. Research focus: personalised ranking, preference adjustment, and continuous summarisation.

Awards

ADC Best Student Paper Award, DASFAA Best Student Paper Runner-Up, Google Code Jam Top 1000, ACM-ICPC Asia Regional Silver Medal.

Certifications

Google Cloud Professional Data Engineer, Microsoft Azure AI Fundamentals, GCP data lake / warehouse modernization, deep learning and NLP specializations.

Hiring Signal

My strongest fit is where search, retrieval, and ML engineering meet production AI: meaningful evaluation metrics, large or noisy catalogues, data pipelines that need to be reliable, and systems that need to improve measurably over time, not just demo well once. I am PhD-trained in search and recommendation, hands-on in data science and MLOps, and now building LLM-powered systems end to end, and I can explain any of it to the people who have to live with it.
