Justin Barry

ML research engineer & applied scientist building intelligent systems

Former Applied Scientist @ Amazon • MS Computer Science • BS Math+CS

Machine Learning Applied Scientist and Research Engineer. Former Amazon ML Scientist (Prime Video).

I design and ship machine learning systems—generative and discriminative—across vision, language, and structured data. I own the math and the PyTorch. I build architectures from first principles when off-the-shelf fails, and I build agentic systems: multi-agent LLM pipelines that generate, critique, and rank.

My edge is messy problems. When baselines don't work and the objective isn't obvious, I turn ambiguity into a clear loss function and a system you can deploy.

I work remotely as an embedded research engineer—either as a full-time hire or via my LLC on longer-term engagements and scoped projects.

Available for hire • Open to full-time and contract opportunities

Featured Projects

LLM Repo Agent: Bug-Fixing Code Agent

A deterministic agent system for automated bug fixing. The Python driver owns all control flow—iteration limits, tool dispatch, test execution, reflection triggers—while the LLM handles reasoning within strict boundaries.

Key engineering choices: a JSON tool protocol with schema validation, multi-turn ChatCompletion-style LLM adapters that keep the tool-call history in context, sandboxed repo copies for safe execution, Reflexion-style reflection triggered by driver-loop errors and failed tests, and multithreaded evaluation with Monte Carlo rollouts across task suites.
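A minimal sketch of the driver-owns-control-flow idea, assuming a toy two-tool registry. The tool names, the `llm` and `execute` callables, and the history format are hypothetical stand-ins for illustration, not the project's actual interfaces.

```python
import json

# Hypothetical tool registry: tool name -> required argument names and types.
TOOLS = {
    "read_file": {"required": {"path": str}},
    "run_tests": {"required": {}},
}


def validate_tool_call(raw: str) -> dict:
    """Parse an LLM reply and check it against the tool schema."""
    call = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("tool")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    args = call.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a JSON object")
    for key, typ in TOOLS[name]["required"].items():
        if not isinstance(args.get(key), typ):
            raise ValueError(f"argument {key!r} must be {typ.__name__}")
    return {"tool": name, "args": args}


def run_driver(llm, execute, max_iters: int = 5) -> dict:
    """The driver, not the LLM, enforces the iteration limit and stop rule."""
    history = []
    for step in range(max_iters):
        raw = llm(history)
        try:
            call = validate_tool_call(raw)
        except ValueError as err:
            # Invalid output is fed back as an error instead of being executed.
            history.append({"role": "driver", "error": str(err)})
            continue
        result = execute(call)
        history.append({"role": "tool", "call": call, "result": result})
        if call["tool"] == "run_tests" and result.get("passed"):
            return {"status": "fixed", "steps": step + 1, "history": history}
    return {"status": "gave_up", "steps": max_iters, "history": history}
```

In this sketch the model only ever proposes a tool call. Whether the call runs, and when the loop ends, is decided in plain Python.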

Fine-tuning pipeline: Teacher traces from GPT-4o generate step-level SFT data, followed by DPO preference pairs from pass/fail rollouts. The goal is distilling reliable tool-calling behavior into smaller open-weight models like Qwen2.5-7B.
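One way the pass/fail rollouts could be turned into DPO preference pairs is sketched below. The record keys (`task_id`, `prompt`, `response`, `passed`) are assumed for illustration, and pairing every passing rollout with every failing rollout of the same task is one possible policy rather than the project's confirmed method.

```python
from itertools import product


def build_preference_pairs(rollouts):
    """Pair passing and failing rollouts that belong to the same task.

    Each rollout is a dict with hypothetical keys:
    task_id, prompt, response, passed.
    """
    by_task = {}
    for rollout in rollouts:
        by_task.setdefault(rollout["task_id"], []).append(rollout)

    pairs = []
    for group in by_task.values():
        passed = [r for r in group if r["passed"]]
        failed = [r for r in group if not r["passed"]]
        # Tasks where every rollout passed (or every one failed) yield no pairs.
        for good, bad in product(passed, failed):
            pairs.append({
                "prompt": good["prompt"],
                "chosen": good["response"],
                "rejected": bad["response"],
            })
    return pairs
```

Tasks with only passes or only failures carry no preference signal under this scheme, so they drop out of the DPO set.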


Joint Image+Text Rectified Flow Training Loop

Building a joint image and text Rectified Flow model from scratch in PyTorch. This project demonstrates the complete training loop implementation, covering the mathematical foundations and practical engineering of modern generative models.

What it covers: Rectified flow theory, velocity field parameterization, joint image-text conditioning, noise scheduling, and the training objective that enables straight-line probability paths between noise and data distributions.
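The straight-line training target can be sketched in a few lines of plain Python. This is a simplified illustration using lists of floats instead of PyTorch tensors, and it assumes the convention that t = 0 is noise and t = 1 is data. The function names are hypothetical.

```python
import random


def rectified_flow_example(x1, rng):
    """Build one training example for a data point x1 (a list of floats)."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    t = rng.random()  # time drawn uniformly from [0, 1)
    # Point on the straight line between noise (t = 0) and data (t = 1).
    x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    # The velocity along a straight line is constant: data minus noise.
    target = [b - a for a, b in zip(x0, x1)]
    return x_t, t, target


def velocity_loss(predicted, target):
    """Mean squared error between predicted and target velocities."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)
```

A model would take `x_t` and `t` (plus any conditioning) and be trained to minimize `velocity_loss` against `target`. Following the target velocity from `x_t` for the remaining time `1 - t` lands exactly on the data point.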

Figures: Architecture Diagram, Concept Explanation, Training Loop Implementation, Inference

Experience

Industry experience at scale

2024—Present

Machine Learning Consultant

Independent

Remote

Advising clients on ML system design and implementation. Building custom models from first principles for specialized domains and getting them into production.

2024—Present

Machine Learning Content Creator

YouTube Channel: @JustinTheJedi

Remote

Publishing state-of-the-art PyTorch implementations and first-principles math explanations of deep learning models. Topics: Rectified Flow, Continuous Normalizing Flow, Stable Diffusion XL, CLIP, GPT, Vision Transformers, ResNet, UNet, VAEs, AutoEncoderKL.

2023—2024

Senior Machine Learning Scientist

Spotter

Los Angeles, CA

Prototyped an LLM-based ideation system for YouTube creators: ingested channel context (titles, summaries, genre tags, engagement) and generated structured pitches (idea, logline, beats, audience/hook, concept), designed for downstream ranking and iteration.

Built a silver-label / proxy "ground truth" pipeline to reconstruct reference pitches from Spotter's catalog using transcripts, frame captioning, and metadata as inputs to LLMs for dataset bootstrapping.

Defined objectives and evaluation datasets for ideas and thumbnails: designed rubric dimensions (originality, hook strength, channel fit, composition, hallucination severity, narrative cohesion), ran internal labeling sessions, and treated engagement metrics (views/CTR) as noisy auxiliary signals rather than primary labels.

Developed GPT-4–based judges for ideas and thumbnails (rubric scorers, pairwise preference judges for ranking, and multi-judge ensembles with one judge per dimension); measured alignment with human labels (correlation, agreement, variance/instability on close calls) and characterized failure modes (ranking cycles, sensitivity to prompt phrasing). The work was later shelved as priorities shifted.

Fine-tuned and evaluated SDXL Turbo for thumbnail generation, moving from Replicate-based LoRA workflows to Diffusers for tighter control; experimented with animated thumbnail styles (ComfyUI-augmented datasets) and used judges + human rubrics to benchmark variants for faster offline iteration across model/prompt configurations.

2022—2023

ML Consultant

Independent

Remote

Built predictive ML systems for UFC fights using scraped fighter data and custom feature engineering.

2019—2022

Machine Learning Scientist

Amazon

Seattle, WA

Improved global streaming by 2.9% by optimizing cover art selection with contextual multi-armed bandits and deep neural networks.

Built a two-tower recommendation model that learned joint representations of customers and images, so the system could predict which artwork a given viewer was most likely to click and watch.

Developed unsupervised clustering of Prime Video titles using topic models (LDA) and Wasserstein Autoencoders on customer review data to better organize the catalog and surface long-tail content.

Implemented distributed Bayesian logistic regression from scratch in PySpark to support large-scale inference and decision-making over tens of millions of customers.

Designed and analyzed online A/B and multi-armed bandit experiments to validate model changes before full rollout.

2013—2017

Senior Software Engineer

CSRA Inc

Washington, DC

Led enterprise Java app development, Docker-based testing, and prototyping.

Technical Deep Dives

Building advanced ML architectures from first principles

45 min

NanoGPT from scratch in PyTorch

Complete implementation of a GPT-style transformer language model, covering attention mechanisms, positional encodings, and autoregressive training.

52 min

CLIP from scratch in PyTorch

Building OpenAI's CLIP model from the ground up—dual encoders for vision and language with contrastive learning objectives.

38 min

Stable Diffusion XL Objective Function Derivation

Mathematical derivation of the SDXL training objective, from variational lower bounds to practical noise scheduling strategies.

41 min

Building SDXL's AutoencoderKL from Scratch in PyTorch

Implementing the KL-regularized autoencoder used in Stable Diffusion XL—encoder, decoder, and the variational objective that enables latent space compression.


Featured Blog Posts

Deep dives into ML research and engineering

Series • 5 parts

Modern LLM Post-Training Series

A comprehensive 5-part series covering PPO-RLHF, DPO, GRPO, GDPO, and practical SFT experiments. Learn the evolution of post-training algorithms and what each method removes or modifies.


Research • 10 min read

JanusFlow

JanusFlow unifies image understanding and generation in a single LLM backbone using rectified flow. This post explains the architecture, training stages, representation alignment, and why the shared backbone matters.


Research • 10 min read

Tiny Recursive Model (TRM)

TRM simplifies HRM by removing fixed-point theory, eliminating ACT's extra forward pass, and backpropagating through full unrolled recursion with no-grad refinement passes.


Research • 15 min read

Essential PyTorch: the 5% that shows up in almost every serious model build

PyTorch looks huge from the outside. In practice, I build Transformers, CLIP, diffusion, flows, and VAEs using a small set of tensor and module patterns over and over. This post is that core.



Education

2017—2019

MS in Computer Science

(former PhD track)

University of Central Florida

2006—2011

BS in Mathematics and Computer Science

Dual Major

Christopher Newport University

2018—2019

GEM Fellowship

National GEM Consortium