Justin Barry

ML research engineer & applied scientist building intelligent systems

Former Applied Scientist @ Amazon • MS Computer Science • BS Mathematics & Computer Science • BJJ Black Belt

I design and implement machine learning systems—generative and discriminative—owning both the math and the production-ready PyTorch code. I work across modalities (vision, language, structured data) to build and adapt architectures from first principles, not just fine-tune whatever API is trending, and get them ready for production.

I'm most valuable when off-the-shelf models or basic baselines stop working. I collapse messy, high-ambiguity problems into clear objectives, then turn a diagram and a loss function into a system you can ship.

I can help your team:

Own an ML product end-to-end – from problem definition and data, through modeling and experimentation, to metrics, deployment, and iteration.
Build and extend advanced models – GPT-style language models, diffusion, Rectified Flow, multimodal systems, plus classic discriminative models for classification, ranking, retrieval, and recommendation.
Debug and level up existing pipelines – improve data, losses, training loops, and metrics, and make the tricky pieces legible to your engineers.

I work remotely as an embedded research engineer—either as a full-time hire or via my LLC on longer-term engagements and scoped projects.

Available to hireBook intro callOpen to full-time and contract opportunities

Featured Project

Joint Image+Text Rectified Flow Training Loop

Building a joint image and text Rectified Flow model from scratch in PyTorch. This video walks through the complete training loop implementation, demonstrating the mathematical foundations and practical engineering of modern generative models.

Experience

Industry experience at scale

2024—Present

Machine Learning Consultant

Independent

Remote

Advising clients on ML system design and implementation. Building custom models from first principles for specialized domains and getting them into production.

2024—Present

Machine Learning Content Creator

YouTube Channel: @JustinTheJedi

Remote

Publishing state-of-the-art PyTorch implementations and first-principles math explanations of deep learning models. Topics: Rectified Flow, Continuous Normalizing Flow, Stable Diffusion XL, CLIP, GPT, Vision Transformers, ResNet, UNet, VAEs, AutoEncoderKL.

2023—2024

Senior Machine Learning Scientist

Spotter

Los Angeles, CA

Created multi-agent LLM pipelines for content generation. Fine-tuned SDXL using LoRA and DreamBooth. Combined GPT-3.5, E5 embeddings, clustering, and LOF for uniqueness scoring.

2020—2022

Professional MMA Fighter

Cage Warriors

Remote

Competed professionally in mixed martial arts as an exclusive athlete for the number one MMA promotion in Europe

2022—2023

ML Consultant

Independent

Remote

Built predictive ML systems for UFC fights using scraped fighter data and custom feature engineering.

2019—2022

Machine Learning Scientist

Amazon

Remote

Led Prime Video cover art personalization via intelligent image recommendations. Improved global streaming by 2.9% through ML-optimized visuals. Built distributed Bayesian models, multi-armed bandits, and topic model-based clustering pipelines.

2013—2017

Senior Software Engineer

CSRA Inc

Washington, DC

Led enterprise Java app development, Docker-based testing, and prototyping.

Technical Deep Dives

Building advanced ML architectures from first principles

45 min

NanoGPT from scratch in PyTorch

Complete implementation of a GPT-style transformer language model, covering attention mechanisms, positional encodings, and autoregressive training.

52 min

CLIP from scratch in PyTorch

Building OpenAI's CLIP model from the ground up—dual encoders for vision and language with contrastive learning objectives.

47 min

Vision Transformer from scratch in PyTorch

Building the Vision Transformer (ViT) architecture—patch embeddings, transformer blocks, and classification heads for image recognition.

58 min

Rectified Flow for Image in PyTorch: Train Loop (Part 1)

Complete PyTorch implementation of Rectified Flow for image generation—building the model architecture, training loop, and sampling process from the ground up.

38 min

Stable Diffusion XL Objective Function Derivation

Mathematical derivation of the SDXL training objective, from variational lower bounds to practical noise scheduling strategies.

42 min

Rectified Flow objective explained

Mathematical breakdown of the Rectified Flow training objective for generative models—from continuous normalizing flows to practical implementation.

41 min

AutoencoderKL from scratch in PyTorch

Implementing the KL-regularized autoencoder used in latent diffusion models—encoder, decoder, and KL divergence loss.

35 min

Deploying and training NanoGPT on Runpod

Practical guide to deploying GPT models on cloud infrastructure—setting up training pipelines, managing compute resources, and monitoring experiments.

View full YouTube channel →

Education

2017—2019

MS in Computer Science

(former PhD track)

University of Central Florida

2006—2011

BS in Mathematics and Computer Science

Dual Major

Christopher Newport University

2018—2019

GEM Fellowship

National GEM Consortium