Lucas Burgett

I'm studying math and computer science at Stanford. I'm super interested in physical AI systems. I love learning about VLAs, reinforcement learning, and how to deploy reliable policies in the real world.

Math + CS @ Stanford · Intern @ Parametric (F25) · Stanford & San Diego, California

Now — Parametric (YC F25)

Software Engineering Intern

Building out the policy eval pipeline and designing the metrics and statistics that determine whether a policy is usable in the real world.

Now — Fan Lab, Stanford University

Research Assistant

Contributing to the development of MetaChat 2.0, a multi-agent framework for autonomous nanophotonic device design.

01  ·  Projects & Personal Research

Research and projects, in and out of class

VLA Memory

RL-trained memory for long-horizon robot manipulation

A hierarchical VLM + VLA system for robots that have to remember across tasks. A Qwen2.5-VL-7B planner is fine-tuned with GRPO to pick the keyframes that actually matter and issue the next subtask; a frozen π₀.₅ policy carries out the low-level motion.

  • Swaps the imitation learning used in MemER (ICLR 2026) for reinforcement learning. The planner is rewarded for whether the task gets done, not for copying which keyframes a human looked at.
  • Same architecture, backbone, and benchmark as the prior work, so the training algorithm is the only thing that changes.
  • GRPO loop: sample several rollouts per prompt with the frozen policy, score each on episode success, and update the planner with a group-normalized advantage and a KL anchor to the supervised model.
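
A minimal sketch of the group-normalized advantage and KL-anchored objective described in the last bullet, not the project's training code. The function names, the `kl_coef` value, and the choice of KL estimator are assumptions for illustration; rewards are taken to be 1.0 for a successful episode and 0.0 otherwise.

```python
import math
from statistics import mean, pstdev


def group_normalized_advantages(rewards, eps=1e-6):
    """Center and scale rewards within one prompt's group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


def kl_to_reference(logp_new, logp_ref):
    """Per-sample non-negative KL estimate: exp(d) - d - 1 with d = logp_ref - logp_new."""
    return [math.exp(lr - lp) - (lr - lp) - 1.0 for lp, lr in zip(logp_new, logp_ref)]


def grpo_objective(logp_new, logp_ref, rewards, kl_coef=0.05):
    """Objective to maximize for one group: advantage-weighted log-prob minus a KL penalty.

    logp_new: log-probability of each sampled planner output under the planner being trained
    logp_ref: log-probability of the same outputs under the supervised (reference) model
    kl_coef:  hypothetical penalty weight, chosen only for this sketch
    """
    adv = group_normalized_advantages(rewards)
    policy_term = mean([a * lp for a, lp in zip(adv, logp_new)])
    kl_term = mean(kl_to_reference(logp_new, logp_ref))
    return policy_term - kl_coef * kl_term


# Four rollouts for one prompt: two succeed, two fail.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_normalized_advantages(rewards)
```

Successful rollouts get a positive advantage and failed ones a negative advantage, so the update raises the likelihood of planner outputs that led to task completion. When every rollout in a group gets the same reward, all advantages are zero and that group contributes no policy gradient.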

GRPO · Reinforcement Learning · Qwen2.5-VL · π₀.₅ · JAX / openpi · Modal · A100

RoboMME evaluation harness ↗

A successful eval rollout: π₀.₅ carrying out a subgoal the memory planner issued on a RoboMME permanence task. The on-screen subgoal is the planner's output.

Coinvest

An AI investment-thesis research tool: type a thesis, get back a structured report with data-driven confidence scores, company comparisons, historical parallels, and the conditions that would prove it wrong. A multi-agent Claude pipeline grounds every score in live yfinance and NewsAPI data, with scheduled re-research, portfolio tracking, and a daily morning brief.

Claude API · Multi-agent · FastAPI · React · Postgres · Fly.io

Private repository

Latent Rectified-Flow CT Reconstruction

Denoising low-dose medical scans with a latent flow model

Low-dose CT and X-ray scans are noisy, and standard diffusion models need hundreds of steps to clean them. This model learns a straight-line flow between paired low-dose and normal-dose images in a compressed latent space, so a scan reconstructs in 5 to 10 solver steps.

  • A frozen VAE compresses each scan; a learned vector field transports the noisy latent toward the clean one along a straight path, trained by velocity matching.
  • Adds a per-pixel uncertainty map, from SDE-variance sampling or an exact log-likelihood, so a radiologist can see where the model may have invented detail.
  • Measured against a residual U-Net baseline on the Mayo Clinic low-dose CT data with SSIM and FID.
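
A minimal sketch of straight-path velocity matching and few-step sampling, using plain Python lists in place of VAE latents. It is not the project's model: the function names are hypothetical, and the "oracle" velocity field stands in for the learned network.

```python
def interpolate(x0, x1, t):
    """Point on the straight path from the noisy latent x0 to the clean latent x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Along a straight path the velocity is constant: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def velocity_matching_loss(predicted_v, x0, x1):
    """Mean squared error between a predicted velocity and the straight-path target."""
    target = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(predicted_v, target)) / len(target)


def euler_transport(x0, velocity_fn, steps=5):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy latents (assumed values, for illustration only).
noisy = [0.0, 2.0]
clean = [1.0, -1.0]


def oracle(x, t):
    # Stand-in for the learned vector field: returns the exact target velocity.
    return target_velocity(noisy, clean)


reconstructed = euler_transport(noisy, oracle, steps=5)
```

Because the path is a straight line, a velocity field that matches the target exactly reaches the clean latent in any number of Euler steps. That is the intuition behind needing only a handful of solver steps; a learned field is only approximately straight, so the step count becomes a quality trade-off.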

Rectified Flow · VAE · PyTorch · Uncertainty · SSIM / FID

CodeSentry

AI code review for AI-generated code, shipped as a GitHub App. It pairs Semgrep (16 custom rules for common AI mistakes) with Claude behavioral analysis to catch bugs a static linter misses, then posts a risk-scored PR comment. LLM flags only survive if Semgrep backs them up, which kills most false positives.
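
A minimal sketch of the corroboration filter described above. How a flag counts as "backed up" is not stated here, so the matching rule (same file, within a small line window) is an assumption, as are the `Finding` type and the `window` parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    message: str


def corroborated(llm_flags, semgrep_findings, window=2):
    """Keep only LLM flags with a Semgrep finding in the same file within `window` lines."""
    kept = []
    for flag in llm_flags:
        if any(
            s.path == flag.path and abs(s.line - flag.line) <= window
            for s in semgrep_findings
        ):
            kept.append(flag)
    return kept


# Hypothetical findings, for illustration only.
semgrep = [Finding("app/db.py", 40, "string-built SQL query")]
llm = [
    Finding("app/db.py", 41, "query built from user input"),
    Finding("app/ui.py", 10, "possible race condition"),
]
surviving = corroborated(llm, semgrep)
```

Only the first LLM flag survives, because it is the only one with a nearby Semgrep finding in the same file. A filter like this trades some recall for fewer false positives.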

GitHub App · Semgrep · Claude · FastAPI · SQLite

View source ↗

Reproducible RL Pipeline

A template for deep RL experiments that actually reproduce: deterministic seeding, Hydra configs, version locking, experiment logging, CI, and Docker. PPO agents on CartPole, LunarLander, and Reacher, with mean ± std across seeds. Two runs of the same config give bit-for-bit identical results.
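
A minimal sketch of the seeding and mean ± std reporting idea, using only the standard library. `run_trial` is a hypothetical stand-in for a full PPO training run; a real pipeline would also need to seed NumPy, PyTorch, and the environment.

```python
import random
from statistics import mean, stdev


def run_trial(seed, episodes=20):
    """Stand-in for one training run: returns a score that depends only on the seed."""
    rng = random.Random(seed)  # private generator, so no global state leaks between runs
    returns = [rng.gauss(100.0, 15.0) for _ in range(episodes)]
    return mean(returns)


def summarize(seeds):
    """Mean and sample standard deviation of the final score across seeds."""
    scores = [run_trial(s) for s in seeds]
    return mean(scores), stdev(scores)


first = summarize([0, 1, 2])
second = summarize([0, 1, 2])
```

Here `first` and `second` are identical because every source of randomness flows from the seed. Reporting the spread across seeds, rather than a single run, shows how much of a result is down to seed luck.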

PPO · Stable-Baselines3 · Hydra · MuJoCo · CI/CD

View source ↗

Nano Defect Detector

A lightweight PyTorch pipeline for spotting nanoscale defects in SEM and TEM images. It returns OK/NG classifications and pixel-level defect heatmaps in under 10 ms per 512×512 tile on a laptop, served through FastAPI and a Gradio viewer, with an ONNX Runtime path.

PyTorch · OpenCV · Anomaly detection · FastAPI · ONNX

View source ↗

Personalized Writing-Style Tool

A multi-agent system that learns a person's writing style from samples and generates content that matches it with the Claude API, then runs a refinement loop against GPTZero feedback to keep the writing natural.

Claude API · Multi-agent · NLP

Source unavailable

02  ·  Experience

Where I've worked

From production computer vision to startup research. This summer, reinforcement learning for robots at Parametric.

Summer 2026 – present

Software Engineering Intern · Parametric

San Francisco, CA · YC F25

Building and deploying reinforcement learning models for robotics, across the full ML lifecycle from training to production.

Reinforcement Learning · Robotics · MLOps

Winter 2026 – present

Research Assistant · Fan Lab, Stanford University

Stanford, CA

Contributing to MetaChat 2.0, a multi-agent framework for autonomous nanophotonic device design, and building the LLM evaluation set that measures it.

Multi-agent · LLM eval · Nanophotonics

Summer 2025

Engineering Intern · Platform Science

San Diego, CA

Built a proof-of-concept model that estimates time-to-collision from live vehicle video with 80% accuracy, using TensorFlow and OpenCV, with model training on AWS SageMaker and data in Snowflake.

Computer Vision · TensorFlow · AWS SageMaker · Snowflake

Summer 2023

Product Manager Intern · Treeline Interactive

San Diego, CA

Worked across product development, market analysis, and roadmapping, and helped launch Five Iron Golf's online booking software.

03  ·  About

A little more

In my free time, I love working out, golfing, seeing the world, and meeting new people. I'm also half Brazilian, so I try to go visit my family there as much as I can.

Education

Stanford University

B.S. Mathematics · Minor in Computer Science

2024 – 2028 · GPA 3.80 · Xfund Ethics Fellow · Stanford AI Club · Blythe Fund · Startups To Join

Selected coursework

  • CS 224R Deep Reinforcement Learning
  • CS 231N Deep Learning for Computer Vision
  • CS 109 Probability for Computer Scientists
  • CS 107 Computer Organization & Systems
  • Math 104 Applied Matrix Theory
  • Math 53 ODEs, Linear Algebra & Fourier Methods

Toolbox

Languages
Python · Java · C++ · C · MATLAB · SQL
ML & RL
PyTorch · JAX · TensorFlow · Stable-Baselines3 · GRPO / PPO · Hugging Face · OpenCV
Infrastructure
Modal · AWS SageMaker · Docker · Fly.io · Snowflake · Hydra · W&B
Product
React · FastAPI · Postgres · Claude API
Spoken
English · Portuguese