BakeLab Research
Research that turns into infrastructure.
BakeLab is the research arm of Bake AI. We publish what we build. We build what we publish.
Focus areas
Expert-level data quality & verification
Agent failure diagnosis & taxonomy
Evaluation methodology for real-world tasks
Human-AI collaboration in data production
Publications & Blog
Open-source research from our founding team, published at top venues.
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
Agent Safety via Synthetic Data
CoDA: Agentic Systems for Collaborative Data Visualization
Agentic System
TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
Tool-Agentic Data Generation for SFT & RL
ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions
Scientific Data Generation for SFT
ImplicitPersona: Persona Data Generation for SFT & RL
Persona Data Generation for SFT & RL
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL
Multi-Modal Data Generation for SFT & RL
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Coding Data Generation for SFT & RL
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Response Generation for SFT & DPO
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Question Generation for SFT & DPO
Collaborate with us
We work with frontier labs, universities, and research teams. If you're working on hard problems in data quality, agent evaluation, or failure diagnosis, let's talk.
Get in touch