The first sign an AI agent is failing is often the last. A customer support bot degrades, a coding assistant introduces subtle bugs, a financial analyst model drifts. For founders Jerry Zhang and Cole Gawin, both former AI engineers, failures like these were the pain point they set out to solve. Their startup, Lemma, launched in 2025 with a simple technical wedge: an observability platform that doesn't just watch AI agents fail, but teaches them to fix themselves [Startup Intros, 2025].
The Observability Wedge
Lemma's bet is that the next generation of AI applications will be adaptive, learning continuously from user feedback and production outcomes. The company's platform is designed to detect performance drift, pinpoint the exact failing step in an agent's reasoning chain, and then generate optimized prompts or code fixes. Those improvements can be applied automatically via API or delivered as pull requests to a developer's codebase [Startup Intros, 2025]. The goal is to turn static, brittle AI models into self-improving systems, a shift the founders argue will create compounding gains over time. It's a product built for developers and enterprises that are moving beyond simple chat interfaces into complex, multi-step agentic workflows.
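To make the detect-then-pinpoint idea concrete, here is a minimal sketch in plain Python. It is not Lemma's API or method, which the sources do not describe at this level. Every name in it (Step, Run, detect_drift, the window and max_drop parameters, the step labels) is a hypothetical chosen for illustration, and it assumes each agent run is already recorded as an ordered list of steps with a pass/fail flag.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    name: str  # hypothetical step label, e.g. "retrieve" or "draft_reply"
    ok: bool   # whether this step passed its check


@dataclass
class Run:
    steps: list[Step]

    @property
    def ok(self) -> bool:
        # A run succeeds only if every step in the chain succeeded.
        return all(s.ok for s in self.steps)

    def first_failure(self) -> Optional[str]:
        # Name of the earliest failing step, or None for a clean run.
        for s in self.steps:
            if not s.ok:
                return s.name
        return None


def success_rate(runs: list[Run]) -> float:
    return sum(r.ok for r in runs) / len(runs)


def detect_drift(runs: list[Run], window: int = 4, max_drop: float = 0.2):
    """Compare the earliest `window` runs with the latest `window` runs.

    Returns None when there is too little data or no meaningful drop;
    otherwise reports the drop and the step that most often failed first.
    """
    if len(runs) < 2 * window:
        return None
    baseline, recent = runs[:window], runs[-window:]
    drop = success_rate(baseline) - success_rate(recent)
    if drop <= max_drop:
        return None
    failures = Counter(
        name for r in recent if (name := r.first_failure()) is not None
    )
    suspect = failures.most_common(1)[0][0] if failures else None
    return {"drop": drop, "suspect_step": suspect}


# Illustrative data: five healthy runs, then three that break at "retrieve".
healthy = Run([Step("retrieve", True), Step("draft_reply", True)])
broken = Run([Step("retrieve", False), Step("draft_reply", True)])
history = [healthy] * 5 + [broken] * 3
report = detect_drift(history)
print(report)
```

A production system would presumably score steps with richer signals than a boolean, and the remediation step (generating a new prompt or a pull request) is out of scope for this sketch.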
Building in Public, on GitHub
While the company is in its early days, its technical footprint is visible. The two founders, both USC dropouts, have prioritized developer tooling from the start [LinkedIn, 2026]. Lemma's GitHub organization, uselemma, hosts public SDKs for TypeScript, Python, and Go, plus a CLI, all built on OpenTelemetry for tracing [GitHub, 2026]. CTO Cole Gawin's personal GitHub profile, chroline, actively links to the company's work, signaling a focus on engineer-first distribution [GitHub, 2026]. This open-build approach is a common playbook for infrastructure tools aiming to win developer trust before landing large enterprise contracts. The public code serves as both proof of technical depth and a lead generation engine.
The Early-Stage Calculus
Lemma's current position is defined by its Y Combinator backing and its pre-seed capital. The company was part of the YC F25 batch and has raised $500,000 in total disclosed funding to date [Y Combinator, 2026] [PitchBook, 2026]. The investor list currently begins and ends with Y Combinator, a typical pattern for companies at this stage. The startup was also featured on a Forbes list highlighting top companies from that YC batch, a reputational boost in a crowded field [LinkedIn, 2026]. The path from here involves converting GitHub stars into paid API calls and proving that the market for agent observability is large enough to support a standalone company.
The competitive landscape is already populated with well-funded players, which presents both validation and a significant go-to-market challenge. Lemma is entering a space with established tools for monitoring and evaluating AI applications.
Notable competitors: LangSmith, Arize, and Langfuse.
To differentiate, Lemma must convince developers that its focus on self-improvement and automated remediation is a distinct category, not just a feature of existing LLM ops platforms. The risks at this stage are straightforward:
- Proving the wedge. The platform must demonstrate that automated prompt optimization and code fixes deliver tangible ROI beyond basic monitoring dashboards.
- Standing out by name. The company shares its name with multiple unrelated firms in adtech and other sectors, which could complicate discoverability [Perplexity Sonar Pro, 2025].
- Scaling the team. With just two founders based in San Francisco, the next critical hires in sales and engineering will define the company's execution tempo [Y Combinator, 2026] [LinkedIn, 2026].
The Next Twelve Months
For Lemma, the coming year is about moving from a promising SDK to a product with measurable traction. The key signals to watch will be the first disclosed enterprise pilots, the growth of its open-source community, and the announcement of a seed round to scale the team. The $500,000 pre-seed from Y Combinator provides runway, but the clock is ticking to show that developers building the future of agentic AI are willing to pay for the tools to keep those agents reliable and improving. The question for the market is whether automated debugging represents a new, must-have layer in the AI stack, or simply a nice-to-have feature that gets absorbed by larger platforms. For now, Zhang and Gawin are betting their code, and Y Combinator's capital, on the former.
Sources
- [Startup Intros, 2025] Lemma: Funding, Team & Investors | https://startupintros.com/orgs/uselemma
- [LinkedIn, 2026] Jerry Zhang - Lemma (YC F25) | LinkedIn | https://www.linkedin.com/in/jerry-n-zhang/
- [GitHub, 2026] Lemma Labs · GitHub | https://github.com/uselemma
- [GitHub, 2026] chroline (Cole Gawin) · GitHub | https://github.com/chroline
- [Y Combinator, 2026] Lemma: Reliability platform for AI agents | Y Combinator | https://www.ycombinator.com/companies/uselemma
- [PitchBook, 2026] Lemma (San Francisco) 2025 Company Profile | https://pitchbook.com/profiles/company/904260-16
- [Perplexity Sonar Pro, 2025] Lemma (uselemma.ai) Research Brief
- [LinkedIn, 2026] Cole Gawin - Lemma (YC F25) | LinkedIn | https://www.linkedin.com/in/colegawin/