The pitch is a procurement officer's daydream: replace a general-purpose AI model costing thousands per month with a specialized one that runs for a tenth of the price. Distil Labs, a Berlin-based developer platform founded in 2024, is betting that dream is now a 30-minute engineering task. The company automates the creation of small language models (SLMs) fine-tuned for specific jobs like classification, question answering, and function calling, claiming that models 50 to 400 times smaller can match the results of large language models [Distil Labs blog]. For teams building AI agents, the wedge is straightforward: lower latency, slashed inference costs, and a model that does only what you need it to do.
Automating the specialist model
Distil Labs' platform is designed for developers who have identified a repetitive task currently handled by a costly LLM API call. A user provides a prompt and a few dozen examples; the system then uses synthetic data generated from what it calls "agent traces" to create a training set, fine-tunes a small model, and deploys it to a hosted endpoint [Swapcard]. The company claims the entire process, from data generation to evaluation, can wrap in about half an hour with no manual labeling [Perplexity Sonar Pro Brief]. The resulting model, often just a few hundred million parameters, is intended to slot into a production pipeline where it handles that single task with high accuracy at a fraction of the compute cost. An integration with inference hosting partner Cerebrium is noted for autoscaling [Perplexity Sonar Pro Brief].
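The distillation pattern behind this workflow can be sketched in miniature. The stdlib-only toy below is not Distil Labs' pipeline and uses no real model: a handful of seed examples is expanded into synthetic variants (standing in for the LLM-generated "agent traces"), and a tiny bag-of-words "student" classifier is trained on the result. All names, data, and the classifier itself are invented for illustration.

```python
from collections import Counter, defaultdict

# Seed examples: the "few dozen examples" a user would provide.
SEEDS = {
    "refund": ["i want my money back", "please refund my order"],
    "shipping": ["where is my package", "track my delivery"],
}

# Stand-in for LLM-generated synthetic data: crude template expansion.
# A real platform would use a large teacher model here.
PREFIXES = ["", "hi, ", "hello, ", "urgent: "]

def synthesize(seeds):
    """Expand each seed example into several synthetic variants."""
    data = []
    for label, texts in seeds.items():
        for text in texts:
            for prefix in PREFIXES:
                data.append((prefix + text, label))
    return data

class TinyStudent:
    """Toy 'small model': scores each label by training-word overlap."""
    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def train(self, data):
        for text, label in data:
            self.word_counts[label].update(text.lower().split())

    def predict(self, text):
        words = text.lower().split()
        return max(self.word_counts,
                   key=lambda lbl: sum(self.word_counts[lbl][w] for w in words))

student = TinyStudent()
student.train(synthesize(SEEDS))
print(student.predict("can you refund me"))   # -> refund
print(student.predict("track my package"))    # -> shipping
```

The shape is the point, not the classifier: a few labeled examples in, synthetic expansion in the middle, a cheap specialist out, which is the same three-step flow the platform claims to automate with real models.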
The team and early traction
Public details on the founding team are sparse, but the company lists Selim Nowicki and Jacek Golebiowski as Managing Directors [LinkedIn][Lds Studio]. Its public activity suggests a technical focus: the company maintains several open-source repositories on GitHub demonstrating SLMs for specific use cases, such as a local banking voice assistant and a model for multi-turn bash tool calling [GitHub]. Employee counts from different sources vary, indicating a small, likely engineering-heavy team in the early build phase [RocketReach]. The company has raised a seed round of undisclosed size [PitchBook]. While no named customers or detailed deployment case studies are yet public, the product claims and technical demonstrations point to an initial target of developers at tech-forward companies already running AI agents in production.
Where the proof still needs to land
The ambition is clear, but for an enterprise buyer, the questions are equally clear. The core value proposition rests on performance claims that are dramatic but, as yet, unverified by independent benchmarks or a public roster of reference customers. The platform's effectiveness hinges on its synthetic data generation and curation process, a complex technical challenge where failures could lead to brittle or biased models. Furthermore, the realistic competitive set extends beyond other SLM-focused startups.
- Build in-house. A mature engineering team with strong ML ops could replicate this fine-tuning pipeline using open-source frameworks, though it would incur significant time and expertise cost.
- Use a cloud giant. AWS Bedrock, Google Vertex AI, and Azure AI Studio all offer managed fine-tuning services for foundation models, backed by enterprise-grade SLAs and deep integration into existing cloud stacks.
- Optimize the incumbent. Many companies may simply continue using GPT-4 or Claude via careful prompt engineering and caching, accepting the cost for superior generality and reliability.
Distil Labs' ideal customer, then, is not the Fortune 500 company standardizing on a single cloud provider. It's the venture-backed startup or digital-native scale-up that has several AI agents in production, is feeling the cost pinch from LLM API bills, and has the engineering bandwidth to manage a fleet of specialized models but not the desire to build the entire training infrastructure from scratch. For them, a 90% cost reduction is a compelling pilot project.
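The back-of-envelope math that makes such a pilot compelling is straightforward. The numbers below are illustrative assumptions, not vendor pricing: an assumed blended API rate for a large LLM versus an assumed flat hosting cost for a small-model endpoint.

```python
# Illustrative pilot economics; every figure here is an assumption.
requests_per_month = 2_000_000
tokens_per_request = 1_500        # assumed prompt + completion tokens

llm_price_per_mtok = 5.00         # assumed blended $/1M tokens, large LLM API
slm_hosting_per_month = 1_200.00  # assumed flat cost, hosted SLM endpoint

llm_monthly = requests_per_month * tokens_per_request / 1e6 * llm_price_per_mtok
savings = llm_monthly - slm_hosting_per_month

print(f"LLM API:  ${llm_monthly:,.0f}/mo")
print(f"SLM host: ${slm_hosting_per_month:,.0f}/mo")
print(f"Savings:  ${savings:,.0f}/mo ({savings / llm_monthly:.0%})")
```

Under these assumptions the per-token API bill dwarfs the flat hosting cost, which is the arithmetic behind cost-reduction claims in the 90% range; the sensitivity is almost entirely to request volume, since the SLM side is roughly fixed.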
The next twelve months will be about moving from technical demonstration to commercial proof. Success will be measured not by model size comparisons on a blog, but by the renewal rate of its first ten enterprise contracts. If Distil Labs can show that its SLMs maintain accuracy over time with real-world data drift, and that the total cost of ownership, including monitoring and updating models, stays decisively below that of continuing with general LLMs, it will have carved out a viable niche. If not, it risks being a clever solution in search of a budget owner who can sign a six-figure deal.
Sources
- [Distil Labs blog] Small Expert Agents from 10 Examples | https://www.distillabs.ai/blog/small-expert-agents-from-10-examples
- [Swapcard] distil labs exhibitor profile | https://app.swapcard.com/event/devworld-conference-2025-1/exhibitor/RXhoaWJpdG9yXzIxNDkzMzQ=
- [Perplexity Sonar Pro Brief] Distil Labs brief
- [PitchBook] Distil Labs 2026 Company Profile | https://pitchbook.com/profiles/company/863758-09
- [LinkedIn] Jacek Golebiowski profile | https://de.linkedin.com/in/jacek-golebiowski
- [Lds Studio] Distil Labs project page | https://www.lds.studio/distillabs
- [GitHub] distil-labs repositories | https://github.com/distil-labs
- [RocketReach] distil labs Information | https://rocketreach.co/distil-labs-profile_b6ed2a57c6e6a83a