The first thing you notice is the price tag. Not the one on the screen, but the one that isn’t there. You upload a model, define a budget, and the system (Tensor Cloud, they call it) begins its quiet search. It is looking for the cheapest available GPU cycle across a dozen different providers: a spot instance about to be preempted, a sliver of unused silicon in a data center you’ve never heard of. The promise is that you, the researcher, shouldn’t need to know which power plant your electricity comes from [SF Tensor, 2025]. The bet is that for the teams building the next foundation models, the most valuable abstraction isn’t just speed, but cost.
The wedge of cheaper cycles
San Francisco Tensor Company, founded in 2025 by brothers Ben, Luk, and Tom Koska, is entering a crowded AI infrastructure market through a narrow, pragmatic door: the invoice. Its primary product, Tensor Cloud, is a managed platform that automates the hunt for affordable compute, claiming to cut costs by up to 80% by dynamically routing workloads to the most cost-effective hardware and gracefully handling spot instance interruptions [Y Combinator, 2025]. This is the initial wedge: a straightforward answer to the most painful, recurring line item for any AI lab scaling beyond a handful of GPUs.
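SF Tensor has not published its routing logic, but the mechanics described above (pick the cheapest available offer, checkpoint continuously, re-route and resume after a preemption) can be sketched in a few lines of Python. Every name here (`Offer`, `route`, `run_with_preemption`) is illustrative, not SF Tensor's API:

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str           # hypothetical provider name
    price_per_hour: float   # USD per GPU-hour
    preemptible: bool       # spot capacity that may be reclaimed

def route(offers: list[Offer]) -> Offer:
    """Pick the cheapest currently available offer.

    Spot capacity wins whenever it is cheaper; the caller is expected
    to checkpoint so a preemption only costs the last interval of work.
    """
    return min(offers, key=lambda o: o.price_per_hour)

def run_with_preemption(offers, train_step, steps, checkpoint):
    """Drive training, re-routing whenever the current host is reclaimed."""
    done = 0
    while done < steps:
        host = route(offers)
        try:
            while done < steps:
                train_step(host)
                done += 1
                checkpoint(done)    # persist progress after each step
        except InterruptedError:    # stand-in for a preemption notice
            continue                # re-route and resume from the checkpoint
```

The sketch elides everything hard (data movement, checkpoint cost, price volatility), but it shows why the economics work: if checkpoints are cheap, a preempted spot instance only costs the last interval, so the scheduler can chase the lowest price aggressively.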
But the cost lever is paired with a performance one. The company’s other flagship offering is an automatic kernel optimizer. Instead of requiring engineers to hand-tune low-level code for specific GPU architectures, SF Tensor’s software models the hardware topology (memory hierarchies, core counts, bandwidth) and automatically restructures computations to run faster. The founders claim it often outperforms manual implementations [Y Combinator, 2025]. This one-two punch of cheaper and faster execution forms the core of their value proposition: let the infrastructure worry about the hardware so the research team can focus on the model.
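The kind of decision such an optimizer automates is easier to picture with a toy example: choosing a tile size for a tiled matrix multiply so the working set fits in fast on-chip memory. This is a minimal sketch under invented assumptions (square tiles, three resident blocks), not SF Tensor's optimizer:

```python
def pick_tile(cache_bytes: int, dtype_bytes: int = 4,
              candidates=(16, 32, 64, 128)):
    """Choose the largest square tile T such that the three T x T blocks
    a tiled matmul keeps resident (a tile of A, of B, and the accumulator)
    fit in fast memory.

    Larger tiles raise arithmetic intensity (more reuse per byte loaded),
    so the search keeps the biggest candidate that still fits.
    """
    best = None
    for t in candidates:
        working_set = 3 * t * t * dtype_bytes
        if working_set <= cache_bytes:
            best = t
    return best

# e.g. with 48 KiB of shared memory and fp32 data, 64x64 tiles fit exactly
print(pick_tile(48 * 1024))
```

A real optimizer searches a far larger space (tile shapes, loop orders, vector widths, occupancy trade-offs) against a richer hardware model, but the shape of the problem is the same: a constrained search driven by a machine description rather than by hand-tuning.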
A stack built by practitioners
The team’s background reads less like a corporate resume and more like a project log. The three Koska brothers previously trained their own foundational world models, scaling runs to thousands of concurrent GPUs [Y Combinator, 2025]. This practitioner experience is the bedrock of their ambition. They are not just selling cloud credits; they are building the stack they wished they had.
That stack includes EMMA, a hardware-aware programming language with native async/await syntax for GPU operations and builder functions for MLIR code. It is designed for portability, offering support for NVIDIA, AMD, and Vulkan backends [Bizety, October 2025]. The vision, articulated in a company manifesto, is a unified layer where code is decoupled from the underlying silicon, enabling both performance portability and financial efficiency [SF Tensor, 2025]. This technical ambition has attracted early backing from Y Combinator (Fall 2025 batch), Susa Ventures, and angels including Paul Graham, with YC Managing Partner Harj Taggar publicly praising the team as "exceptionally talented" [Harj Taggar on X, 2026].
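EMMA's syntax has not been published in detail, so the snippet below is not EMMA. It uses Python's asyncio purely as a conceptual analogy for the async/await style the language reportedly applies to GPU work: issue transfers and kernel launches as awaitable operations that overlap, rather than blocking on each one in turn. The `upload`/`kernel` names and latencies are invented stand-ins:

```python
import asyncio

async def upload(name, latency=0.01):
    # stand-in for a host-to-device copy; awaiting lets other work overlap
    await asyncio.sleep(latency)
    return f"{name}@device"

async def kernel(buf, latency=0.01):
    # stand-in for an asynchronous kernel launch on a device buffer
    await asyncio.sleep(latency)
    return f"result({buf})"

async def pipeline(batches):
    # issue all uploads concurrently, then run kernels on the landed buffers
    bufs = await asyncio.gather(*(upload(b) for b in batches))
    return await asyncio.gather(*(kernel(b) for b in bufs))

results = asyncio.run(pipeline(["x0", "x1"]))
```

The appeal of expressing GPU streams this way is that the same program can target different backends (NVIDIA, AMD, Vulkan) if the runtime, not the source code, decides how the awaited operations map onto hardware queues.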
| Founder | Role | Noted Background |
|---|---|---|
| Ben Koska | Founder & CEO | Optimizing GPU kernels [SF Tensor, 2025] |
| Luk Koska | Co-founder | Optimizing AI infrastructure, YC F25 [Y Combinator, 2025] |
| Tom Koska | Co-founder | Co-founded SF Tensor [Bizety, October 2025] |
The scale of the ask
For all its technical promise, SF Tensor is making an extraordinarily difficult ask of its potential customers. It is requesting that AI labs, whose primary competitive advantage often hinges on the raw speed and reliability of their training runs, outsource the deepest layers of their computational stack (the kernels, the hardware scheduling, the cloud procurement) to a five-person startup [Y Combinator, 2025]. The risks are not merely technical but existential.
- The trust deficit. Major labs like OpenAI, Anthropic, and xAI operate at a scale where downtime is measured in millions of dollars. Handing over kernel optimization and cloud orchestration requires a level of confidence typically earned over years, not a seed round.
- The integration burden. Adopting EMMA or the optimizer means rewriting or wrapping critical code paths. For teams already using highly optimized frameworks like Triton or cuDNN, the switching cost is monumental unless the performance delta is overwhelming.
- The market squeeze. They are competing against both hyperscalers (AWS, GCP, Azure) with deeply integrated AI suites and a swarm of other well-funded infra startups focusing on cost optimization or performance. Differentiation must be profound to avoid becoming a feature, not a platform.
The company’s current hiring push for multiple founding engineering roles suggests a race to build out the robust MVP needed to even begin addressing these concerns [SF Tensor, 2025]. Their success hinges on proving that their combined stack delivers not just incremental savings, but a categorical leap in efficiency that justifies the operational risk.
The question beneath the code
The real test for SF Tensor won’t be a benchmark on a single GPU cluster. It will be whether the cultural mindset of AI research is ready to prioritize frugality over raw, predictable power. For a decade, the narrative has been one of scaling at any cost, buying the biggest chips and running them continuously. SF Tensor’s entire premise is an argument for a new kind of sophistication: intelligence applied not just to the model architecture, but to the physics and economics of the compute itself. They are betting that the next frontier in AI isn’t just a bigger model, but a smarter, more portable, and ultimately cheaper way to build it. The product they are shipping is a kernel optimizer and a cloud broker. The question they are implicitly answering is whether the era of brute-force compute is finally giving way to an era of elegance.
Sources
- [Y Combinator, 2025] SF Tensor: Infrastructure for AI labs to focus on research | https://www.ycombinator.com/companies/sf-tensor
- [Bizety, October 2025] Startup SF Tensors is Reinventing AI Infrastructure | https://bizety.com/2025/10/08/startup-sf-tensors-is-reinventing-ai-infrastructure/
- [SF Tensor, 2025] Introducing The San Francisco Tensor Company | https://sf-tensor.com/news/introducing-sf-tensor
- [SF Tensor, 2025] Tensor Cloud Launch | https://sf-tensor.com/news/tensor-cloud-launch
- [Harj Taggar on X, 2026] Post praising SF Tensor team | https://x.com/harjtaggar/status/1985781862422433876
- [SF Tensor, 2025] SF Tensor Careers | https://sf-tensor.com/careers