GenAI and LLMs — 140x Faster with Etched
General-purpose GPUs weren’t built for the LLM era. Two Harvard dropouts are going all-in on model-specific microchips.
A new age of AI has arrived, and with it a frenzy of excitement. But who will power it? Each exchange with Generative AI requires intricate computations that demand substantial compute power. That power, in turn, depends on hardware systems that are very expensive.
Over the last several months, we have spoken to dozens of founders building Generative AI applications and infrastructure businesses. The consensus among them is that the cost of training models and then running them (inference) is far too high.
Ongoing inference to run ChatGPT is projected to cost OpenAI $400 million per year. NVIDIA’s DGX A100 system, the workhorse that powers most LLMs, costs ~$300,000, and delivering outputs at high speed takes many of them, so the costs add up quickly. Experts estimate that incorporating ChatGPT into Bing will require Microsoft to spend $4 billion on infrastructure, and many Gen AI application startups are operating at a loss because their compute bills are too high.
This problem is only going to get worse. Over the coming years, LLMs will become a table-stakes piece of any product’s architecture. The compute required for inference will multiply thousands of times over today's usage, and we’re already at a breaking point. Something has to give in order for the promise of Generative AI to be fulfilled.
Enter Etched AI, which is designing chips purpose-built for LLM inference: “hardware for superintelligence,” as the founders call it. Etched’s ambition is audacious: take on the best technology companies in the world to power the universe of AI inference.
Solving the cost problem will require reimagining LLM hardware. All current AI workloads run on general-purpose AI accelerators, like NVIDIA’s GPUs or Google’s TPUs. These chips are way faster than normal computers, but because they need to support a variety of workloads, most of their circuitry isn’t useful for LLMs. As a result, costs skyrocket: you’re paying for circuitry you can’t use. The leading semiconductor startups were born to focus on AI needs for autonomous driving, and their foundations are not fit for the impending inference explosion.
The team at Etched is solving this problem with a novel approach to chip design that trades the flexibility of GPUs for far better performance when running LLMs. By making this tradeoff, their chips deliver more than 100x the performance of a similarly priced GPU cluster.
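To make the scale of that claim concrete, here is a back-of-envelope sketch using only the figures quoted above ($400 million per year for ChatGPT inference, ~$300,000 per DGX A100, $4 billion for Bing). It assumes, purely for illustration, that a performance-per-dollar gain translates linearly into a lower compute bill; real deployments have other costs (power, networking, staffing) that this ignores. The function names are ours and hypothetical, not anything from Etched.

```python
def scaled_annual_cost(annual_cost_usd: float, perf_per_dollar_gain: float) -> float:
    """Annual compute bill if the same workload ran on hardware with
    `perf_per_dollar_gain` times the performance per dollar.

    Idealized assumption: cost falls linearly with performance per dollar.
    """
    if perf_per_dollar_gain <= 0:
        raise ValueError("perf_per_dollar_gain must be positive")
    return annual_cost_usd / perf_per_dollar_gain


def systems_affordable(budget_usd: int, unit_price_usd: int) -> int:
    """Whole number of systems a budget buys at a given unit price."""
    return budget_usd // unit_price_usd


if __name__ == "__main__":
    # Figures quoted in this post; treat them as rough estimates.
    chatgpt_inference_per_year = 400_000_000
    dgx_a100_price = 300_000
    bing_infrastructure_budget = 4_000_000_000

    reduced = scaled_annual_cost(chatgpt_inference_per_year, 100)
    print(f"Inference bill at 100x performance per dollar: ${reduced:,.0f}")

    systems = systems_affordable(bing_infrastructure_budget, dgx_a100_price)
    print(f"DGX A100 systems a $4B budget would buy: {systems:,}")
```

Under these simplified assumptions, a 100x gain would turn a $400 million annual bill into roughly $4 million, which is the kind of shift that changes which products are economically viable.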
Etched is led by Gavin Uberti and Chris Zhu—two Harvard dropouts who operate in a stratosphere unfamiliar to most founders and certainly to us as investors. Gavin has worked with AI compilers for four years, guest lectured at Columbia, and spoken at a half dozen AI conferences; Chris has also worked in the tech industry and published original research.
As soon as we met Gavin and Chris, we knew they were special. Their vision aligned so closely with the thesis around AI hardware we had been developing internally at Primary that meeting them almost felt like fate. We are honored to be on this journey with them. They are joined by Mark Ross as Chief Architect, a veteran of the chip industry and former CTO of Cypress Semiconductor.
We believe we’re on the verge of the reinvention of software as we know it, powered by more capable and intelligent AI systems. Used correctly, we will all benefit. Our productivity will increase, the tools we use will be more helpful, and the services we pay for will satisfy our needs in a more personalized way. But this future is contingent on AI technology being affordable for everyone. Etched is a core piece of bringing this vision to life.
We are so excited to be partnering with a world-class group of investors: Koko Xu, Max Ventures, Matt Welsh and Justin Uberti (Cofounders, Fixie.ai), Michael Kohen (CEO, SparkAI), Ravi Vadrevu (Founder and CEO, Kalendar AI), Justin Wenig and Nicholas Diao (Cofounders, Coursedog), Devin Wenig (former CEO, eBay), J Zac Stein (President, Chief Product Officer, Lattice), and Rob Hayes (Partner, First Round Capital).