Does it Make Sense to Invest in a Foundation Model Startup?
Foundation Model startups have all raised ridiculous amounts of money, but is it even a logical bet to take?
A few weeks ago, Mistral AI made huge waves, raising €105M (US$113M) within 4 weeks of starting the company.
We've seen similarly large raises for other AI startups such as Inflection AI at US$1.3B, and Cohere at US$270M.
For the uninformed outsider, these raises might seem pretty ludicrous. Why does anyone need over US$100M to start a business?
However, there's more to it than that.
In this article, we'll break down what these businesses are looking to achieve, why they need enormous sums of money to pull it off, why their efforts might just be futile and who the true winner is in this market.
Foundation Models vs Applications
"During a gold rush, sell shovels."
In the world of AI, Foundation Models are the shovels and the applications built on top are the gold. As with any core technological shift, building the infrastructure that serves as the base layer for everything else is hard and unsexy, but also more lucrative than rushing to build applications.
Despite all the hype and buzz around AI, it's a pretty thankless task to be a builder. Apps are easy to build in a weekend, but it's hard to defend them against competition and retain users. Foundation Models, on the other hand, are largely proprietary and highly defensible, but require a ridiculous amount of technical prowess and knowledge that very few people seem to have.
Whilst both apps and models have been able to attract large sums of capital, it is deployed for different purposes. For applications, the capital is squarely used for distribution purposes and incentives to encourage users to adopt and keep using the software. Daily free credits and sign-up bonuses within these AI apps are reminiscent of the 2010s when you hardly had to pay for an Uber ride or a DoorDash delivery. As competition continues to grow, I expect VC subsidisation to continue.
For foundation models, the money is spent on two things: 1) more data to use for training and 2) GPUs to train the model. As more data is acquired, more GPUs are required to handle the training process.
Buying GPUs is the costlier of the two. Inflection AI just raised US$1.3B and requires 22,000 NVIDIA H100 GPUs to train its model. At market rate, each GPU costs US$40,000, totalling US$880M on GPUs alone. Fortunately, NVIDIA is on their cap table and will likely provide the GPUs at a bulk discount, but I still suspect it cost a few hundred million dollars to buy the GPUs. Clearly, NVIDIA will be the true winner out of all of this AI hype.
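The GPU arithmetic can be checked with a quick sketch. The GPU count, unit price and raise are the figures quoted above; the discount levels are hypothetical, picked only to show how a bulk discount could bring the bill into the "few hundred million" range.

```python
# Figures quoted in the article.
GPU_COUNT = 22_000
LIST_PRICE_USD = 40_000
RAISE_USD = 1_300_000_000


def gpu_spend(count, unit_price, discount=0.0):
    """Total GPU spend in USD after a fractional bulk discount (0.0 = none)."""
    return count * unit_price * (1 - discount)


list_cost = gpu_spend(GPU_COUNT, LIST_PRICE_USD)
print(f"List price: US${list_cost / 1e6:,.0f}M "
      f"({list_cost / RAISE_USD:.0%} of the raise)")

# Hypothetical discount levels, for illustration only.
for discount in (0.25, 0.50):
    cost = gpu_spend(GPU_COUNT, LIST_PRICE_USD, discount)
    print(f"{discount:.0%} discount: US${cost / 1e6:,.0f}M")
```

Even at a steep discount, GPUs would swallow a large share of the round.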
In any case, it's clear that applications are largely distribution-led. The best applications might not have the best UX but will be those with superior customer acquisition strategies and, for the time being, that's heavily reliant on strong incentivisation schemes. This is a game that can only be played by a few and likely results in a lot of failed investments by VCs.
Foundation models, on the other hand, are trickier and more nuanced, but could pay off in a big way for VCs. There's more to it than just buying a ton of GPUs and data, though. A number of factors dictate whether a model gets widespread adoption or is confined to a certain niche (which can also be profitable).
The Foundation Model Opportunity
Whilst I've largely been against enormous funding rounds in the past, I still do believe there are legitimate reasons why someone might want to plough hundreds of millions of dollars into a foundation model company.
The payoff potential if a model receives widespread adoption is undoubtedly enormous. There is a strong chance that the future of foundation models tends towards an Android vs. Apple-type scenario which will easily result in billions if not trillions in terminal value for a company that is able to achieve such heights. Even if models were largely used in specific verticals such as financial services or healthcare, this can still be a multi-billion dollar outcome. The possibility to own a piece of such a business is a bet worth taking.
It's also quite feasible that, as long as models improve, providers will be able to keep expanding revenues by raising token prices. Though it's slightly confusing to compare pricing between different OpenAI models, as this experiment concludes, it costs 14-29x more to use the GPT-4 API than ChatGPT (GPT-3.5). Whilst that's a huge uplift in pricing, I would still expect future model upgrades to command an extra 2-5x uplift in pricing.
Moreover, if we also expect GPU capabilities to be on a diminishing marginal return curve, this would continue to compound defensibility for existing startups that have invested heavily in current-day GPUs. Training future models would become progressively cheaper for incumbents, since they would have already spent heavily on a bank of GPUs, and with higher pricing, the foundation model would literally be printing cash.
However, the future potential is not all rosy. There are still a number of ways foundation model companies can fail. By far the biggest threat to many of them is restricted access to data and limits on their ability to use that data for commercial purposes. This would have a huge negative impact on the viability of training large models and restrict the use cases for these models. Personally, whilst I do think some data sets will be restricted, I don't think this will have such a huge impact.
Outside of this, foundation models can have constrained economics in a few ways. If the cost to train models continues to increase and the revenue uplift from new models does not keep pace, this places a heavy strain on the margins of the business. On the flip side, if the cost to train new models diminishes heavily, then this lowers the defensibility of existing foundation model companies.
Additionally, the way applications interact with foundation models can also severely impact the economics and focus of the business. Through highly optimised prompt engineering, applications can significantly reduce the number of tokens required. This directly threatens the consumption-based business model that most foundation model businesses have adopted.
Fund Economics
So we've spoken about the size of the opportunity and the asymmetric upside of backing a Foundation Model company, but how does this play out once fund economics are taken into account?
Taking Mistral AI's seed round as an example, the company raised US$113M at a US$260M pre-money valuation. This results in 30.3% dilution, which is higher than a normal seed round but to be expected with such a large raise. The company raised this money from 13 investors (as per TechCrunch and PitchBook), with the round led by Lightspeed and participation from Redpoint Ventures, Index Ventures and several other European and US VCs.
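The dilution figure follows from the standard post-money calculation, sketched below with amounts in US$ millions. The function names are illustrative only.

```python
def post_money(pre_money, raised):
    """Post-money valuation: pre-money plus the new capital."""
    return pre_money + raised


def dilution(pre_money, raised):
    """Share of the company sold to the new investors in the round."""
    return raised / post_money(pre_money, raised)


PRE_MONEY = 260.0  # US$M, as quoted above
RAISED = 113.0     # US$M, as quoted above

print(f"Post-money: US${post_money(PRE_MONEY, RAISED):.0f}M")
print(f"Dilution: {dilution(PRE_MONEY, RAISED):.1%}")
```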
To really dig deep into fund economics, we'll be covering the investment from three different perspectives.
Please note, most of the numbers presented are estimates. I will try to call these out clearly, but this analysis is purely indicative and is not representative of how the round was actually constructed.
Before we jump in, here are a couple of assumptions I’ve made about the future capital raises for Mistral AI.
The first two rounds receive heavy markups at 5x. This is so that the company is able to raise a substantial amount of money without overdiluting the founders.
Each of the next five rounds carries 20% dilution, which is required given the capital-intensive nature of these businesses.
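To get a feel for what the dilution assumption implies, here is a small sketch of how a seed stake shrinks if the investor never follows on. It assumes the 20% dilution compounds round by round; the function name is made up for illustration.

```python
DILUTION_PER_ROUND = 0.20  # assumption stated above
FUTURE_ROUNDS = 5          # assumption stated above


def retained_fraction(dilution, rounds):
    """Fraction of an original stake left after `rounds` equal dilutions."""
    return (1 - dilution) ** rounds


for n in range(FUTURE_ROUNDS + 1):
    kept = retained_fraction(DILUTION_PER_ROUND, n)
    print(f"After {n} further round(s): {kept:.1%} of the seed stake remains")
```

Under these assumptions, an investor who sits out all five rounds keeps roughly a third of its original percentage ownership.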
Perspective 1: Lightspeed - the lead investor
Lightspeed-specific assumptions are as follows:
Lightspeed's latest Seed - Series B (Lightspeed Venture Partners XIV) fund is US$1.98B
Lightspeed invested more than half the round with a US$60M cheque, representing 3% of the fund
From the table, we have two different scenarios. In the first, Lightspeed does not make any further investment in the company. If this occurs, Mistral AI needs to be valued at over US$26 billion to return the fund.
In the second, Lightspeed maintains its pro-rata rights through to Series C and is able to return the fund at a US$13 billion valuation. However, this scenario would require an additional US$840M in capital across three rounds. It's likely that this would be sourced from some of Lightspeed's growth funds, making the economics a bit messy.
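The "return the fund" threshold is simply the fund size divided by the ownership held at exit. A rough sketch of the no-follow-on scenario is below, in US$ millions. It assumes Lightspeed's stake is its cheque divided by the seed post-money valuation and that each later round dilutes that stake by 20%; the number of rounds before exit is left as an input. This is a simplification, so its outputs should not be expected to match the table exactly.

```python
FUND_SIZE = 1_980.0              # US$M, Lightspeed fund size from above
CHEQUE = 60.0                    # US$M, assumed Lightspeed cheque
SEED_POST_MONEY = 260.0 + 113.0  # US$M, pre-money plus round size
DILUTION = 0.20                  # assumed dilution per later round


def ownership_at_exit(cheque, post_money, dilution, rounds):
    """Initial stake, diluted over `rounds` later rounds with no follow-on."""
    return (cheque / post_money) * (1 - dilution) ** rounds


def valuation_to_return_fund(fund_size, ownership):
    """Exit valuation at which the stake is worth the whole fund."""
    return fund_size / ownership


for rounds in range(6):
    stake = ownership_at_exit(CHEQUE, SEED_POST_MONEY, DILUTION, rounds)
    needed = valuation_to_return_fund(FUND_SIZE, stake)
    print(f"{rounds} later round(s): stake {stake:.1%}, "
          f"exit valuation needed US${needed / 1_000:.1f}B")
```

The same two functions can be reused for any fund by swapping in its cheque and fund size.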
Perspective 2: Redpoint Ventures - a large seed fund
Redpoint-specific assumptions are as follows:
Redpoint's Seed - Series A fund is US$650M (as per PitchBook)
Redpoint invests US$15M in this round, representing roughly 2% of the fund
Similar to the above, we have two scenarios for Redpoint. If the fund makes only one investment in the company, it requires the business to be valued at US$52 billion, twice Lightspeed's threshold.
If Redpoint follows on, it is able to return the fund at Series C with a US$13 billion valuation. However, this would require an additional investment of US$210M across three rounds. Similar to Lightspeed, these additional investments might be sourced from other vehicles.
Perspective 3: LocalGlobe - a smaller seed fund
LocalGlobe-specific assumptions are as follows:
LocalGlobe's fund size is US$214.66M (target fund size filed with the SEC here)
LocalGlobe invests US$1.3M in the round, representing 0.6% of their fund size. I've assumed that Lightspeed, Index and Redpoint contribute US$100M to the round, with the remaining 10 smaller investors splitting the US$13M remaining.
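The cheque sizes assumed for the three funds can be put side by side with a short sketch, in US$ millions. Approximating each seed stake as cheque divided by post-money valuation is my simplification, and the names below are illustrative.

```python
ROUND_SIZE = 113.0
POST_MONEY = 260.0 + ROUND_SIZE

# (assumed cheque, fund size) per investor, taken from the assumptions above.
FUNDS = {
    "Lightspeed": (60.0, 1_980.0),
    "Redpoint": (15.0, 650.0),
    "LocalGlobe": (1.3, 214.66),
}


def share_of_fund(cheque, fund_size):
    """Cheque as a fraction of the fund it comes from."""
    return cheque / fund_size


def seed_ownership(cheque, post_money):
    """Approximate stake bought at the seed round."""
    return cheque / post_money


# US$13M left after the assumed US$100M from the three largest investors,
# split evenly across the other 10 investors.
small_cheque = (ROUND_SIZE - 100.0) / 10
print(f"Cheque per smaller investor: US${small_cheque:.1f}M")

for name, (cheque, fund_size) in FUNDS.items():
    print(f"{name}: {share_of_fund(cheque, fund_size):.1%} of fund, "
          f"{seed_ownership(cheque, POST_MONEY):.2%} of Mistral AI")
```

The gap in starting ownership between the lead and a small participant is what drives the difference in return thresholds.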
Unlike the other two funds, LocalGlobe's scenario is not that great. In both cases, the investment fails to return the fund, even if the company reaches a US$65B valuation. This is primarily due to not having enough ownership now, whilst also being overdiluted in the future.
To justify this investment, LocalGlobe is either underwriting it as a US$100B+ outcome, or placing heavy importance on the flow-on effects of being associated with Mistral: greater brand awareness, thematic deal flow and access to other AI-related deals.
Whilst the numbers above are driven by assumptions, I think they are directionally correct. At the moment, investing in a foundation model company could be a fruitful endeavour, but it's certainly an expensive journey fraught with many challenges.
Does it Make Sense to Invest in a Foundation Model Startup?
Purely on financial outcomes, it likely doesn’t make sense for a small or even mid-sized fund to make a small investment as part of a big round. You’ll need to participate in a big way and be prepared to write a cheque of 5% or more of your fund if you want to significantly lower the threshold for the company to return your fund.
For large funds, the concern quickly shifts to whether they own enough of the company, and whether they are able to maintain and potentially increase their ownership stake over time. Here, it’s harder to forecast scenarios without spurious assumptions as multiple fund vehicles will be involved.
My view is that we’re still quite early in the Foundation Model journey. The models that we use today most likely won’t be the models we use tomorrow. At the moment, fundraises are effectively paying for R&D, and this is a necessary part of the cycle. For 90% of funds, it likely doesn’t make much financial sense to be backing this type of company. You need a unique combination of fund size, platform and deal terms for this to be a good investment.
Large fundraises are always a contentious topic, so I’d love to hear your thoughts and opinions in the comments below!
Thanks for reading and see you next time!
Abhi