How many companies should one have in a venture capital portfolio? It’s one of those questions where many people have opinions but nobody really knows the answer. It’s a tricky question, especially considering the lack of data on private markets and the power-law return distribution of venture capital.
The lack of a clear framework makes this question prone to attract bold and untestable theories by investors. Some say “You need to have a large/diversified portfolio since venture capital is a power-law game.”, and others say “You need to have a small/concentrated portfolio since venture capital is a power-law game.” Here the power-law becomes the Aether of the theory, making the theory untestable, hence making it a tautology until it is proven wrong at some future point. If you ask what Aether is, you can read my previous article outlining the concept.
With this in mind, my humble intention is not to come up with an answer to the question. It is rather to put forward a new way of approaching the problem at hand and to offer some concepts that could help attack it.
We need better than “giving money to talented people”
The venture capital community doesn’t have nuanced thinking on portfolio construction. The only existing works, by the few people who actually have proprietary data, compare the average returns for large portfolios vs. small portfolios and come up with faulty conclusions that cannot even match the sophistication of the work on portfolio theory from the 1950s.
On the other side of the spectrum, people with traditional finance schooling apply existing irrelevant theories to think about venture capital. The existing theories are unfit to solve the allocation problem in venture capital for the reasons to be mentioned in the following section.
We need a more nuanced portfolio framework than simply saying “I give money to talented people.” The question is a complex one touching many different disciplines, and the answer is unlikely to be coming from echo chambers of the venture community.
Defining the portfolio construction problem in private markets
The existing finance theory on portfolio construction is modeled with public markets in mind, and it is miles away from addressing the portfolio construction problem in venture capital for many reasons. Some immediate ones are as follows:
Venture capital returns demonstrate a power-law distribution, unlike the normal return distribution that is assumed in modern portfolio theory. It is another question whether the normality assumption applies to any type of return at all.
Venture capital is a game of sequential and independent decisions over 2-3 years that together create a portfolio. Unlike public market investors, you don’t have information on and access to all opportunities at a given time, so you cannot solve a concurrent optimization problem.
Venture capital is an asset class where the asset picks the investor, i.e., investors cannot invest in all opportunities they are willing to. In addition, most of the time, the amount of investment doesn’t reflect the investor’s desired exposure. It’s rather a result of the capital need of the company and how much allocation the investor can secure following the competition with other investors.
Hence, the portfolio construction problem in venture capital is far from being a simple optimization problem. It has elements of an optimization problem, but under a power law; of a search problem, due to sequential decisions; and of a matching problem, due to difficulties in securing allocation. In addition, there’s a lot of signaling going on between investors and entrepreneurs to seduce each other, which makes it a signaling problem as well.
For decision-making on investment opportunities, the investor defines her particular filtering, i.e. a group of conditions that investment opportunities need to satisfy. Then, she sequentially sees new opportunities and invests in the ones that satisfy this filtering based on her analysis. The filtering implicitly generates a selectivity threshold for investors. In a universe of 10,000 companies, some investors might apply the filtering of “B2B software startups from Y Combinator” that 100 companies satisfy. Some other investors may prefer the filtering of “B2B software startups from Y Combinator with technical CEOs” and generate a higher selectivity threshold that only 50 companies can satisfy.
Smaller portfolios and being selective result in high information portfolios
Boltzmann defines entropy in terms of the multiplicity of equivalent states, i.e., the number of equivalent microstates that could produce a given macrostate. So, entropy is basically a measure of how likely a state is. César Hidalgo uses Bugattis as a real-world example of entropy. In the set of all possible rearrangements of atoms, very few involve a Bugatti in perfect condition. The group of Bugatti wrecks, on the other hand, is a configuration with a higher multiplicity of states (higher entropy), and hence a configuration that embodies less information. The destruction of the Bugatti is the destruction of information. The creation of the Bugatti, on the other hand, is the embodiment of information. As in the Bugatti example, we see that disordered states tend to have high multiplicity and hence high entropy.
Let’s analyze the portfolio construction problem from an information theory perspective.
Macrostate → Portfolio
Microstates → Individual investment opportunities
The investor faces 10,000 investment opportunities
The investor has her filtering to apply to come up with investment decisions, e.g., B2B software startups from Y Combinator
100 investment opportunities satisfy the filtering “B2B software startups from Y Combinator”
Investor A can select a portfolio of 5 companies in 75,287,520 (100C5) different ways while she can select a portfolio of 10 companies in 17,310,309,456,440 (100C10) different ways.
If she uses more nuanced filtering (e.g., B2B software startups from Y Combinator with technical CEOs), she may end up with fewer companies, say 50, satisfying the criteria. This time, she can select a portfolio of 5 companies in 2,118,760 (50C5) different ways while she can select a portfolio of 10 companies in 10,272,278,170 (50C10) different ways. Two observations follow from the simple exercise:
Adding an extra condition to the filtering, thus being more selective, decreases multiplicity, hence decreasing entropy and creating a higher information state.
Limiting the portfolio size, thus having a more concentrated portfolio, decreases multiplicity, hence decreasing entropy and creating a higher information state.
Smaller portfolios and more selective filtering create high information portfolios. In case you ask whether it is something good or bad, the answer is that it depends on the quality of the information.
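The counts in the exercise above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names are mine, and the entropy is taken as the natural log of the multiplicity with Boltzmann’s constant set to 1, which is an assumption made for illustration rather than part of the argument.

```python
import math


def multiplicity(pool_size: int, portfolio_size: int) -> int:
    """Number of distinct portfolios of a given size that can be drawn from a pool."""
    return math.comb(pool_size, portfolio_size)


def entropy(pool_size: int, portfolio_size: int) -> float:
    """Boltzmann-style entropy, ln(W), with the constant k set to 1 (an assumption)."""
    return math.log(multiplicity(pool_size, portfolio_size))


# Pools of 100 and 50 companies, portfolios of 5 and 10 companies, as in the text.
for pool in (100, 50):
    for size in (5, 10):
        w = multiplicity(pool, size)
        s = entropy(pool, size)
        print(f"pool={pool:>3} portfolio={size:>2} multiplicity={w:,} entropy={s:.2f}")
```

Both observations show up in the output: for a fixed portfolio size, the smaller pool has the lower entropy, and for a fixed pool, the smaller portfolio has the lower entropy.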
High information portfolios can be good or bad depending on the information quality
When you pick a lower number of companies, you create a high information state. Still, this information can be random or plain wrong, and everything depends on the quality of the information you impose on a given state. A Bugatti car is a high information state but so is a Renault. “B2B software startups from Y Combinator with technical CEOs” is a higher information state than “B2B software startups from Y Combinator”, but so is “B2B software startups from Y Combinator with blue-eyed CEOs”. The latter doesn’t imply a better portfolio because being blue-eyed doesn’t correlate with company success, at least not that I know of.
In a power-law environment assuming only 4 of 10,000 companies have the potential to achieve a 1,000x return and make a real difference for investors:
Imposing conditions that highly correlate with the probability of success creates high information portfolios with high chances of outstanding returns. Let’s say you impose the condition of “B2B software startups from Y Combinator”, 100 of 10,000 companies satisfy this condition, and 3 of the 4 exceptional companies are within this set. Then, you are selecting from a far better pool of companies. This is what people mean when they say: “You need to have a small/concentrated portfolio since venture capital is a power-law game.”
Imposing conditions that don’t correlate with the probability of success increases the information level of the portfolio, but it doesn’t increase the expected return; it only increases the volatility of the outcome. Let’s say you impose the condition of blue eyes and only 1 of the 4 exceptional companies has a blue-eyed CEO. Then you are limiting your chances of success to that one company, whereas you could spread your eggs across all 4 exceptional companies and get the same expected return with a far more favorable risk profile. Imposing random information makes the portfolio more polarizing in a random way. This is what people mean when they say: “You need to have a large/diversified portfolio since venture capital is a power-law game.”
At the other end of the spectrum, you can also impose conditions that inversely correlate with the probability of success like “having a CEO with a long enterprise career”. Then you are reaching a higher information state but get lower expected returns with higher risk. Imposing bad information makes the portfolio more polarizing in a bad way. If the signal you use for decision-making doesn’t work, you’re cannibalizing both your returns and your diversification, which leaves you badly handicapped under a power-law return distribution.
The implication is that a high concentration portfolio, thus a high information portfolio, tends to be polarizing, either good or bad depending on the quality of your information.
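To put rough numbers on the scenarios above, here is a minimal sketch that treats the investor as picking companies uniformly at random from whatever pool her filtering leaves. That random-pick model, the portfolio size of 10, and the blue-eyed pool size of 2,500 (a quarter of the universe, holding 1 of the 4 exceptional companies) are my assumptions for illustration; only the 10,000, 100, 4, 3, and 1 figures come from the examples above.

```python
import math


def expected_hits(pool_size: int, winners_in_pool: int, portfolio_size: int) -> float:
    """Expected number of exceptional companies in a randomly drawn portfolio."""
    return portfolio_size * winners_in_pool / pool_size


def prob_at_least_one_hit(pool_size: int, winners_in_pool: int, portfolio_size: int) -> float:
    """Hypergeometric probability that the portfolio holds at least one exceptional company."""
    all_miss = math.comb(pool_size - winners_in_pool, portfolio_size)
    return 1 - all_miss / math.comb(pool_size, portfolio_size)


PORTFOLIO_SIZE = 10  # hypothetical portfolio size

# (label, companies passing the filter, exceptional companies among them)
scenarios = [
    ("no filtering", 10_000, 4),
    ("B2B software startups from Y Combinator", 100, 3),
    ("blue-eyed CEOs (hypothetical pool size)", 2_500, 1),
]

for label, pool, winners in scenarios:
    hits = expected_hits(pool, winners, PORTFOLIO_SIZE)
    p = prob_at_least_one_hit(pool, winners, PORTFOLIO_SIZE)
    print(f"{label}: expected hits={hits:.4f}, P(at least one)={p:.4f}")
```

Under these assumptions, the correlated filter lifts the expected number of exceptional companies in a 10-company portfolio from 0.004 to 0.3, while the blue-eyed filter leaves it at 0.004 and caps the portfolio at a single possible winner.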
Good investors know what they don’t know
Good investors know what they know, but they also know what they don’t know. In order to create high information and polarized portfolios, they impose new information. This new information may be their theses on the future of certain markets, their analyses of business models, or pattern recognition of founders. However, they don’t take the quality of the information for granted and don’t impose information they don’t trust. In such cases, they can be comfortable with a portfolio at a lower information state.
If you’re concentrating on a few companies, it is good to ask yourself: “What beliefs do I have that others don’t share and how much do I trust in the quality of my beliefs?”. If you’re diversifying, it is good to ask: “What beliefs of others that I don’t share lead me to lift these conditions to achieve a better risk profile?”.
Growth investors usually have more concentrated portfolios. When you ask about what information they impose that early-stage investors cannot, the answer is the extra information that is accumulated over the years and is clearly reflected in company financials. That makes sense. The follow-on investments of venture capital investors usually go to a handful of companies. When you ask about what information they impose that others cannot, the answer is the years of ongoing relationship with the team and the insider information it brings. That makes sense. Many emerging managers prefer diversified portfolios for initial funds as they build pattern recognition over time and increase concentration for their successive funds. That makes sense.
It is also okay if you don’t share the idea that many of the agreed-upon success signals of past decades will translate to the current world so you relax these constraints to come up with a larger portfolio, as long as you are deliberate in the strategy and know what you don’t know better than the others.
Every investor is different. Some have a very ingrained thesis on businesses and people, and it is normal that this leads to super-selective filtering and a lower number of companies in the portfolio. As long as the thesis is accurate enough, these people become legendary investors that others follow. Others don’t have super sophisticated theses but rely on a good enough filtering with 2-3 high-quality conditions, which can lead to repeated success, although usually to more moderate success. Anecdotally, it usually ends in failure when the latter try to play the strategy of a concentrated portfolio with a superficial thesis, or when the former relax certain constraints to diversify.
Investment theses allow information to endure
César Hidalgo argues that the existence of a ‘solid’ state of matter creates places where information can hide from destruction and grow. In this way, information can endure, be recombined, and grow continuously. We transfer information in the form of writing, objects, artifacts, etc., in solid form over millennia, whereas that would be impossible to do in liquid or gas.
Analogously, we can think of investment theses as the solid structures of the investing world that allow information to endure, get recombined, and have continuity. Investing is a pattern recognition game in an ever-changing environment with limited information and feedback. Ad-hoc information tends to get lost and lose its effectiveness very quickly. However, certain schools of investing like growth investing, early-stage investing, or the investment theses of certain firms (Founders Fund, USV, etc.) form solid structures for information accumulation and recombination, letting investors iterate on existing beliefs so that information endures.
Do we have a solution yet?
Absolutely not. And information theory is a seemingly irrelevant place to look for a solution. However, everything we do as investors is about access to information, analysis, synthesis, and decision-making, given a particular information set. Thinking this way, information theory is a proper toolkit to address the problem at hand, combined with economic theory addressing the decision-making element of it.
After all, the portfolio of an early-stage investor is a statement of how she thinks the future will unfold based on past information and some imagination. And I have a hunch that the core insights on this portfolio problem won’t be coming from the finance literature. But still, once we find the answer, people will call it ‘finance’.