An Allocator’s Manifesto: Why is every single fund top quartile?
Have you ever asked yourself the same question?
We have been investing in venture funds for the past four years and have rarely seen a fund pitch to us that isn’t top quartile. By the statistical definition of a quartile, this shouldn’t be possible. And it is ridiculous that we don’t call this out, and that being top quartile still passes as the core performance indicator in our industry.
Our desire to measure things often pushes us to use metrics not because they are informative, but simply because they let us measure. Some call this the streetlight fallacy: named after the drunk man looking for his lost keys under the streetlight, not because he lost them there, but because that is where the light is. We want so badly to measure an industry that is often incommensurable that we settle for metrics that are readily available but uninformative and inaccurate.
In the case of venture returns, quartiles are preferred mainly because they are robust statistical summaries of the empirical distribution of fund returns. But in the presence of power laws, robust statistics are not robust, and the empirical distribution is not empirical. Let’s outline this below.
The Winner Takes It All
First, the median and quartiles are possibly the worst metrics for measuring any kind of return drawn from a fat-tailed distribution. Most economists and statisticians would disagree with this statement; I spent enough time around them early in my life to know their love for metrics that keep outliers from messing up their elegant models.
Statistical robustness is about finding metrics that can absorb tail events and large deviations without changing much. For measuring venture returns, the median is used instead of the mean because it is the more robust estimator: it represents the typical outcome (i.e., middle-of-the-road funds), whereas the mean is sensitive to the outliers (i.e., a few exceptional funds) that skew the calculation.
Conversely, for a practitioner in a power-law industry, all the information is in the tails and the outliers. A robust metric doesn’t vary in response to tail events, not because it is a good metric, but because it is uninformative. Instead, we should work with statistics that are not robust but sensitive to outliers, as our very existence is about catching the next Google, Palantir, or Stripe in our portfolios. In distributions with thick tails, as Nassim Taleb says, the tail wags the dog: the relevant information is all in the tail, not in the body. Why would an allocator want statistical estimators that ignore the tail and overweight the center of the distribution, which is predominantly noise?
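To make this concrete, here is a minimal sketch in Python, assuming a stylized Pareto distribution of fund TVPIs (the parameters and the outlier value are illustrative, not drawn from any benchmark): add a single outlier fund to a simulated vintage and watch the mean move while the median and the top-quartile cutoff barely register it.

```python
# A minimal sketch, not the author's method: simulate one vintage of fund TVPIs
# from a stylized heavy-tailed (Pareto) distribution, then add a single outlier
# fund and compare how the mean vs. the median and top-quartile cutoff react.
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical vintage of 100 funds; alpha and scale are illustrative assumptions.
alpha, scale = 1.5, 0.5
tvpi = scale * (1 + rng.pareto(alpha, size=100))

# One outlier fund, e.g. the fund that caught the next Google (value is made up).
with_outlier = np.append(tvpi, 60.0)

for label, sample in [("without outlier", tvpi), ("with outlier   ", with_outlier)]:
    print(label,
          f"mean={sample.mean():.2f}",
          f"median={np.median(sample):.2f}",
          f"p75={np.percentile(sample, 75):.2f}")
# The median and 75th percentile barely move; the mean jumps. "Robust" here just
# means the estimator ignores the only event that matters.
```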
Some might still prefer the median and quartiles because they give a better picture of the central parts of the distribution, i.e., a better sense of average (2nd quartile) funds and good (3rd quartile) funds. They indeed do. But I would counter that what they really do is hide how wide the gap is between the top quartile and the top percentile. They give the industry license to be mediocre, blinding allocators to outlier returns and to the characteristics of outlier funds.
The median justifies spending time with good funds, but a good fund is rarely a reliable proxy for a great fund, either in how it looks or in how its returns benchmark. Why measure against the good when almost all value creation is driven by a few greats? There’s no point in looking for a consolation prize in venture, because the winner takes it all.
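Under the same stylized Pareto assumption as above (again illustrative, not benchmark data), one can eyeball how wide that gap gets: the sketch below prints the median, the top-quartile cutoff, and the top-percentile cutoff for a simulated universe of funds, along with the share of total value created by the top 1%.

```python
# A minimal sketch under the same stylized Pareto assumption (illustrative
# parameters, not benchmark data): how far apart are the median, the top-quartile
# cutoff, and the top-percentile cutoff, and how much of the total value do the
# top 1% of funds create?
import numpy as np

rng = np.random.default_rng(7)
alpha, scale = 1.5, 0.5
tvpi = scale * (1 + rng.pareto(alpha, size=10_000))  # hypothetical universe of funds

p50, p75, p99 = np.percentile(tvpi, [50, 75, 99])
top1_share = tvpi[tvpi >= p99].sum() / tvpi.sum()

print(f"median={p50:.1f}x  top-quartile cutoff={p75:.1f}x  top-percentile cutoff={p99:.1f}x")
print(f"share of total value created by the top 1% of funds: {top1_share:.0%}")
# With a tail this heavy, the gap between p75 and p99 dwarfs the gap between
# p50 and p75: benchmarking against the good says almost nothing about the great.
```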
In-sample vs. out-of-sample
Second, in fund decks and track records, the relevant benchmarks are mostly cited by hundreds of funds that are not actually part of the underlying sample. A manager reports that their fund is top quartile against the Cambridge Associates benchmark for a given vintage, but the fund is not part of the sample used to construct that benchmark; it is an out-of-sample observation.
So there’s a hidden assumption here: that in-sample benchmarks translate well to out-of-sample observations as an indicator of performance. But one thing about empirical distributions under power laws is that they don’t translate well to out-of-sample inference. If you compute the empirical mean of past flood levels, that average systematically underestimates the true mean of the process. The same applies to financial markets, and even more so to venture capital, where data collection is often arbitrary and sample sizes are small. And, unlike an academic, a practitioner has to care about what is real rather than what is measurable.
The median and quartiles don’t suffer from this problem as much, given the robustness we touched on in the first section: the in-sample median translates much better to the out-of-sample median. But that is precisely because the median is an uninformative metric under a power law, one that trims away the impact of the outlier companies and funds that return almost all the money in a given vintage.
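A small simulation makes both halves of this point visible. The sketch below, my illustration rather than anything from a benchmark provider, assumes fund outcomes follow a Pareto distribution with tail exponent 1.5 (so the true mean exists but the variance does not), draws many benchmark-sized samples, and checks how often the in-sample mean understates the true mean, and how stable the in-sample median is by comparison.

```python
# A minimal sketch (my illustration, not the article's): draw many benchmark-sized
# samples from a Pareto distribution whose true mean is known, and check how often
# the in-sample mean understates that true mean, versus how stable the in-sample
# median is.
import numpy as np

rng = np.random.default_rng(11)
alpha = 1.5                       # tail exponent: the mean exists, the variance does not
true_mean = alpha / (alpha - 1)   # mean of a Pareto with minimum 1 and exponent 1.5: 3.0
true_median = 2 ** (1 / alpha)    # about 1.59

n_funds, n_trials = 50, 10_000    # hypothetical benchmark samples of 50 funds each
samples = 1 + rng.pareto(alpha, size=(n_trials, n_funds))  # Pareto with minimum 1

share_mean_understates = (samples.mean(axis=1) < true_mean).mean()
median_rel_error = (np.abs(np.median(samples, axis=1) - true_median) / true_median).mean()

print(f"samples whose mean understates the true mean: {share_mean_understates:.0%}")
print(f"average relative error of the sample median:  {median_rel_error:.0%}")
# The sample mean understates the true mean in most trials; the median is stable
# out of sample, but only because it never sees the tail at all.
```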
Post your best shots
Just as people post only their best shots online, there’s selectivity in when, and with which metrics, funds fundraise. If a manager knows which top-quartile benchmark they’ll be measured against, they have many ways to game it.
First, one can time the fundraise to follow a few markups. The fund may be top quartile only half the time, but it only fundraises in the periods when it is.
Second, one can be a little generous with portfolio valuations, saving the markdowns for after the fundraise. Knowing the benchmarks, they can add a few touches here and there to end up top quartile.
Third, one can report and highlight selectively. A firm might have some funds that are not top quartile, but it’s up to them which ones to emphasize. If a fund is top quartile on IRR but not on TVPI, they can lead with IRR.
Fourth, one can pick the benchmark that makes them look good. Often, especially after applying a few of the steps above, the gap between the fund’s performance and the top-quartile cutoff is narrow, so one can cherry-pick across data providers to find a benchmark that fits the bill.
In short, in a world where we blindly follow an uninformative metric that doesn’t hold up out of sample, there are many ways to game it. The sketch below illustrates just the first of them.
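This is a toy illustration of the timing trick alone; every number in it is invented for the sake of the example.

```python
# A toy illustration of the first trick alone; every number here is invented.
# A fund's percentile rank drifts quarter to quarter, and the manager only
# pitches LPs in the quarters when the rank clears the top-quartile bar.
import numpy as np

# Hypothetical percentile rank of the fund over 12 quarters.
rank = np.array([68, 72, 77, 81, 74, 69, 79, 83, 71, 66, 78, 73])

top_quartile = rank >= 75                  # quarters in which it is genuinely top quartile
pitch_quarters = np.where(top_quartile)[0] # the only quarters any LP sees a deck

print("share of quarters the fund is actually top quartile:", round(top_quartile.mean(), 2))
print("quarters it chooses to fundraise in:", pitch_quarters.tolist())
# The fund is top quartile about 40% of the time, yet every deck an LP ever sees
# says "top quartile".
```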
So, where to look instead?
I won’t give you any fancy mathematical formulas, and not because there aren’t any: there’s a broad enough literature on power-law statistics for those interested, and I’ll be writing more about it in the coming period.
I won’t give you any fancy mathematical formula because ours is a world where the outliers carry all the signal. And the good thing about outliers is that there are very few of them. Every year, one only needs to study a handful of funds and their portfolios deeply, rather than trying to aggregate them into a number in which they get lost amid the garbage and noise.
Still, the bad thing is that these funds are harder to find in public databases or to come across day to day, especially if one spends most of one’s time with funds that are not outliers, simply because they are top quartile. That makes finding the outliers, and giving them the attention they deserve, hard enough.
In an industry where the few winners take it all, one needs to spend time and attention on outliers and potential outliers, and zoom in on their qualities rather than on aggregates. Spending lots of time analyzing what’s merely good shapes an investor’s pattern recognition in the wrong way. In venture, the top quartile looks very different from the top percentile, and the good looks very different from the great.
Interesting read! Would love to get your views on which robust metrics to track to study the great funds. How can we judge investor pattern recognition?
Hey Yavuzhan, insightful article as usual! While power law dynamics clearly dominate venture capital returns, I think median and quartile metrics still have practical value in specific contexts, especially in early-stage fund screening. I was thinking about it from three perspectives:
Median as a Baseline Filter: Rather than being a weakness, the median’s robustness to outliers can actually be a strength for LPs. For example, it helps quickly exclude systemic underperformers (such as funds below the 25th percentile) and provides a realistic baseline for emerging managers, whose early J-curve struggles don’t always mean long-term underperformance.
Quartiles and Fund Survival: Quartile positioning, while imperfect, often signals fund survival rates, which are a prerequisite for accessing outlier returns. For instance, funds in the lower quartiles with slower deployment rates are statistically more likely to exit the game before capturing power law dynamics.
A Balanced Approach: Instead of discarding these metrics, they could act as an initial filter for eliminating weaker funds, followed by deeper due diligence focused on power law potential and GP alignment.
Curious to hear your thoughts. Do you think ignoring median and quartile metrics entirely might introduce unnecessary noise in fund selection, or are there better alternatives for this initial phase? Appreciate your insight always!