Teodor Duevski and Viacheslav Bazaliy
Additional contact information
Teodor Duevski: HEC Paris
Viacheslav Bazaliy: Boston Consulting Group
Abstract: We revisit the measurement of capital allocation concentration in the venture capital (VC) industry and highlight the shortcomings of the widely used Herfindahl-Hirschman Index (HHI). We show that HHI-based concentration measures are sensitive to discrepancies in VC database coverage, industry taxonomies, and classification granularity, yielding divergent trends even when applied to identical VC portfolios. To overcome these limitations, we develop a novel concentration metric using large language model (LLM) text embeddings that capture the semantic similarity among financed startups beyond predefined industry classifications. Using matched PitchBook and Crunchbase data, we validate our approach and show that OpenAI embeddings outperform alternative models on signal-to-noise and retrieval tasks. Applying this methodology, we document that aggregate VC capital allocation concentration has increased more sharply than suggested by HHI measures. A novel decomposition shows that 40% of the growth in capital allocation concentration stems from an increase in within-sector similarity among founded startups—an effect that industry-based measures do not capture.
Keywords: venture capital; Herfindahl–Hirschman index; text embeddings; capital allocation; investment concentration
41 pages, April 17, 2025
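The abstract's point that HHI depends on classification granularity can be illustrated with a small sketch. The portfolio, the labels, and the two-dimensional "embeddings" below are invented toy data, and `weighted_similarity` (a capital-weighted mean pairwise cosine similarity) is a hypothetical stand-in for an embedding-based measure; the abstract does not give the paper's formula, so this is not the authors' metric.

```python
import math

def hhi(amounts, labels):
    """Herfindahl-Hirschman Index: sum of squared capital shares per category."""
    totals = {}
    for amount, label in zip(amounts, labels):
        totals[label] = totals.get(label, 0.0) + amount
    grand_total = sum(totals.values())
    return sum((t / grand_total) ** 2 for t in totals.values())

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def weighted_similarity(amounts, embeddings):
    """Hypothetical measure (an assumption, not the paper's definition):
    mean cosine similarity over distinct startup pairs, weighted by the
    product of their capital shares."""
    total = sum(amounts)
    num = 0.0
    den = 0.0
    for i in range(len(amounts)):
        for j in range(len(amounts)):
            if i == j:
                continue
            w = (amounts[i] / total) * (amounts[j] / total)
            num += w * cosine(embeddings[i], embeddings[j])
            den += w
    return num / den

# Toy portfolio: capital invested in four startups (arbitrary units).
amounts = [40.0, 30.0, 20.0, 10.0]

# The same four startups under a coarse and a fine industry taxonomy.
coarse = ["software", "software", "health", "health"]
fine = ["fintech", "devtools", "biotech", "medtech"]

print(round(hhi(amounts, coarse), 2))  # 0.58
print(round(hhi(amounts, fine), 2))    # 0.3

# Toy 2-d vectors standing in for text embeddings of startup descriptions.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(weighted_similarity(amounts, toy_embeddings))
```

The identical portfolio scores 0.58 under the coarse taxonomy and 0.30 under the fine one, while the similarity-based number uses no industry labels and so does not change when the taxonomy does.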
RePEc:ebg:heccah:1557