“The value of a non-pecuniary (free) product is inherently difficult to assess,” wrote Manuel Hoffmann, Frank Nagle, and Yanuo Zhou in their recent working paper, “The Value of Open Source Software.” “A pervasive example is open source software (OSS), a global public good that plays a vital role in the economy and is foundational for most technology we use today.” The paper reminded us that in 2011, VC Marc Andreessen famously argued that software is eating the world. Then two years ago, VC Joseph Jacks added that open source is eating software FASTER than software is eating the world.
“Other recent studies have come to similar conclusions showing that open source software (OSS) — software whose source code is publicly available for inspection, use, and modification and is often created in a decentralized manner and distributed for free — appears in 96% of codebases, and that some commercial software consists of up to 99.9% freely available OSS,” said the paper. “Although in its early days OSS frequently copied features from existing proprietary software, OSS today includes cutting edge technology in various fields including artificial intelligence (AI), quantum computing, big data, and analytics.”
But despite the increasing importance of OSS to the economy, its overall impact has been difficult to quantify. The value created by a good or service is traditionally measured by multiplying its price p by the quantity sold q. With OSS, p is generally zero because the source code is publicly available, and the actual usage q has been very hard to measure.
In the past few years, a number of efforts have aimed to evaluate the economic value of OSS. For example, in September of 2022, the European Commission (EC) published a report on “The impact of open source software and hardware on technological independence, competitiveness and innovation in the EU economy.”
The study estimated that in 2018, there were at least 260,000 OSS contributors to GitHub in EU countries, and that in 2018 companies located in the EU invested around €1 billion in OSS. An analysis of data from EU member states indicated that the overall economic impact of OSS in 2018 was between €65 and €95 billion.
More recently, in March of 2023, Linux Foundation Research published Measuring the Economic Value of Open Source, a study led by UC Berkeley professor Henry Chesbrough. To understand the motivation and estimated economic value that have led companies to embrace OSS, Chesbrough and his collaborators conducted a survey of senior executives, mostly from Fortune 500 companies.
“Considering the various questions and their respective responses, it is quite clear that respondents perceive OSS to have substantial economic value,” wrote Chesbrough in conclusion. “The perceived benefits clearly exceed the perceived costs for a strong majority of respondents — 60% to 75%, depending on the specific question. And the ratio of benefits to costs appears to be rising for nearly half of the respondents, while only 16% felt that the ratio was declining. This strongly suggests that the value of OSS will increase even further in the future for most participating organizations.”
The study conducted by Hoffmann, Nagle, and Zhou aimed to gain insights into the use of OSS by analyzing data from two primary sources, the Census II of Free and Open Source Software – Application Libraries, and the BuiltWith dataset.
The Census II project relies on data from software composition analysis (SCA) companies to identify the most widely used OSS at tens of thousands of firms around the world. SCA tools are used to detect all third-party software components in use within a software product in order to ensure that the product is not violating OSS or other IP licensing requirements, to help reduce the risks of security vulnerabilities, and to identify obsolete components. As a byproduct, SCA companies are able to track all the OSS code used by the customers of a software product, as well as any additional products those customers might build.
The BuiltWith dataset uses scans of nearly nine million websites to identify all the underlying technology being used in those websites, including OSS libraries. While the Census II data is focused on the OSS that’s built into the software a company sells, i.e., the supply-side value, the BuiltWith data is focused on the OSS that’s actually used in a company’s website, i.e., the demand-side value.
“In aggregate, these two datasets combined create the most complete measurement of OSS usage to date. Further, by focusing on OSS that is widely deployed and used by firms, rather than considering all the projects that exist in an OSS repository, we enhance the methodologies of prior studies by reducing the likelihood of measurement error stemming from projects that are posted as publicly available OSS but are not actually used in any practical manner. Not accounting for this measurement error would lead to overestimation of the actual value of OSS as projects that are widely used would be valued in the same way as projects that are not used at all.”
To estimate p, the supply-side economic value of recreating the open source software, the authors calculated the labor cost for an individual firm to recreate an OSS package by measuring the lines of code in the package, estimating the person-hours it would take to write the code from scratch, and then using global wage data to estimate the labor costs a firm would incur if the OSS didn’t exist. They then calculated the demand-side economic value, q, by similarly estimating the replacement costs for each firm that uses the OSS if it didn’t exist.
Based on these estimates, the supply-side value of recreating all widely used OSS is around $4.2 billion, ranging from $1.2 billion to $6.2 billion depending on a number of assumptions. The demand-side economic value, based on estimates of the actual usage of OSS, is much larger at $8.8 trillion, ranging from $2.6 trillion to $13.2 trillion.
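The replacement-cost logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the effort formula is the classic basic COCOMO power law (organic-mode coefficients 2.4 and 1.05), the 152-hour working month is the COCOMO convention, and the package sizes, firm counts, and hourly wage are made-up values.

```python
# Sketch of a replacement-cost valuation of OSS packages.
# ASSUMPTIONS (not taken from the paper): basic COCOMO coefficients,
# a single flat hourly wage, and hypothetical package data.

HOURS_PER_PERSON_MONTH = 152  # conventional COCOMO working month


def person_months(lines_of_code, a=2.4, b=1.05):
    """Effort estimate in person-months: a * KLOC ** b."""
    kloc = lines_of_code / 1000
    return a * kloc ** b


def recreation_cost(lines_of_code, hourly_wage):
    """Labor cost for one firm to rewrite a package from scratch."""
    hours = person_months(lines_of_code) * HOURS_PER_PERSON_MONTH
    return hours * hourly_wage


def supply_side_value(packages, hourly_wage):
    """Cost of writing every package exactly once."""
    return sum(recreation_cost(p["loc"], hourly_wage) for p in packages)


def demand_side_value(packages, hourly_wage):
    """Cost if every firm using a package had to rewrite it itself."""
    return sum(
        recreation_cost(p["loc"], hourly_wage) * p["firms_using"]
        for p in packages
    )


# Hypothetical packages: size in lines of code and number of firms using each.
packages = [
    {"name": "pkg_a", "loc": 50_000, "firms_using": 1_200},
    {"name": "pkg_b", "loc": 8_000, "firms_using": 300},
]

supply = supply_side_value(packages, hourly_wage=40.0)
demand = demand_side_value(packages, hourly_wage=40.0)
print(f"supply-side value: ${supply:,.0f}")
print(f"demand-side value: ${demand:,.0f}")
```

Because the demand-side figure multiplies each package's recreation cost by the number of firms that rely on it, it grows far faster than the supply-side figure as usage spreads, which is the intuition behind the gap between the two estimates.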
In addition, the analysis found that the bulk of the OSS code actually used by firms is created by a very small number of programmers. About 3,000 programmers, or 5% of OSS developers, generate over 93% of the supply-side economic value and 96% of the demand-side value.
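A concentration figure of this kind can be computed by ranking contributors by the value attributed to them and summing the top slice. The sketch below is a minimal illustration; the function name `top_share` and the contribution numbers are hypothetical and do not come from the paper's data.

```python
def top_share(values, top_percent=5):
    """Fraction of total value produced by the top `top_percent` of contributors."""
    if not values:
        raise ValueError("values must be non-empty")
    ranked = sorted(values, reverse=True)
    # Integer ceiling of len * top_percent / 100, with at least one contributor.
    k = max(1, (len(ranked) * top_percent + 99) // 100)
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical data: 100 contributors, 5 of whom account for most of the value.
contributions = [1000.0] * 5 + [3.0] * 95
share = top_share(contributions)
print(f"top 5% of contributors produce {share:.1%} of the value")
```

In a real analysis, the per-contributor value would come from attributing each package's estimated value to the developers who wrote it, which requires commit-level data that this sketch does not model.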
The industry with the highest demand-side economic value is Professional, Scientific, and Technical Services, followed by Retail Trade, and by Administrative and Support. Not surprisingly, non-service sector industries constitute a small portion of the demand-side usage value, including Mining, Quarrying, and Oil and Gas Extraction; Utilities; and Agriculture, Forestry, Fishing, and Hunting.
Finally, Hoffmann, Nagle, and Zhou explained that although they highlight the substantial value that OSS has in our society based on a wide swath of usage data, it is not feasible to identify 100% of the OSS used across the world and, as such, their demand-side estimates are likely an underestimate of the true value.
“In aggregate, our results show the substantial value that OSS contributes to the economy despite this value generally showing up as zero via direct measurement since prices equal zero and quantity is difficult to measure using public data alone. Our research lays the groundwork for future studies of not only OSS, but of all IT and its growing impact on the global economy.”