Irving Wladawsky-Berger

A collection of observations, news and resources on the changing nature of innovation, technology, leadership, and other subjects.

In May of 2025, Linux Foundation Research, the Eclipse Foundation, and the open source community GOSIM jointly organized the Open Source AI Strategy Forum in Paris. The Forum aimed to address the critical challenges facing collaboration in open source AI with global experts from industry, academia, open source communities, and foundations. Two months later they published “Charting Strategic Directions for Global Collaboration in Open Source AI,” a report that summarizes the key highlights and takeaways from the forum.

“It is a common saying that software is eating the world, a phrase that venture capitalist Marc Andreessen coined in 2011 to describe how software systems are disrupting traditional industries across the global economy,” wrote LF senior researcher Cailean Osborne in the report’s Introduction. “In 2022, upon observing the accelerating growth rate of market capture by open-core companies versus closed-core companies, venture capitalist Joseph Jacks extended Andreessen’s metaphor, remarking that open source is eating software faster than software is eating the world.”

“Nowadays, we may be tempted to make a similar argument about AI, with open source AI solutions catching up with the capabilities of proprietary AI solutions from large language models (LLMs) to AI agents, and offering an alternative approach to AI development and governance rooted in the principles of open science and innovation,” the report added. “The open source AI ecosystem has undergone meteoric growth in recent years. There are now over 1.5 million models on the Hugging Face Hub, and some have hundreds of millions of downloads. Those models are steadily catching up with the capabilities of proprietary models.”

How does the original promise of the open source movement now apply to AI? This is a question I’ve been quite interested in given my personal involvement with open source initiatives: first helping to organize IBM’s Linux initiative in the 2000s, and for the past four years as a member of the Advisory Board of Linux Foundation Research.

The definition of open source software (OSS) has been generally accepted for about 25 years: “OSS is software with source code that anyone can inspect, modify, and enhance.” The Open Source Definition points out that beyond access to the source code, OSS must comply with a few additional criteria, including free redistribution and derived works. Many of the original promises and accomplishments of the open source software movement have now been projected onto open AI, from the promise that open source could democratize the development of AI systems to the perspective that open source levels the playing field, allowing the most innovative to triumph.

But open AI is quite different from OSS because AI systems don’t behave like traditional software. To begin with, openness in AI is a hard concept to define, in part because AI itself is not clearly defined, and neither is what ‘open’ means when dealing with highly complex systems like AI. To this day, there is no universally agreed definition of ‘open AI’ or ‘open source AI’ even as attention on the topic has exploded. Open AI systems require distinct definitions, protocols, and development processes from those used in open source software.

A major difference, in my opinion, is that while software has played a central role in the evolution of IT systems, data has been playing the central role in the advances of AI over the past few decades.

Against this backdrop, forum participants explored five key topics in their panels:

  • Fostering global collaboration in open source AI: How can the international community work together on open source AI research and development despite growing geopolitical tensions and regional divergence on AI governance?
  • Open source AI and digital sovereignty: How can governments strengthen their digital sovereignty without fragmenting the global open source ecosystem? 
  • Open source AI research and reproducibility: What role does open source play in scientific discovery and reproducible research? 
  • Open source AI adoption challenges in enterprise settings: What are the barriers to enterprise adoption of open source AI solutions, and what could facilitate this adoption?
  • Ensuring responsible practices in open source AI: How do we ensure ethical and responsible development practices in open source AI innovation?

Let me summarize a few of the forum’s key findings for each of these topics.

Fostering global collaboration in open source AI 

Building international consensus on the definition of open source AI. The panelists agreed that developing a common understanding of open source AI is imperative in light of the divergent practices in the openness and licensing of AI models, “with most ‘open’ models failing to uphold the four freedoms of open source (i.e., use, study, modify, and redistribute).”

Geopolitical and regulatory challenges for open source AI collaboration. Panelists expressed concerns about the risk of regulatory fragmentation hindering the sharing of AI models, software, datasets, and knowledge. “These risks underline the need for international cooperation on AI regulation.”

Frameworks for global collaboration in open source AI. “Collaboration via open source represents a means of keeping up with the pace of AI innovation, providing open development processes and governance frameworks that can help deal with unknown unknowns as well as to manage known unknowns as they emerge.”

The economic challenges of open source innovation. The costs of AI development are significantly higher than those of traditional software development. It’s not clear that the business models that work for open source software companies can sustain open source AI enterprises.

The promise of open source AI and digital sovereignty

Joint infrastructure investments. The panelists underlined open source “as a means for governments to simultaneously strengthen their digital sovereignty and foster global collaboration in AI, as well as a means of diplomacy between AI researchers and developers in the changing geopolitical landscape.”

Openness as policy. Governments should embrace “openness as policy” to increase their global competitiveness in AI. This entails leveraging “five interconnected pillars of open source AI — open science, open standards, open source, open data, and open weights — as strategic tools to level the playing field in AI R&D and commercialization, build an interoperable, open technology ecosystem, and ultimately create a collective counterbalance to the market power of global technology giants.”

Open source AI research and reproducibility

Open source is fundamental to reproducibility in AI research. Panelists highlighted that today, “most LLMs and their underlying training processes and data remain closed, which severely limits the possibilities for scientists to independently verify claims about model performance, investigate potential biases, or build upon existing research in a rigorous manner. They argued that the entire pipeline of AI models — from dataset composition to model training to evaluation must be open to achieve true reproducibility.”

Openness is key to AI adoption in scientific discovery. While foundation models have the potential to facilitate scientific discovery in medicine, chemistry, biology, and other fields, “their black-box nature and lack of openness often deter researchers from applying them in their research.” Panelists urged the research community “to establish a virtuous cycle of collecting and sharing real and synthetic research data, cultivating AI skills among scientists, and strategically identifying research problems where the adoption of foundation models can benefit scientific discovery.”

Open source AI adoption challenges in enterprise settings

The challenges of enterprise adoption of open source AI solutions. Panelists pointed out that “a significant gap exists between the performance of models on benchmarks achieved in controlled research settings and their actual readiness to be deployed in production.” This gap represents one of the most significant barriers to the widespread adoption of open models by enterprises.

Building trust and ensuring the reliability of AI solutions remains a crucial challenge. Enterprises must be confident that open AI solutions can be trusted and comply with safety standards and regulatory requirements before widespread adoption.

Promoting responsible development practices in open source AI

Translating ethical principles into practice. Panelists pointed out that open source increases the transparency of AI processes, “and the openness of AI technologies, such as software and models, enables distributed auditing for potential biases and vulnerabilities, in turn enhancing trust and safety in these technologies.”

The role of regulation in fostering responsible open source AI development. A key goal of regulation should be to shape safe and responsible innovation.

Demonstrate and document: Leading by example in the open source community. Panelists also argued that open source developers should “lead by example by documenting their responsible practices in a way that enables others to learn from and build on.”

Conclusion

Finally, panelists made the following recommendations to help navigate these challenges and advance the democratization of open source AI:

  • Championing openness: Open source AI developers should champion open development practices and licenses that enable the use, study, modification, and sharing of AI models, code, data, and documentation. 
  • Fostering global collaboration in AI: Vendor-neutral foundations play a key role in facilitating collaboration among global developers and enterprises on open source tools and open standards across the entire AI stack, from infrastructure to applications. 
  • The open source AI promise for digital sovereignty: Governments should embrace open source AI as a strategic tool to foster research, innovation, and security in AI while mitigating the fragmentation of the global open source ecosystem.
  • Enabling reproducibility and research: AI researchers should establish open standards for reproducibility in AI, including frameworks for AI training processes and data provenance. Greater transparency not only democratizes access to AI but also enables research and applications in the public interest. 
  • Facilitating enterprise adoption of open source AI: Expand open source evaluation frameworks and benchmark suites for testing and monitoring model performance and safety for diverse tasks and real-world contexts across regulated industries.
  • Promoting responsible AI practices: Promote the development and sharing of open source evaluation frameworks, benchmarks, and documentation that facilitate the learning, adoption, and innovation of best practices while providing tools that can help developers comply with regulations.