Sustainability

Apr 06

What’s in the SOSS? Podcast #58 – S3E10 Big Thoughts, Open Sources: Beyond the Hype: Brian Fox on Securing the Agentic Future of Open Source

By OpenSSF Podcast

Summary

In this inaugural episode of Big Thoughts, Open Sources, host CRob sits down with Brian Fox, Co-founder and CTO of Sonatype, to discuss the friction between rapid AI adoption and foundational software security. Brian shares insights from the 11th annual State of the Software Supply Chain Report, revealing the emergence of “slop squatting” and the high frequency of AI models recommending non-existent or vulnerable dependencies. The conversation explores how the Model Context Protocol (MCP) could revolutionize developer compliance and why the industry must fund the critical infrastructure supporting our trillion-dollar open source ecosystem.

Conversation Highlights

00:23 – Welcome: Big Thoughts, Open Sources inaugural episode.
01:01 – Brian Fox’s journey: Apache Maven, Sonatype, and OpenSSF.
02:53 – The critical role of Maven Central in the software supply chain.
03:26 – Decades of security trends: The persistent “Log4Shell” pattern.
05:34 – The “Tribal Knowledge” problem for AI agents.
07:06 – State of the Software Supply Chain Report: AI recommending made-up code versions.
08:09 – Explaining “Slop Squatting” and AI hallucinations.
10:03 – Model Context Protocol (MCP): Turning security tools into AI expert systems.
13:42 – Do not ignore 60 years of software engineering “physics”.
15:11 – The “Vulcan Mind Meld”: Injecting governance data into AI agents.
17:19 – Risks, rewards, and the need for ML SecOps discipline.
19:30 – “Inefficient code is still inefficient code”: Lessons from cloud migrations.
21:01 – Building an “AI-native SDLC” with upfront security.
24:18 – The sustainability crisis: Secure open source builds are not free.
27:17 – Conclusion: Funding open source infrastructure (8 trillion dollars of value).

Transcript

CRob (00:23)
Welcome, welcome, welcome to Big Thoughts, Open Sources, the OpenSSF’s new podcast. We’re gonna dive a little more deeply in with some of the amazing community members, subject matter experts, and thought leaders within open source, cybersecurity, and high technology. Today in our inaugural episode, I’m very pleased to welcome a friend of the show, Brian Fox from Sonatype. How you doing, Brian?

Brian Fox (00:47)
I’m doing well, how are you?

CRob (00:48)
Excellent, we’re super glad to have you today. So maybe just for our audience members that are unfamiliar with your work, could you maybe talk a little bit about how you got into open source and kind of what you specialize in in this amazing ecosystem?

Brian Fox (01:01)
How I got into open source, that’s a long conversation. Geez, all the way back in 2002, 2003, I suppose, is when I really, really got involved. I had done some dabbling and some other things before that, but I got involved around that time in Apache Maven. I started writing some plugins.

They’re pretty popular plugins, people still use them these days. And those ultimately got pulled into the Apache project, the official project. I kind of stowawayed and came in as a committer. A few years later, I joined up with some other folks that were also working on Maven and we co-founded Sonatype. It’s been 19 years now.

CRob (1:45)
Wow, that’s awesome.

Brian Fox (1:46)
Yeah, and so then I was ultimately the chair of the Apache Maven project for a long time. I’m still an Apache member of the foundation. And then more recently, even though it’s been a while, what, four or five years now, we joined the OpenSSF. I’ve been on the Governing Board with you for a while. I’m also on the Governing Board of FINOS, which is the financial open source.

And for the last couple years, I’ve also been on the Singapore Monetary Authority’s Cyber Experts Group. Yeah, that’s fun. And so, you know, I’ve spent a lot of time focused on those things. One of the things that Sonatype does for the community is we run Maven Central, right? Which, for people that don’t know, is where all the world’s open source Java components come from.

CRob

Yeah, it kind of sounds like a big job.

Brian Fox

It is a big job, running critical infrastructure for all that kind of stuff. And so, you know, over the years that’s given us really interesting insights into what’s going on with the supply chain. So, you know, that’s sort of what led us to the path that brought me to OpenSSF and all those other things.

CRob (2:53)
Yeah, you and your team have been amazing participants and contributors to our community and just kind of even putting aside all the work with Maven. Just your kind of participation in our working groups and our efforts has been amazing. Yeah, thank you. So today I think you wanted to talk about a topic a lot of people probably haven’t heard about. This little thing called AI. I have a hard time spelling that.

Brian Fox (3:16)
Right?

CRob (3:20)
Let’s just set the stage. What are you thinking about? What do we want to have a conversation around AI about?

Brian Fox (3:26)
Yeah, I think so if we back up a little bit, right? So it was probably around 2011, 2012, I suppose. We started looking at some of the trends that we were seeing within the Maven Central downloads. We were seeing things like the most popular version of a crypto library was the one that had a level 10 vulnerability, fixed and patched years before, but everybody was still using the vulnerable version. The Log4j, Log4Shell pattern has existed basically forever. It’s not actually new.

And so that led us down the path to start doing different things to help our customers, A, understand what open source they were using. Way back then, nobody knew. They were like, we’re not using open source. What they really meant was, I don’t think I’m using Linux or OpenOffice. They didn’t understand that their developers were pulling in all these components. And so the problem space back then was helping them have visibility and then providing data and controls to help them better govern their choices. So we’ve always been trying to help expedite and make it more efficient for developers to make better choices. And so it’s interesting to see this development of AI and all of the kind of things that have come along with it. So that got me thinking, you know, what?

When we started out to build some of the stuff that we built for our customers, my focus at that time was to make it possible to do the analysis in real time so that it wasn’t the case that, we’re just going to do all our stuff and then we’re going to run a compliance scan at the end of the week or end of the month or something. So we were very focused on, it needs to be able to be run every single build, all the time. We need to be able to provide guidelines so that they don’t have to ask the legal team and wait six weeks for an answer, or the security team, right?

We were trying to capture those roles, or those rules, into the system so that they could make better choices in real time. And that was a big thing that organizations needed to be able to scale and become efficient. When you start dealing with thousands of developers, tens of thousands or millions of applications, the tribal knowledge problem kind of falls apart.

CRob (5:33)
Absolutely.

Brian Fox (5:34)
Right? And so you start thinking about what happens with AI, and if you don’t have that stuff in an automated, you know, coded kind of way, how do you feed that to an agent? The agent’s not hanging out with you at lunch. It doesn’t get an onboarding session where we say things like, you know what, we never use an LGPL dependency because we ship our code. Or, you know, we only fix vulnerabilities five and above. Or, you know, whatever the policy may be, those things sometimes can be shared among developers.
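As a rough illustration of what putting that tribal knowledge into a "coded kind of way" might look like, the sketch below expresses the two example rules from the conversation (no LGPL dependencies, fix vulnerabilities scored five and above) as data that a tool or an agent could evaluate. The `Dependency` type, the field names, and the threshold constants are assumptions made for this example, not part of any Sonatype product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """A hypothetical, minimal description of one dependency."""
    name: str
    version: str
    license: str
    max_cvss: float  # highest known vulnerability score, 0.0 if none


# Illustrative policy values taken from the examples in the conversation.
BANNED_LICENSES = {"LGPL"}
CVSS_FIX_THRESHOLD = 5.0


def policy_violations(dep: Dependency) -> list[str]:
    """Return human-readable policy violations for a single dependency."""
    violations = []
    if dep.license in BANNED_LICENSES:
        violations.append(f"{dep.name}: license {dep.license} is not allowed")
    if dep.max_cvss >= CVSS_FIX_THRESHOLD:
        violations.append(
            f"{dep.name}: vulnerability score {dep.max_cvss} meets the fix threshold"
        )
    return violations


if __name__ == "__main__":
    candidate = Dependency("example-lib", "1.4.0", "LGPL", 7.5)
    for message in policy_violations(candidate):
        print(message)
```

Because the rules live in data rather than in someone's head, the same check could in principle run in a build, in a review, or be handed to an agent as context.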

CRob (6:02)
Right. And it plays into kind of the classic problem with engineering: most engineers I’ve met don’t like doing documentation. And with AI entering the chat room and becoming this accelerant, it’s making decisions based off of knowledge or lack thereof, if you don’t have your security policy documented. It even goes back to thinking about the early days of Kubernetes, where it was a big mental shift for people to have that software-defined network inside. And that helped, I think, a lot of organizations get better discipline and rigor, where you had fewer mysterious outages because the firewall guy in the back end said, I didn’t do anything, but try testing it again.

Brian Fox (6:46)
Right, right. Yeah, for sure. And that’s kind of what we’re seeing now. We’re seeing a lot of that, not just with agents. I mean, agents are sort of like the next big step and not everybody’s fully there yet. Some people are dabbling with it. But even just with AI-assisted coding, you’re seeing the same problem: you come in and you say, hey, I want a new feature. And it just grabs whatever statistically likely dependency is going to be in there. We’ve done some studies. We recently released the State of the Software Supply Chain Report. It’s a great report. Yeah, thank you. This was our 11th year. We just published it last month. And we did a deep dive on AI recommendations, you know, and we found that 30% of the time the models were recommending made-up versions.

CRob (7:35)
What?

Brian Fox (7:36)
Yeah, just making them up. You know, so it’s kind of shocking. In the real world, you know, if you’ve got a tool, that’s one of those things that fails fast, right? Like it picks a version that doesn’t exist, the thing goes and it immediately blows up and then, you know, Claude or whatever you’re using will go, whoops, and it’ll fix it. So it’s kind of funny, burns some tokens, but the downsides aren’t huge.

If the agent randomly picks a terrible dependency or a very old one that does in fact exist, I would argue that’s worse because there’s no fail fast in that scenario.
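A minimal sketch of the distinction Brian draws here, assuming a tool has access to the list of versions actually published for a component. A made-up version is easy to catch because it is simply absent; an old-but-real version needs an explicit staleness rule. The function names, the "one or more major versions behind" rule, and the version list are all invented for illustration.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a simple dotted numeric version such as '3.1.2'."""
    return tuple(int(part) for part in version.split("."))


def assess_version(requested: str, published: list[str]) -> str:
    """Classify an AI-suggested version against the published versions.

    Returns 'nonexistent' if the version was never published, 'outdated'
    if it is at least one major version behind the newest release, and
    'ok' otherwise. The staleness rule is an assumption for this sketch.
    """
    if requested not in published:
        return "nonexistent"
    newest = max(published, key=parse)
    if parse(requested)[0] < parse(newest)[0]:
        return "outdated"
    return "ok"


# Made-up version history for a hypothetical component.
PUBLISHED = ["1.4.0", "2.0.0", "3.1.2"]

if __name__ == "__main__":
    for suggestion in ["9.9.9", "1.4.0", "3.1.2"]:
        print(suggestion, "->", assess_version(suggestion, PUBLISHED))
```

The first case is the one that fails fast on its own; the second is the one that silently succeeds unless something like this check is in the loop.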

CRob (8:09)
Well, you also have the whole problem with slop squatting. Where the models seem to, regardless of what vendor provider you’re getting it from, they seem to fairly consistently suggest the wrong dependencies, kind of like typo squatting.

And so now the bad guys have recognized this kind of fairly consistent behavior and they’re uploading malicious packages with those bad names so that you don’t break the build, because it can find what it needs.

Brian Fox (8:33)
So instead of it failing fast, it fails fast by grabbing a back door or something. Exactly, that’s exactly right. Slop squatting is what they call it now. Yeah, and so those are some of the challenges that we observe, and you kind of take it to the extreme where now you potentially have less sophisticated developers, not classically trained developers, using these tools, and they don’t know what they don’t know.

They wouldn’t necessarily stop to say, hey, I want you to now be a security expert and do an assessment of the code you just created. Like somebody who knows better will do that. But if you’ve not lived through the pain that you and I have lived, you wouldn’t think about that. And so on average, these things are going to potentially toss away a lot of the learnings that we’ve known for so many years.

CRob (09:21)
And that’s been a chronic challenge, trying to get the tribal knowledge instantiated, trying to help people make those right decisions. And the AI tools are amazing productivity and efficiency savers, but they are bringing in, as you said, classically untrained professionals who are not software engineers. They don’t understand how a system should be architected, or they don’t understand kind of the AppSec best practices that help secure the foundation of everything and not let the world fall apart.

Brian Fox (9:59)
The interesting thing is I think they can be if prompted correctly.

CRob (10:02)
Yes.

Brian Fox (10:03)
Right? And that’s where some of the knowledge gap comes in. And I think, what was it, last summer, Anthropic released MCP, the Model Context Protocol, right? I’ve spent a lot of time thinking about that pretty deeply and looking at all the tools. And I wrote about this. I think that there’s a high likelihood that we see a lot of the tools we use in software, in the SDLC today, moving more towards providing their capabilities as subject matter experts in “a thing” to an AI agent via MCP.

So I think that, for me, is pretty exciting for a number of reasons. It means, as a tool vendor, I don’t have to create a plug-in for IntelliJ and one for Eclipse and one for VS Code. As an example, MCP can be the same thing for I don’t care what tool you’re using, because it’s interacting with me via this standard API. And I’m kind of talking to it in more or less English prompts. So my ability to deliver the value that we have into whatever tool you feel like using today, and they change every week, is pretty cool.

And I would also argue that the ability to insert that information and to potentially roll out the root prompting that all of the developers are using in these capabilities is better. You’re going to get potentially better compliance than you do today. One of the things I struggled with forever was we created an IDE plugin for our capabilities that it demoed amazingly well. It showed, hey, this dependency has vulnerabilities, or license, or would make recommendations. It was great. But developers just didn’t want to install more plugins. They just weren’t using it, right?

So while it demoed well, the actual usage of it was very low for compliance reasons. That’s a thing we struggle with. Every tool vendor struggles with that. But if you were able to insert that same information into an MCP capability and the company rolls out a root prompt that says something like, hey, every time you’re choosing a new project or a new dependency or trying to assess a dependency, use this MCP server to get up-to-date real-time information, it’s more or less going to do that every time. Right.
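The pattern Brian describes (a company-wide root prompt that tells the agent to consult a governance tool whenever it chooses a dependency) can be sketched in plain Python. This is deliberately not the MCP wire format and does not use any MCP SDK; the request shape, the `assess_dependency` tool, the `ROOT_PROMPT` wording, and the governance data are all hypothetical, meant only to show the shape of the idea.

```python
import json

# Hypothetical instruction an organization might roll out to every agent.
ROOT_PROMPT = (
    "Every time you choose or assess a dependency, call the "
    "assess_dependency tool to get up-to-date information first."
)

# Made-up governance data standing in for a real, continuously updated source.
GOVERNANCE_DATA = {
    "example-lib": {
        "recommended_version": "3.1.2",
        "known_vulnerable": ["1.4.0"],
    },
}


def assess_dependency(name: str, version: str) -> dict:
    """Answer one question from the agent about a single dependency."""
    record = GOVERNANCE_DATA.get(name)
    if record is None:
        return {"status": "unknown", "advice": "No data; ask a human reviewer."}
    if version in record["known_vulnerable"]:
        advice = f"Use {record['recommended_version']} instead."
        return {"status": "blocked", "advice": advice}
    return {"status": "allowed", "advice": "No known policy issues."}


# Registry of tools the agent is allowed to call.
TOOLS = {"assess_dependency": assess_dependency}


def handle_tool_call(request_json: str) -> str:
    """Dispatch a simplified tool-call request and return a JSON reply."""
    request = json.loads(request_json)
    tool = TOOLS[request["tool"]]
    return json.dumps(tool(**request["arguments"]))


if __name__ == "__main__":
    call = {
        "tool": "assess_dependency",
        "arguments": {"name": "example-lib", "version": "1.4.0"},
    }
    print(handle_tool_call(json.dumps(call)))
```

The point of the design is the one made in the conversation: the tool vendor implements the capability once, behind a standard interface, and whichever editor or agent the developer happens to use can reach it.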

CRob (12:13)
Yeah, and I think back to when I was a baby cyber person studying for my CISSP, there was a lot of talk in the exam materials about expert systems, which is exactly what I think a best-case scenario with these tools can be. It’s your expert. I don’t have to necessarily have this expertise. That’s right. But thinking about it, it takes a lot of knowledge to craft these expert systems. Let’s talk about how some of these models have been trained on potentially less than expert data.

Brian Fox (12:43)
Right, and that’s just, I think, the inevitable nature that the frontier models have been trained on, you know, all the stuff that they can find.

CRob (12:49)
The internet.

Brian Fox (12:49)
On the internet, good information, bad information. People talking about terrible dependencies a lot might statistically make that more of a recommendation, right? And I think that can be okay as long as you’re plugging in the models that have real data. The things that we’ve seen, you know, when we assess the models is that, like I said, they make up versions, they pick old versions arbitrarily, they don’t know about anything newer than when they were last trained, which means both new versions and also vulnerabilities that might be in an older version.

So they’re inadvertently recommending, and it’s not even a recommendation really, it’s just using it, right? It’s putting it in there and writing the code around it. Imagine picking Spring, right? It’s just going to go, I’m going to write a Spring app and I’m going to use all Spring 5.

And then when you probe it, then it’s like, oh, sorry, I have to do two major framework updates. You almost have to throw it away and start over. And so if you’re able to plug the right data in up front, you don’t have all of that waste. And again, if you have people who don’t know to prompt it to ask about the latest versions, you can insert that underneath the hood. I think that’s what’s really cool.

Brian Fox (13:59)
But what we’re seeing currently, I kind of wrote about this a little bit too, that I feel like we’re throwing out all the lessons of the past. We’re talking about situations where whole tools, SAST is under fire right now, right? Because when all the code can be just completely generated, what’s the problem with SAST? But I do think that we’ve learned a lot of things over the years if we can figure out how to plug those capabilities into what’s being generated.

I think we can bring all of that forward with us. But the entire SDLC is going to have to adapt to that. It’s not going to be sufficient to say, I’ve got a bunch of developers over here. They’re doing AI-assisted development. And then later, we’re going to run a bunch of SAST and produce legacy reports. That’s not going to work. The information has to be fed directly into the AI capabilities up front.

CRob (14:51)
And it’s the classic problem we’ve always had, where security historically is the last thing done, addressed; it’s bolted on at the end in a lot of cases. And just this AI tooling and just the velocity it has is a huge accelerant for the sins of our past we’ve never actually addressed.

Brian Fox (15:11)
Absolutely, but it also provides the Vulcan mind meld if you want to think of it. You now have that opportunity to plug that right into what the agent is thinking about in the moment. You can’t do that with the humans, but you can do that with the agent. And that’s what I think is potentially exciting about this.

As I described it recently at a summit, we’re sort of in a bootstrap situation, though, right? Like, we don’t have all of those capabilities. Organizations haven’t rolled them all out. And so we’re sort of in this weird situation, one foot on the boat, one foot on the dock, and it’s not going to end well as we’re going through it. And worse, there are people that are afraid of the MCP protocol. So I hear lots of organizations say, we just block it completely.

Yeah. It’s a little hard to argue that that’s not a reasonable place to start because of the nature of what’s happening. We saw just the other day the latest version of Shai-Hulud came out. Did you see this? They used MCP capabilities for data exfiltration. And I’m like, come on, guys. There’s so much power in this, but now you’re making it like a bad thing. So I think the industry and the tools and all of that are going to have to work through governance of the MCP capabilities, and sanitization and inspection of the MCP capabilities, just like we’ve seen before. It’s one of these things where, when you’ve been around long enough, you can recognize the patterns. It’s new and exciting, but the pattern also rhymes with a bunch of stuff we’ve done before. What frustrates me is that everybody charges ahead so fast, and they just feel like it’s all new, it’s all different. It’s like, yes, but let’s not forget everything we’ve learned over the last 60 years of software engineering, because the physics is still the same.

CRob (16:50)
Well, and that’s where our AI/ML working group wrote a paper around ML SecOps. The paper was really interesting. I recommend the audience check it out. They talked about classic techniques that are helpful with AI development. And then it talked about some gaps, where we have things like undocumented policies that are kind of a hindrance, and that’s an opportunity in the future to try to get addressed.

Brian Fox (17:18)
Yeah.

CRob (17:19)
But…I’m of two minds about my friends, our new robot overlords, in that it can be extremely helpful, but I don’t see a lot of people reconsidering those lessons of the past of software engineering. To say this is all brand new and totally different, like, well, you’ve got different GPU accelerators and dedicated cores to do things.

And now with this like agentic and ADA architecture where things are more highly distributed, yeah, that’s new twists, but it’s not brand new. We’ve done networking. We’ve done composite applications for decades.

Brian Fox (18:01)
Right. It’s the same thing, you know, we saw when, you know, we were like, oh, everything should be serverless, or let’s go to the microservice architecture, that’s going to solve everything, until it doesn’t. Right.

Or, you know, that’s no problem, we’ll just put it in the cloud because I can just infinitely scale my machines, right? So I see the same pattern all over again, that we sort of say, yes, but this time is different because insert new technology, and then we realize, yes, but everything we know is still true. And that’s what I think we’re sort of grappling with right now as we go through this. What is absolutely true is that, you know, the AI capabilities, the agents, all these things are making everything happen so much faster. That can be good. It can also be bad. If you’ve forgotten all the lessons of the past, you’re just going to create a ton of crap much faster than you could before. And by the time you realize it, it might be too late.

CRob (18:57)
I’m familiar with a lot of enterprises that were going through a digital transformation journey, trying to update their heritage software to newer things and to the cloud to get that scalability and cost efficiency. But a lot of organizations didn’t take that journey, didn’t learn from the lessons of the past.

They just crammed what they had out in the cloud, and then a month later they get this giant bill and they’re shocked and confused, or they didn’t understand that this thing wasn’t architected for zero trust, and they’re leaking data everywhere.

Brian Fox (19:30)
Right, right. Or that, or just even the performance reasons. You were excited to infinitely scale, sure, but somebody’s not excited to infinitely pay a bigger bill. Inefficient code is still inefficient code, right? And that’s what I think we’re gonna see with AI capabilities; it’s just going to happen faster. And without humans in the loop, it provides fewer opportunities for us to course correct, which is why I’ve been taking a step back and thinking about how do we do that? How does it make sense? I think for some of the stuff that we’ve been doing as a business, it’s really exciting because we have built up really interesting, unique data sets based on being able to see everything going on with Maven Central. We’ve long had Nexus, the repository manager that’s out there.

We have hundreds of thousands of instances. Those things are proxying for enterprises, not just Maven, but npm, NuGet, Python, all the things. And so that gives us visibility into other ecosystems so we can understand what’s going on, what’s commonly used in enterprises, these kinds of things. And so all of that data can be fed now directly, like I said, the Vulcan mind meld, directly into these tools. And it makes it a lot easier.

So in some ways, when we sort this out, and people become less afraid of MCP capabilities, we can directly inject a stream of high quality data to make all of those things better. But, before businesses can really leverage that, they have to get out of the experimentation phase. They have to roll that out. And these things are kind of interrelated. What we see is that organizations are afraid to let developers just go with AI assisted development because it’s not governed, because they can’t govern it.

And those are echoes of what I saw firsthand during the early days of open source. Like I said in the beginning, people said, we’re not using it. And then I’d tell them, yes, you are. And then their reaction was like, well, just shut it all out. It’s like, right, you can’t do anything. So the reaction that some enterprises have right now of like, we’re just not going to do anything with AI, is just setting themselves up to be left behind.

The right answer is to do it thoughtfully and use tools to help them make better decisions.

CRob (20:43)
So reflecting back, you mentioned in your report that you have some guidance for people around AI. What would be the top two or three things, if somebody’s thinking about moving more aggressively in this AI direction, that they can take away and do immediately or start thinking about?

Brian Fox (21:01)
Yeah, I mean, I think the biggest thing is humans like to try to take the old patterns and just adapt them to the new technologies. Like we were talking about: take an inefficient architecture and throw it in a cloud, it’s gonna fix everything. No, it’s not. And I think that’s true of, let’s call it the AI SDLC, right? An AI-native SDLC might resemble a normal SDLC, but it should be designed differently, right?

You know, trying to do the checks and balances after the fact is even worse than it was with humans. You need to think about providing that information upfront so that you get the value in the creation of the code and not try to chase it out. You need to be able to think about how all of these things can be done in parallel with agents, breaking these things down. What I would say is, don’t just try to do what you’re doing today and use AI to do it. Take a step back and really assess how you can adopt this.

It’s sort of like the conversations we were having in the board today about developers, maintainers are getting overwhelmed with AI slop. It’s true. A reaction is to stop allowing that to be contributed, just dismiss everything AI. That’s not a good answer. A better answer is let’s figure out how to help them use AI tools to be able to keep up with that, right? Because that’s what it’s good for. It can review and assess the patches faster than the maintainers and then provide sort of a first pass filter, if you will, right?

But that requires thinking outside of the box. Don’t just try to keep doing what you’re doing and try to keep up with it. Think about how you judo move that into something that makes more sense for your organization.

CRob (23:42)
And this skirts along another kind of project you’re passionate about, sustainability and funding. It is one thing to try to admonish the developers, why aren’t you using AI? But there are real costs involved around this. And just to say, well, you should use the tool, that doesn’t help them when there’s no funding. They don’t have access to infrastructure to be able to do these things. And that, I think, touches on your passion project around trying to help get the package repositories more sustainably funded.

Brian Fox (24:18)
That’s right. Yeah, I mean, if you take a step back and you think about open source when probably you started, certainly when I started, what that really meant was you were donating your time. And you were sharing your thoughts, and you were sharing your words via code. And that was in a time when it was perfectly acceptable. In fact, it was the only choice that you built things and you shipped them off of your laptop.

There was the time when the Apple MacBook Air launched, the first one. That launched with a version of Maven on it that was signed by my key, my personal key, that was built on my personal laptop. So everybody that bought the launch version of the MacBook Air had my signed code on it. That’s kind of cool.

But also kind of scary, right, when you think about it. Like, what if my laptop was compromised? And that’s the world we live in today. Fortunately, in 2009 or whatever it was, that was a little bit more remote of a chance. And so everybody thinks like, well, that’s crazy. You wouldn’t do that anymore. So what does it mean today? It means you have to have certified builds. Usually that means it’s running in the cloud, and it’s attested to, and all these kinds of things. And that’s not free.

Like, I can’t donate that, I’m not a hyperscaler. Most open source gets that infrastructure donated by these big companies, but there’s a lot of opportunity for abuse, right? And these types of things. It’s just, at the end of the day, it’s not free. So the cost of producing open source is not free anymore. It’s not just donating my time with equipment and internet access I already have, right? That’s the big difference. And I think people don’t really recognize that. And now fast forward to what we were just talking about with AI: the obvious answer to deal with, you know, AI piled-on PRs is to have AI assistance help.

Who’s gonna pay for that? It’s literally not free. It costs electricity. Last time I checked, we still pay for our electricity, regardless.

CRob (26:12)
Electricity, water…

Brian Fox (26:14)
Right, all of these things, right? They have very real implications. They’re just not free. And so there’s no good answer to that. How does that get aligned? How do we continue to create open source software that can power all of these industries in a world where it’s not just somebody donating their time and thoughts? There are no good answers. But we’re working towards trying to align that. Because the bulk of open source software, certainly in our world, in these areas, is being consumed by organizations that are selling for-profit software, more or less.

There’s definitely a lot of hobbyists and stuff like that, but the biggest consumers from our repositories are all the giant companies. If I named the top 100, you would know every single one of them. And I’m sure that’s true for all the registries. So there has to be an answer in there. I don’t know the stat off the top of my head, but the Linux Foundation does the census, right? And it’s billions of dollars of economic value that open source creates. Eight billion? Nine billion?

CRob (27:16)
Trillion.

Brian Fox (27:17)
Oh, it’s a trillion now?

CRob (27:17)
It’s eight (8) trillion, I believe.

Brian Fox
Eight (8) trillion dollars worth of economic value being produced by open source… 1% of that would pay for a lot of that infrastructure, and then a whole bunch more. And so I think that’s what ultimately we have to figure out how to balance. AI just makes that worse, because it moves the bar even further.
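For a sense of scale, the arithmetic behind that remark can be worked out directly. The sketch below takes the eight trillion dollar figure as quoted in the conversation and computes one percent of it; it says nothing about what the infrastructure actually costs, which the conversation does not state.

```python
# Economic value figure as quoted in the conversation (US dollars).
ECONOMIC_VALUE_USD = 8 * 10**12

# One percent of that value, using integer arithmetic to avoid rounding.
ONE_PERCENT_USD = ECONOMIC_VALUE_USD // 100

if __name__ == "__main__":
    print(f"1% of ${ECONOMIC_VALUE_USD:,} is ${ONE_PERCENT_USD:,}")
```

One percent of eight trillion dollars is eighty billion dollars.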

CRob (27:39)
Interesting conversation. Any final thoughts you want our listeners and viewers to take away?

Brian Fox (27:46)
Well, certainly go take a look at The State of the Software Supply Chain Report.

CRob (27:51)
Great report.

Brian Fox (27:52)
sonatype.com/SSCR. Certainly, I’ve also written a number of blogs. You can find those at our website as well. Those deep dive into kind of all these topics we touched on here. Yeah.

CRob (28:02)
Excellent. We’ll put some links as we do our summary. So Brian, thank you for our inaugural episode of Big Thoughts, Open Sources. I think this was an amazing conversation that we’re gonna continue to be adding onto and reconsidering in the coming weeks and months.

Brian Fox (28:20)
Yeah, thanks for having me kick it off in Napa.

CRob (28:25)
Thank you. Well, I hope everybody stays cyber safe and sound. We’ll talk to you soon.

Sep 23

Open Infrastructure is Not Free: A Joint Statement on Sustainable Stewardship

By OpenSSF Blog

An Open Letter from the Stewards of Public Open Source Infrastructure

Over the past two decades, open source has revolutionized the way software is developed. Every modern application, whether written in Java, JavaScript, Python, Rust, PHP, or beyond, depends on public package registries like Maven Central, PyPI, crates.io, Packagist and open-vsx to retrieve, share, and validate dependencies. These registries have become foundational digital infrastructure – not just for open source, but for the global software supply chain.

Beyond package registries, open source projects also rely on essential systems for building, testing, analyzing, deploying, and distributing software. These also include content delivery networks (CDNs) that offer global reach and performance at scale, along with donated (usually cloud) computing power and storage to support them.

And yet, for all their importance, most of these systems operate under a dangerously fragile premise: They are often maintained, operated, and funded in ways that rely on goodwill, rather than mechanisms that align responsibility with usage.

Despite serving billions (perhaps even trillions) of downloads each month (largely driven by commercial-scale consumption), many of these services are funded by a small group of benefactors. Sometimes they are supported by commercial vendors, such as Sonatype (Maven Central), GitHub (npm) or Microsoft (NuGet). At other times, they are supported by nonprofit foundations that rely on grants, donations, and sponsorships to cover their maintenance, operation, and staffing.

Regardless of the operating model, the pattern remains the same: a small number of organizations absorb the majority of infrastructure costs, while the overwhelming majority of large-scale users, including commercial entities that generate demand and extract economic value, consume these services without contributing to their sustainability.

Modern Expectations, Real Infrastructure

Not long ago, maintaining an open source project meant uploading a tarball from your local machine to a website. Today, expectations are very different:

Dependency resolution and distribution must be fast, reliable, and global.
Publishing must be verifiable, signed, and immutable.
Continuous integration (CI) pipelines expect deterministic builds with zero downtime.
Security tooling expects an immediate response from public registries.
Governments and enterprises demand continuous monitoring, traceability, and auditability of systems.
New regulatory requirements, such as the EU Cyber Resilience Act (CRA), are further increasing compliance obligations and documentation demands, adding overhead for already resource-constrained ecosystems.
Infrastructure must be responsive to other types of attacks, such as spam and increased supply chain attacks involving malicious components that need to be removed.

These expectations come with real costs in developer time, bandwidth, computing power, storage, CDN distribution, operations, and emergency response support. Yet, across ecosystems, most organizations that benefit from these services do not contribute financially, leaving a small group of stewards to carry the burden.

Automated CI systems, large-scale dependency scanners, and ephemeral container builds, which are often operated by companies, place enormous strain on infrastructure. These commercial-scale workloads often run without caching, throttling, or even awareness of the strain they impose. The rise of Generative and Agentic AI is driving a further explosion of machine-driven, often wasteful automated usage, compounding the existing challenges.

The illusion of “free and infinite” infrastructure encourages wasteful usage.

Proprietary Software Distribution

In many cases, public registries are now used to distribute not only open source libraries but also proprietary software, often as binaries or software development kits (SDKs) packaged as dependencies. These projects may have an open source license, but they are not functional except as part of a paid product or platform.

For the publisher, this model is efficient. It provides the reliability, performance, and global reach of public infrastructure without having to build or maintain it. In effect, public registries have become free global CDNs for commercial vendors.

We don’t believe this is inherently wrong. In fact, it’s somewhat understandable and speaks to the power of the open source development model. Public registries offer speed, global availability, and a trusted distribution infrastructure already used by their target users, making it sensible for commercial publishers to gravitate toward them. However, it is essential to acknowledge that this was not the original intention of these systems. Open source packaging ecosystems were created to support the distribution of open, community-driven software, not as a general-purpose backend for proprietary product delivery. If these registries are now serving both roles, and doing so at a massive scale, that’s fine. But it also means it’s time to bring expectations and incentives into alignment.

Commercial-scale use without commercial-scale support is unsustainable.

Moving Towards Sustainability

Open source infrastructure cannot be expected to operate indefinitely on unbalanced generosity. The real challenge is creating sustainable funding models that scale with usage, rather than relying on informal and inconsistent support.

There is a difference between:

Operating sustainably, and
Functioning without guardrails, with no meaningful link between usage and responsibility.

Today, that distinction is often blurred. Open source infrastructure, whether backed by companies or community-led foundations, faces rising demands, fueled by enterprise-scale consumption, without reliable mechanisms to scale funding accordingly. Documented examples demonstrate how this imbalance drives ecosystem costs, highlighting the real-world consequences of an illusion that all usage is free and unlimited.

For foundations in particular, this challenge can be especially acute. Many are entrusted with running critical public services, yet must do so through donor funding, grants, and time-limited sponsorships. This makes long-term planning difficult and often limits their ability to invest proactively in staffing, supply chain security, availability, and scalability. Meanwhile, many of these repositories are experiencing exponential growth in demand, while the growth in sponsor support is at best linear, posing a challenge to the financial stability of the nonprofit organizations managing them.
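To see why exponential demand against linear funding is a structural problem and not a temporary one, consider a minimal sketch with purely hypothetical, normalized numbers (they are not drawn from any registry's actual budget). It assumes infrastructure cost grows by a fixed percentage each year while funding grows by a fixed amount, and reports the first year in which cost overtakes funding.

```python
def first_shortfall_year(cost, funding, growth_rate, funding_step, max_years=50):
    """Return the first year in which cost exceeds funding, or -1 if it never does.

    All inputs are illustrative assumptions:
      cost          starting annual infrastructure cost (normalized units)
      funding       starting annual funding (same units)
      growth_rate   yearly multiplicative growth in cost (0.3 means +30% per year)
      funding_step  fixed yearly increase in funding
    """
    for year in range(1, max_years + 1):
        cost *= 1 + growth_rate      # exponential: grows in proportion to itself
        funding += funding_step      # linear: grows by a constant amount
        if cost > funding:
            return year
    return -1


# Hypothetical scenario: funding starts at double the cost, yet the
# surplus is gone within a few years.
print(first_shortfall_year(cost=1.0, funding=2.0, growth_rate=0.3, funding_step=0.2))  # 4
```

Under these assumed figures, even a 100% initial surplus is exhausted by year four, and the gap keeps widening afterward. Changing the numbers moves the crossover year, but any positive growth rate eventually outruns a fixed yearly increase.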

At the same time, the long-standing challenge of maintainer funding remains unresolved. Despite years of experiments and well-intentioned initiatives, most maintainers of critical projects still receive little or no sustained support, leaving them to shoulder enormous responsibility in their personal time. In many cases, these same underfunded projects are supported by the very foundations already carrying the burden of infrastructure costs. In others, scarce funds are diverted to cover the operational and staffing needs of the infrastructure itself.

If we were able to bring greater balance and alignment between usage and funding of open source infrastructure, it would not only strengthen the resilience of the systems we all depend on, but it would also free up existing investments, giving foundations more room to directly support the maintainers who form the backbone of open source.

Billion-dollar ecosystems cannot stand on foundations built of goodwill and unpaid weekends.

What Needs to Change

It is time to adopt practical and sustainable approaches that better align usage with costs. While each ecosystem will adopt the approaches that make the most sense in its own context, the need for action is universal. These are the areas where action should be investigated:

Commercial and institutional partnerships that help fund infrastructure in proportion to usage or in exchange for strategic benefits.
Tiered access models that maintain openness for general and individual use while providing scaled performance or reliability options for high-volume consumers.
Value-added capabilities that commercial entities might find valuable, such as usage statistics.

These are not radical ideas. They are practical, commonsense measures already used in other shared systems, such as Internet bandwidth and cloud computing. They keep open infrastructure accessible while promoting responsibility at scale.

Sustainability is not about closing access; it’s about keeping the doors open and investing for the future.

This Is a Shared Resource and a Shared Responsibility

We are proud to operate the infrastructure and systems that power the open source ecosystem and modern software development. These systems serve developers in every field, across every industry, and in every region of the world.

But their sustainability cannot continue to rely solely on a small group of donors or silent benefactors. We must shift from a culture of invisible dependence to one of balanced and aligned investments.

This is not (yet) a crisis. But it is a critical inflection point.

If we act now to evolve our models, creating room for participation, partnership, and shared responsibility, we can maintain the strength, stability, and accessibility of these systems for everyone.

Without action, the foundation beneath modern software will give way. With action — shared, aligned, and sustained — we can ensure these systems remain strong, secure, and open to all.

How You Can Help

While each ecosystem may adopt different approaches, there are clear ways for organizations and individuals to begin engaging now:

Show Up and Learn: Connect with the foundations and organizations that maintain the infrastructure you depend on. Understand their operational realities, funding models, and needs.
Align Usage with Responsibility: If your organization is a high-volume consumer, review your practices. Implement caching, reduce redundant traffic, and engage with stewards on how you can contribute proportionally.
Build With Care: If you create build tools, frameworks, or security products, consider how your defaults and behaviors impact public infrastructure. Reduce unnecessary requests, make proxy usage easier, and document best practices so your users can minimize their footprint.
Become a Financial Partner: Support foundations and projects directly, through membership, sponsorship, or by employing maintainers. Predictable funding enables proactive investment in security and scalability.
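As an illustration of the "Implement caching, reduce redundant traffic" advice above, here is a minimal sketch of the idea behind a caching proxy. It assumes published artifacts are immutable, so a cached copy never goes stale. The `CachingFetcher` class, the `fake_registry` function, and the artifact names are all hypothetical stand-ins; no real registry is contacted, and a production setup would more likely use a repository manager or a shared CI cache.

```python
from typing import Callable, Dict


class CachingFetcher:
    """Wraps a download function so each artifact is fetched upstream at most once."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_requests = 0  # counts requests that actually reach the registry

    def get(self, artifact: str) -> bytes:
        if artifact not in self._cache:
            self.upstream_requests += 1
            self._cache[artifact] = self._fetch_upstream(artifact)
        return self._cache[artifact]


def fake_registry(artifact: str) -> bytes:
    # Hypothetical stand-in for a network download from a public registry.
    return f"contents of {artifact}".encode()


fetcher = CachingFetcher(fake_registry)

# Simulate 100 CI jobs that each resolve the same three dependencies.
deps = ["lib-a-1.0.jar", "lib-b-2.3.jar", "lib-c-0.9.jar"]
for _ in range(100):
    for dep in deps:
        fetcher.get(dep)

print(fetcher.upstream_requests)  # 3 upstream requests instead of 300
```

In this simulation the cache absorbs 297 of the 300 requests. The same principle applies when ephemeral build containers share a cache or proxy rather than each downloading every dependency from the public registry.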

Awareness is important, but awareness alone is not enough. These systems will only remain sustainable if those who benefit most also share in their support.

What’s Next

This open letter serves as a starting point, not a finish. As stewards of this shared infrastructure, we will continue to work together with foundations, governments, and industry partners to turn principles into practice. Each ecosystem will pursue the models that make sense in its own context, but all share the same direction: aligning responsibility with usage to ensure resilience.

Future changes may take various forms, ranging from new funding partnerships to revised usage policies to expanded collaboration with governments and enterprises. What matters most is that the status quo cannot hold.

We invite you to engage with us in this work: learn from the communities that maintain your dependencies, bring forward ideas, and be prepared for a world where sustainability is not optional but expected.

Signed by

Alpha-Omega

Continuous Delivery Foundation (CDF)

Eclipse Foundation (Open VSX)

OpenJS Foundation

Open Source Security Foundation (OpenSSF)

Packagist (Composer)

Perl and Raku Foundation

Python Software Foundation (PyPI)

Ruby Central

Rust Foundation (crates.io)

Sonatype (Maven Central)

Organizational signatures indicate endorsement by the listed entity. Additional organizations may be added over time.

Acknowledgments: We thank the contributors from the above organizations and the broader community for their review and input.