Summary
Jay White, a leader in the open source ecosystem at Microsoft, discusses his journey into open source, focusing on AI and machine learning. He highlights his role in the Azure office of the CTO, working on open source, security, and AI standards. White emphasizes the importance of model signing and transparency in AI development, mentioning ongoing work in the OpenSSF and Coalition for Secure AI (CoSAI). He encourages community involvement, noting the need for standardization in AI supply chain security and the nuanced challenges of cultural representation in AI models. White also shares his passion for community building and the importance of continuous learning in AI and machine learning.
Conversation Highlights
Introduction & Jay’s Background (00:19)
Jay’s Journey into Open Source (02:29)
AI & Machine Learning Working Group (06:32)
Supply Chain Security & Model Signing (09:17)
Joining & Contributing to Open Source Efforts (13:16)
Challenges and Opportunities in AI Security (15:39)
Building Inclusive & Diverse AI Systems (18:30)
Rapid Fire & Final Thoughts (21:18)
Transcript
Yesenia Yser 00:19
Hola and welcome to What’s in the SOSS, the OpenSSF’s podcast where we talk to interesting people throughout the open source ecosystem, sharing their journeys, experiences and wisdom. I’m Yesenia, one of your hosts, and today we have an inspirational figure, one of my close colleagues and a leader in the open source world. A warm welcome to Jay White. I’m very excited to spend this time exploring artificial intelligence and machine learning in open source. Jay, can you introduce yourself to the audience?
Jay White 00:48
Absolutely. I’m actually really excited to be here. This has been a long time coming. I am Jay White. I work in the Azure Office of the CTO on the open source strategy ecosystem team. I’ve been a member of that team for just over three years, and I have the wonderful opportunity of having some real agency. Man, I work across a lot of different areas, including open source security strategy, supply chain security strategy, AI security strategy, AI standards, AI security standards and more. I do so many different things and wear so many different hats; the context switching I do during the day blows my mind sometimes. I do want to say up front that my views and opinions, as I talk about the great work we do in the open source community and how we do it here, are my own and not those of Microsoft. Like I said, I’m very excited to be here and to talk about my journey and, of course, some of the stuff that’s happening in my world now.
Yesenia Yser 02:09
Thanks, Jay. I’ve heard your origin story a few times on our BEAR community calls; honestly, it’s one of my favorite journeys. Could you take us back to the beginning? What first drew you into the open source world, and how did that path evolve into your current focus on AI and machine learning and all the other hats you wear?
Jay White 02:29
Yeah, so you’ve heard me refer to this journey many times as a love story. I got into open source doing risk assessments: understanding compliance and licensing, the inbound and outbound of open source components, and then understanding different ways to consume open source, especially for organizations that are building software and services. So I got into it from a risk advisory standpoint. I’ve always been a long-time community builder. One of the things I was very successful with early on, even during my days in the military and then afterwards in consulting, was bringing people into a room who wouldn’t speak to each other under normal circumstances, teams that just would not communicate with one another. It’s not that they didn’t know they could; it’s just that their work didn’t cross over. I was able to see how those bridges could be built, bring those people into rooms, get a lot of great ideas and concepts out, and then get pen to paper to bring those ideas and concepts to life. I was brought into Microsoft for that, and that alone, I believe: the ability to build bridges. What I found was a community of individuals who really loved producing cool stuff. And what I came to understand was the business driver that open source was becoming at that time, which we all understand now. People here really love building cool stuff, but I also know how open source communities can be exploited.

Once you see how for-profit organizations get things built in a crowdsourced way, you begin to understand open source as a tool and a mechanism for these kinds of things. How do you maintain its purity while at the same time using all of its different features to get cool stuff built that’s also widely accepted by the industry, while also meeting impact and objectives in alignment with the organization you come from? And then, you know, meeting some cool people, making some cool partnerships and doing some cool stuff. That’s the very short version. Yesenia, you’ve heard me talk about my story at length; it’s really following a bouncing ball, but that’s the net-net on how I got here and why I’m just in love with the work I do and how I do it.
Yesenia Yser 05:53
Yeah, they definitely need to make a movie about your love story, because I’ve heard it a few times and it’s a great one, honestly. If you’re more interested, we’ll figure out which BEAR call it was; there have been a few. And you know, you wear so many hats. You’re part of the open source community, especially in the OpenSSF, and you’ve worked with several different working groups, including BEAR, but you’re very actively involved in the Artificial Intelligence and Machine Learning Working Group within the OpenSSF. Could you share what that group is currently focusing on and how it’s helping shape the open source AI development space?
Jay White 06:32
So right now, underneath that group, we have a special interest group called Model Signing / Model Transparency, and that group developed a very neat tool that does model signing, and does it in real time. Mihai, my co-chair, does a wonderful demonstration of it. That’s one of the things we’re working on there. Another thing we’re working on is MLSecOps, and Sarah Evans is spearheading that work underneath the working group. This is really about evangelizing: taking what we do in DevSecOps and across the supply chain and saying, now let’s focus it on machine learning. What are the aspects there, what are the nuances there that need to be understood as we think about security operations? We consider things like adversarial machine learning; we think about the different attack vectors and so on. How do we mitigate a lot of that through how we develop machine learning systems, how we develop AI systems, and how we use machine learning to develop those AI systems? There are a lot of things spanning from this, by the way. Even beyond the OpenSSF, there are a lot of organizations taking that thought and expanding it outward, so you have different work streams and working groups spinning up to take a look at different spokes of this wheel. But those are the things we’re working on in the AI/ML Security Working Group right now that we’re really trying to flesh out, while we have our tentacles spread out to other organizations to evangelize similar work and make sure that we uplift the work we’ve already done.
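As an aside for readers new to the topic, the core idea behind model signing can be sketched in a few lines. This is a minimal stand-in, not the working group’s actual tool (which uses Sigstore-based signatures rather than the shared-key HMAC used here for brevity); the artifact bytes and key below are hypothetical placeholders:

```python
import hashlib
import hmac

# Stand-in bytes for a serialized model artifact (hypothetical).
model_bytes = b"layer0-weights: 0.12, -0.98, ..."

# Producer side: hash the artifact, then sign the digest. The real
# tooling uses Sigstore signatures; a shared-key HMAC is only a
# minimal stand-in for "a verifiable signature over the model hash".
signing_key = b"demo-key-not-for-production"
digest = hashlib.sha256(model_bytes).hexdigest()
signature = hmac.new(signing_key, digest.encode(), hashlib.sha256).hexdigest()

# Consumer side: recompute the hash and check the signature before
# loading the model, rejecting anything that has been tampered with.
def verify(artifact: bytes, sig: str) -> bool:
    d = hashlib.sha256(artifact).hexdigest()
    expected = hmac.new(signing_key, d.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

In a real workflow the signature is published alongside the model, so consumers can check who built the artifact and that it hasn’t been altered since signing.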
Yesenia Yser 08:25
Yeah, I like that. AI and ML is its own umbrella, with child nodes underneath that branch out into who knows what. Different things keep coming along the way; as we explore down one path, agents are the thing now, and who knows what’s coming next. So it’s cool. What days does the AI and Machine Learning Working Group meet, for folks interested in joining?
Jay White 08:55
Yeah, so every other Monday, right around 10 a.m. I believe our last meeting was last week, so our next meeting won’t be next Monday; next Monday is a holiday, and I’m not sure if the meeting was moved to the Monday afterwards. Please check the OpenSSF calendar, but we usually meet every other Monday at 10 a.m.
Yesenia Yser 09:17
Yeah, join those calls and share your voice and perspective in making AI great across the industry. And you don’t just work with the OpenSSF; you’re also partnering and building momentum within the Coalition for Secure AI, CoSAI, I believe is how it’s pronounced. What are some of the latest developments you can talk about, and what are you most excited to see as we look ahead to the upcoming year?
Jay White 09:44
Wow. So I think I have to start at the beginning. About a year and a half ago, we started down the road of understanding supply chain security for AI systems: how do we standardize some of that? How do we standardize the development and deployment of AI systems? At the time this was an area that was, I won’t say taboo, but people really were like, wait a minute, we don’t want to stifle productivity here. We’re moving at a thousand miles per hour, and standardization could really impact that progress, could really impact that productivity. That was the going conversation at the time. Now we see that standardization could actually be an aid here, so much so that we started CoSAI, and after CoSAI a number of organizations jumped in and said, wait a minute, we have a thought on standards too. One of the things we did in CoSAI was first to understand supply chain security for AI systems. Then we wanted to understand how to prepare the next cyber defender. We have cyber defenders working on [indistinguishable] right now, defending against all types of application security issues, and we have them trained on supply chain threat modeling and all that kind of stuff. And then we had something for AI security risk governance. So we had those three work streams, and we just added a fourth to deal with agentic AI. Right now we’re writing a lot of landscape white papers, and one of the things I’m excited about is the supply chain security work, mainly the model signing and model card data work. These white papers are not out just yet; we wrote them, and they’re currently in review.

They’re fantastic pieces of work, and when they come out, of course, they will be shared. In terms of the work we’re doing right now that I’m especially excited about, it’s our model signing work. And I’m excited about it because I don’t think it just stops there. I think we should think about digital artifact signing in toto, and model signing is one aspect that could help a lot, especially if we do it correctly, with mitigating a lot of the forthcoming vulnerabilities around models being developed. There’s a model being developed, like, every five minutes, and if you’re not signing them: who built it, what’s in it, what data set was it trained on, and so on. To that end, I get excited because we’re even working on data provenance, lineage, tagging and things like that. That’s a whole different technical committee underneath OASIS, which the Coalition for Secure AI is underneath as well. So there’s a lot of work happening in this space that I’m getting excited about, but that’s the work in CoSAI.
Yesenia Yser 12:55
Yeah, I can hear the excitement in your voice. Just like with the AI/ML group, this has spanned out into its own umbrella of different things. If someone is interested, is this something anybody can join in a public forum and be a part of? I’m sure volunteer hands are always useful in things like this.
Jay White 13:16
So inside of CoSAI, anyone can join a work stream underneath the TSC, the Technical Steering Committee. To be part of the TSC itself you probably do have to be a member organization of OASIS, but right now anyone can join a work stream. So if you feel like you want to get in and get to work, go ahead on to the OASIS website. Work stream one, supply chain security for AI systems, is the work stream I co-chair, and it meets every other Wednesday at 9 a.m. On the alternating Wednesdays, model signing meets at 9 a.m. and model card data at 10 a.m., so you can find me every single week on Wednesday at 9 a.m. in one of these meetings, and you can come on and get to work, because some fantastic work is happening. For the people listening, if you haven’t already, I would also encourage getting some light education on AI and ML before you join. That way, when you come in, you have a solid basis of understanding, can actually talk some stuff, and can get plussed up on whatever you’re missing. Me, for one, I went and did two different programs at MIT. It wasn’t like I just stayed at a Holiday Inn Express; I went out to school and stayed in school for quite a while. I got in trouble because I used to go back and forth to Cambridge just to sit in class. And then I also earned a couple of certifications through our company, Microsoft, as well.

So, you know, when you come in, have an open mind, have a baseline level of knowledge you can work from, and then be prepared to get plussed up, because some fantastic work is happening. The people in the room are just super smart. So that’s also an exciting piece, too.
Yesenia Yser 15:39
I’m sure the breadth of knowledge and expertise in that room makes it exciting to enter. And as we move forward into the upcoming years, AI and ML are evolving so quickly. Coursework and certifications are great, but I myself am struggling just to keep up, and I’m sure our listeners are too, especially with the umbrella it’s become. What do you see as the biggest security challenges on the horizon, and what steps can we take now to mitigate these risks before they scale, particularly in the open source space?
Jay White 16:17
One thing I want everyone to remember is that there’s nothing new under the sun. The same vulnerabilities you have with regular supply chain security, the same vulnerabilities you had with the development of systems, are no different than the vulnerabilities you encounter with AI systems. AI systems are just systems built using machine learning: systems built through humans putting together pieces of code and then using data sets to train that code to do a thing. So when you break down the different elements of how an AI system is developed, you can begin to clearly see where there are vulnerabilities. I say there’s nothing new under the sun, but at the same time, some of this stuff can get kind of nuanced once you break it down. Have an open mind that you’re looking at something nuanced based on something pre-existing. I’ll give you an example. One of the things that is not nuanced is prompt injection attacks. Prompt injection attacks are no different than SQL injection attacks, in my mind. SQL injection attacks are old as hell: you figure out a way to trick a SQL database into giving you information you want based on its inability to error check properly, or to check the length of strings, and so on. Prompt injection attacks? No different. You didn’t train your model, you didn’t implement the right controls in your model, enough to withstand certain types of prompts that make it either hallucinate or produce some information. Same, same. Now, what is nuanced?
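As a concrete aside for readers, the SQL injection half of the analogy Jay draws can be sketched in a few lines; the table, values, and input string below are hypothetical illustrations:

```python
import sqlite3

# Toy in-memory database with one secret row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Attacker-controlled input that rewrites the query's logic.
malicious = "nobody' OR '1'='1"

# Vulnerable: string concatenation lets data become part of the query,
# so the input above matches every row and leaks the secret.
unsafe = conn.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Safer: parameterization keeps instructions and data separate, so the
# same input is treated as a literal name and matches nothing.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()
```

The prompt injection parallel is that a model handed untrusted text in its context has no built-in boundary between instructions and data; mitigations aim to reintroduce that separation.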
What is nuanced now is the different cultural representations, even in a small, localized area, depending upon the data a model has been trained on, especially when it comes to natural language processing and things like images. We’re all different kinds of people, who speak differently, think differently, walk, talk and act differently, and now the human element is involved in the communication with the respective system. Sometimes it ain’t even an attack, necessarily; sometimes it’s just us being us, and the system that was developed doesn’t know how to interpret it. By the way, these systems are getting better at this, because there’s more diversity of thought going into how some of these systems are built, but that wasn’t always the case. So the nuance in the vulnerability here is not necessarily something somebody has done maliciously; it’s the lack of diversity of thought that went into how the model was trained, which could elicit a response that’s undesired based on the user using it. That’s nuanced; the how is not really nuanced, you understand what I’m saying? So when you consider the one side versus the other, those are the things I’m looking at. As these systems get more and more advanced, and as the people who develop them get more and more advanced, I would love to see many different communities, with some smart people in them, develop systems that complement their respective communities. You don’t want to segregate.

You do want to integrate, but it’s only through getting these systems built by these respective communities that two communities can join together, integrate systems and build something that’s usable across the different populations and elements in a community. My thoughts.
Yesenia Yser 21:01
I love it. Bring communities together, build a bigger community, and we’ll solve these problems at the end of the day. It’s solving a whole global issue, especially when it comes to open source. Well, that’s that part of the interview. We’re going to move on to the rapid fire. Pew, pew, pew, pew, pew, pew.
Jay White 21:22
Oh, before we move on, I do have to say something I promised I would do, and I should have done it at the beginning. Please understand that my thoughts and opinions are mine and mine alone, and not those of the organization I represent, Microsoft, nor of the organizations I participate in, OASIS and the OpenSSF, although I am talking about those organizations when I come up with these fantastic thoughts and ideas. I wanted to say that before we continue.
Yesenia Yser 21:54
Thank you, Jay. They are fantastic thoughts; I love it. It’s always interesting to hear your perspectives on the different areas, especially the areas you’re in. So let’s move forward to rapid fire. First question: UFC or boxing?
Jay White 22:13
UFC
Yesenia Yser 22:15
All right, I had to ask. You had it coming.
Jay White 22:20
I saw that question coming from you. You already know.
Yesenia Yser 22:24
I already know, but our listeners don’t. Our listeners don’t know. Dark mode or light mode?
Jay White 22:31
Dark mode.
Yesenia Yser 22:33
In Brazilian Jiu-Jitsu: Gi or No-Gi?
Jay White 22:37
Oh, come on now. That really depends. No-Gi.
Yesenia Yser 22:41
Trying to cause chaos, you know. Early bird or night owl?
Jay White 22:45
Night owl. Well, a night owl who gets up at 3:30 in the morning. How about that?
Yesenia Yser 22:52
A little blend. Sweets or sour?
Jay White 22:55
Sweet.
Yesenia Yser 22:56
All right, and this one’s the curveball question: JIF or GIF?
Jay White 23:04
Oh, wow. I saw that one coming. It’s GIF, and it’ll always be GIF, because the first word in it is graphics. I don’t understand why people even say JIF.
Yesenia Yser 23:23
Potato, tomato, you know. And there you have it, folks, just another round of rapid, chaotic fire. Thank you for working through that. Any last-minute advice or thoughts for the audience?
Jay White 23:38
Get involved. Get involved and stay involved. You know, everyone’s time gets limited, but do your best and get involved. This community is wonderful, and it’s ever growing. So it doesn’t matter where you get involved, just get involved, and I guarantee you’re going to cross paths with some amazing people. Network. These are uncharted times, and these times are rough, so network and all that kind of stuff, and make sure that you stay in a place where people can see you.
Yesenia Yser 24:24
Yeah, your network is your net worth. Jay, as always, I appreciate your time. Thank you so much for everything you do in our communities and for making sure our communities are communicating and moving toward more positive outcomes. Thank you to our community of contributors for driving all these projects forward, for those who could not speak today, and I look forward to seeing what is done in the future. Thank you. Thank you all very much.
Outro 24:51
Like what you’re hearing? Be sure to subscribe to What’s in the SOSS on Spotify, Apple Podcasts, AntennaPod, Pocket Casts, or wherever you get your podcasts. There’s a lot going on with the OpenSSF and many ways to stay on top of it all. Check out the newsletter for open source news, upcoming events and other happenings; go to openssf.org/newsletter to subscribe. Connect with us on LinkedIn for the most up-to-date OpenSSF news and insight, and be a part of the OpenSSF community at openssf.org/get-involved. Thanks for listening, and we’ll talk to you next time on What’s in the SOSS.