
By David A. Wheeler
This is part 2 of a 2-part article where I’ll briefly discuss the impact of Artificial Intelligence (AI) on software development. In part 1, I noted that AI use is now the norm, despite frequent errors in AI-generated results, because productivity is king. I also discussed its security implications. In part 2, I’ll continue by offering tips for software developers so they can better use AI, and I’ll also look to the future.
If you missed the first part, read: AI, Software Development, Security, Tips, and the Future (Part 1).
Tips on AI in software development
We, as software developers, are all learning how to use AI. Here are some tips, based on my own personal experience. These aren’t security-specific, though many do help security. AI is not human or sentient, but I’m going to anthropomorphize here because it’s easier to explain things that way.
- Give clear, precise instructions at a higher level. The clearer and more precise your instructions, the more likely the result is what you want, and the less time and money you’ll spend. Consider writing instructions in a separate file rather than directly in an AI prompt; a separate file encourages you to spend more time refining the instructions and makes it easy to ask the AI to review them first.
- For non-trivial tasks (like most software development), start by stating the higher-level goal, providing context and rationale, and asking for its review and best options.
- Don’t demand a specific implementation approach or format before asking it to identify and consider the options. An AI can sometimes suggest an especially good approach, but only when it has the information it needs and is asked for options. For example, I wanted to add a logging mechanism that recorded results in a “simple common format”. I suggested JSON as an example, but didn’t require it. The AI recommended using the JSON Lines (JSONL) format instead. This was a good idea for my use case that I hadn’t considered (a short JSONL sketch follows this list).
- Provide relevant details. AIs will sometimes produce extraordinary results if fed detailed, relevant specifics. For example, AI can sometimes do a great job at a localized security review given a specific file location, a specific concern, and detailed information relevant to the question. This means AI tends to be more effective in an expert’s hands, because experts better anticipate the details that will matter.
- Don’t assert something if you’re unsure of it; AIs are gullible and will often believe what you claim.
- Provide links to key data, or better yet, extracts of the relevant data. Don’t dump lots of irrelevant information on it; that can overwhelm it and impede its ability to determine what’s important.
- When asking it to fix a bug, provide specific information about the bug, such as the triggering input, the source file and line, a test/error report, a picture showing the problem, and a description of the expected behavior.
- Ask it to create a plan/proposal. Have it make a plan, have it review that plan, then review and edit the plan yourself. This isn’t perfect, but the process can be surprisingly helpful.
- Work incrementally, with continuous human review & interaction. AI systems easily go “off the rails”. Work in smaller steps, incrementally, so that you can monitor progress and can quickly redirect work back to a productive path. Check increments into version control, so that you can easily revert.
- Don’t simply tell an AI “you are a security expert…”. Experiments have repeatedly shown that this doesn’t improve the security of the results. Instead, see the OpenSSF “Security-Focused Guide for AI Code Assistant Instructions”.
- Most AI is intellectually lazy. It will tend to prefer the “first thing it can think of”. Force it to instead identify alternatives and carefully examine each one. My solution, where sensible, is to ask the AI to list all plausible options and, for each option, identify its pros and cons, argue for or refute it, and provide supporting information.
- Don’t confuse confidence with competence. AI often delivers confidently wrong answers wrapped in flattery. It’s hard for AI systems to evaluate likelihood (which requires more analysis), and because consumers are a key paying market that rewards confidence and flattery, AI system makers are incentivized to build them that way. Treat AI results with suspicion; don’t trust the system merely because it seems confident or smart.
- Create tests early, including negative tests (tests that verify what shouldn’t be allowed is in fact rejected); see the test sketch after this list. Review tests before accepting them, and be selective; don’t accept them all. Once you accept tests, they will guide the rest of the process, so wrong tests will misguide it. Accepted tests should become part of your CI/CD pipeline that checks each proposed change.
- Fight for simplicity. Work to minimize the code that’s generated. Demand that generated code reuse libraries and methods already in the system. AI systems often generate repeated code fragments, creating a maintenance nightmare due to limited reuse.
- Fight for performance. AI-generated code often recalculates values, even in tight loops. Consider moving calculations out of a loop, precalculating values during program initialization, or “freezing” values as constants (see the loop-hoisting sketch after this list). Computer systems are faster today, but not infinitely fast, and AI systems sometimes generate spectacularly inefficient code. Even a little human time here can significantly improve runtime performance. It also saves development time, because reviewers won’t need to puzzle through odd constructs such as a presumably constant value being recalculated inside a loop.
- Develop and maintain software documentation. AI works better with good documentation, such as inline code documentation, higher-level documentation, and user documentation. If (like most) you lack some documentation, AI can help you write it, but again, review the results. If you’re using AI to create documentation, work bottom-up, so that the AI can maximally build on other documentation.
- Encourage public examples of technology and common conventions. Current AI is built on machine learning (ML), so it performs much better when there are many publicly available examples and standard patterns to learn from. For example, AI tends to be relatively good at Python and Java, as those are popular languages with many public examples. I’ve found AI is really good at Ruby on Rails (RoR), as there are many publicly available examples of RoR code and RoR strongly prefers standardized patterns. AI is even good at Inform 7, a niche programming language for creating interactive fiction, because there are many publicly available examples of such code. In contrast, current AI tends to be relatively poor at COBOL because there are few publicly available COBOL programs for ML systems to train on. If you want AI systems to become better at COBOL, support Zorse, which collects COBOL programs so they can be included in the training sets of foundational models.
- Continuously learn to use AI. Software developers worldwide are trying to determine what these tools are good at, what they aren’t, and how to use them. Continuously learn from your own experiences and from others’.
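To illustrate why JSON Lines fit that logging use case, here’s a minimal Python sketch (the field names and file path are hypothetical, invented for illustration, not taken from my project). Each record is one JSON object on its own line, so the log can be appended to and read back incrementally without rewriting or re-parsing the whole file:

```python
import json
import time

LOG_PATH = "results.jsonl"  # hypothetical path, for illustration only

def log_result(event: str, status: str, **details) -> None:
    """Append one result record as a single JSON Lines entry."""
    record = {"time": time.time(), "event": event, "status": status, **details}
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")  # exactly one JSON object per line

def read_results(path: str = LOG_PATH):
    """Stream records back one line at a time; a corrupted line loses only that record."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

log_result("scan", "ok", files_checked=42)
```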
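Here’s a minimal sketch of what I mean by a negative test, using pytest; the validate_username function is a hypothetical example, not code from any particular project:

```python
import re
import pytest

def validate_username(name: str) -> str:
    """Accept only short alphanumeric usernames; reject everything else."""
    if not re.fullmatch(r"[A-Za-z0-9_]{3,20}", name):
        raise ValueError(f"invalid username: {name!r}")
    return name

def test_accepts_valid_username():
    assert validate_username("alice_01") == "alice_01"

# Negative tests: verify that what shouldn't be allowed is actually rejected.
@pytest.mark.parametrize("bad", ["", "ab", "a" * 21, "alice;drop table", "<script>"])
def test_rejects_invalid_usernames(bad):
    with pytest.raises(ValueError):
        validate_username(bad)
```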
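And here’s a sketch of the loop-hoisting fix I have in mind for performance; the functions and data are invented for illustration:

```python
def total_slow(prices, rate_table):
    """AI-generated style: recomputes a loop-invariant value on every iteration."""
    total = 0.0
    for p in prices:
        rate = sum(rate_table) / len(rate_table)  # same result every time through the loop
        total += p * (1 + rate)
    return total

def total_fast(prices, rate_table):
    """Reviewed version: hoist the invariant calculation out of the loop."""
    rate = sum(rate_table) / len(rate_table)  # computed once
    return sum(p * (1 + rate) for p in prices)

prices = [9.99, 4.50, 120.00]
rate_table = [0.05, 0.07, 0.09]
assert abs(total_slow(prices, rate_table) - total_fast(prices, rate_table)) < 1e-9
```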
I’ve often received excellent results when I followed this guidance. Here’s an example of a decent prompt: “On file F line L, is there a cross-site scripting (XSS) vulnerability? Provide a detailed logical justification for your answer, identifying relevant files and line numbers. Note that the translation call on this line will return a value of type SafeBuffer, not String, because the provided translation key ends in _html.”
AI can perform dramatically better when properly prompted, such as by asking focused questions, demanding evidence, and providing relevant specifics. This helps explain why people report both good and bad results from AI: the quality of AI results often depends on how it’s used.
Current AI doesn’t replace humans in software development; it assists them. Don’t expect too much or too little.
Future
Let’s discuss the future in terms of clarifying terminology, improving AI technology, and AI’s impact on software development. We’ll then ask that you consider playing a part in that future.
Clarifying terminology
In the short term, we have a terminological problem in software development that’s causing significant confusion and risk. Since AI systems frequently make mistakes when generating code, we need to distinguish between two different ways to use AI:
- Using AI without human review or editing.
- Using AI with human review or editing.
Andrej Karpathy proposed the term “vibe coding” for the first case, using AI without human review or editing. Vibe coding in this sense can be great when there are no repercussions for errors, such as when introducing young children to software development. Unfortunately, using AI this way can easily lead to false results and catastrophic losses (such as the loss of a production database). I’m really glad Andrej Karpathy coined the term, because it’s important to have a name for this practice. Merriam-Webster appears to use this definition; their page on vibe coding says the developer “does not need to understand how or why the code works”.
Unfortunately, a few people use the term “vibe coding” to mean any use of AI in software development. We don’t need a term for that, as we already have one – it’s called “software development”. Using AI in some way is the normal case in software development today, so it doesn’t need a special term. In the rare case you need to discuss software development without any generative AI, you could use a phrase like “manual software development” or “software development without AI”, but there’s usually no need to discuss such an unusual situation.
I hope that people will broadly accept Karpathy’s original definition of “vibe coding” so that we have a clear term for using AI to develop software without anyone reviewing or editing the results. Vibe coding (as originally defined) can be great, but it’s also quite dangerous. We need a clear, broadly accepted term for this sometimes-helpful yet dangerous practice.
Improving AI technology
AI systems will get better, but we don’t know by how much or when.
Obvious and widely encountered bugs will be rapidly fixed in AI systems, as long as they don’t require changing fundamental AI technology itself. For example, earlier versions of GitHub Copilot were notorious for often not waiting for commands to complete and instead assuming they were done. This bug made GitHub Copilot look much less capable than it really was. Similarly, older versions of Claude Code constantly generated junk spaces at the end of lines, no matter what they were told. Remember, these are brand-new systems, barely out of the lab, that are suddenly being used in real-world situations. We should expect some teething pains. Those small fixes will continue, making the systems more useful. Celebrate those fixes!
The underlying Large Language Model (LLM) technology will improve. The question is – by how much? Will they become dramatically better? Will they dramatically reduce their likelihood of errors? For example, will they become so good that AI will mostly replace human software developers within the next 5 years?
My guess: no, not in the next 5 years. There are many investments in AI, and those may lead to large payoffs in the long term. However, my guess is that for at least the next 5 years, the improvements will be smaller and incremental. A key problem is that we are running out of human-generated data to train the models. We can increase parameter sizes and context windows, or tweak LLM architectures, but I think that’s likely to yield at most incremental improvements. There’s a lot of current work to make AI more efficient, but that won’t improve its capabilities; it will lower its costs. That doesn’t make AI useless. Improvements are normally incremental, and improvements are improvements!
Today, much isn’t known about securely using AI. The CaMeL architecture is the only truly secure AI design pattern we have, but it only applies in some cases. We need academics to develop new approaches to securely using AI, rather than publishing useless papers on approaches that attackers can trivially subvert. Defenses must work even when attackers know how they work. Too many of today’s AI “defenses” fail this basic criterion by unwisely depending on “security through obscurity”, and such defenses eventually fail. I believe much can be done if many smart people focus on real-world problems. I hope many will.
I’m skeptical that Artificial General Intelligence (AGI) will be created within the next 10 years. No one has any idea how to actually create AGI. LLMs are useful, but they just repeatedly guess the next most likely token (word fragment). That approach is useful but has serious limitations. What’s more, that’s okay. All technologies have limitations. I’m satisfied with limited technology if it’s useful. I just need to understand what it’s good for, what it isn’t, and how to use it well.
AI impact on software development
It’s too late to predict that “software developers will adopt AI during development”. That already happened. Let’s talk about future possible impacts.
AI is going to make it even harder for new programming languages to gain adoption. Even if a new language is mature enough for some use, there won’t be a large training set of example programs, so AI systems will be less capable of generating code in it. Odd as it may seem, expect the dominant programming languages of the next 5 years to be the current ones.
In the short term, I expect AI to increase the volume of exploited software vulnerabilities. Many developers will simply accept what the AI does (basically vibe coding), or review software while not knowing how to look for vulnerabilities (because they don’t realize that review is a key part of the job with AI). Attackers will use AI to find software vulnerabilities and exploit systems with them. The result will be an explosion of software vulnerabilities and their exploitation.
I have hopes that this will be a shorter-term impact. Existing non-AI tools can be used to help find some vulnerabilities; they simply need to be integrated into CI/CD pipelines. AI can also be used itself to search for and fix vulnerabilities. As developers learn that review is a key part of the job, they’ll start to do more of it, and AI can help them do it. One great thing about fixing vulnerabilities is that there’s a finite number in a program; once fixed, attackers can’t exploit vulnerabilities that don’t exist. If software developers start to acknowledge that developing secure software is part of the job, and use AI to do it, then they’ll have some obvious steps to accomplish this. If you develop software, please take our free course Developing Secure Software (LFD121) to learn more.
AI is not eliminating open source software (OSS), but will instead encourage its use. AI-generated code depends on OSS, just like everything else. AIs often make mistakes; even expert humans make mistakes. Pre-existing code that’s been tested, reviewed, and widely used is still more likely to be correct. AI often suggests OSS packages even when the human didn’t think to ask to use them. What’s more, AI can help create and improve OSS, including identifying and fixing vulnerabilities. While AI can be used to send endless garbage proposals to OSS, AI can also be used to review proposals, review code, and develop real improvements. I have reasons to hope AI will be a net positive in the longer term.
A key risk of using AI code assistants is ambiguity regarding the license of the resulting code. An AI may generate code, and in at least the US, that generated code would typically have no copyright. However, if the output reproduces pre-existing code (or is clearly derived from something pre-existing), AI generation doesn’t invalidate the existing copyrights, and the original licenses govern. The Linux Foundation (LF) “Guidance Regarding Use of Generative AI Tools for Open Source Software Development” applies to LF projects, and I think it’s useful guidance for others. It says, “if any pre-existing copyrighted materials… are included in the AI tool’s output… the Contributor should confirm that they have permission… such as the form of an open source license [to] contribute them to the project.” Current AI technology doesn’t have a practical way to track provenance at this detailed level; you could ask an AI system “what is the provenance of this code”, but its answer is not necessarily correct. We also don’t know how to practically extend our existing AI systems to reliably track provenance at this level of detail. It seems wiser instead to develop improved tools to detect when new results are especially likely to constitute copyright infringement. Some tools already do this to some extent, and I’d like to see more of that as a pragmatic solution. I think many organizations should work together to make it easier to detect potential issues.
Improve the future
Modern software development includes the use of AI despite its limitations, such as frequent errors and a lack of context. If AI is used wisely, it can improve productivity and security. The trick is to use it wisely. Everyone involved in software development should help create and promulgate guidance on how to use it effectively, as well as encourage the technology’s maturation.
If you’re interested in the intersection of AI/ML, security, and open source software, please check out the OpenSSF AI/ML Working Group (WG)! This group co-led the development of Secure AI/ML-Driven Software Development (LFEL1012) and is actively working on other important projects. Please join us and help us create a better future.
About the Author
Dr. David A. Wheeler is an expert on developing secure software and on open source software. He created the Open Source Security Foundation (OpenSSF) courses “Developing Secure Software” (LFD121) and “Understanding the EU Cyber Resilience Act (CRA)” (LFEL1001), and is completing creation of the OpenSSF course “Secure AI/ML-Driven Software Development” (LFEL1012). His other contributions include “Fully Countering Trusting Trust through Diverse Double-Compiling (DDC)”. He is the Director of Open Source Supply Chain Security at the Linux Foundation and teaches a graduate course in developing secure software at George Mason University (GMU).