WASHINGTON — For years, Pentagon leaders have argued that cybersecurity, like missile defense, was a natural place to start using artificial intelligence: high-speed, high-stakes, with too much data coming in too fast for a human mind to comprehend. But, amidst the current AI boom, have algorithms materialized that can help cybersecurity today?
“So far, not really,” lamented David McKeown, the Pentagon’s senior information security officer and deputy CIO for cybersecurity, when the question came up at an AFCEA TechNet Emergence panel Monday. “I’m disappointed.”
“I’ve been searching for use cases where AI is being used to do cybersecurity things and, so far, I’m not seeing too many,” McKeown continued.
The absence is palpable in industry, he said. “I took a group of CISOs to AWS [Amazon Web Services] on Friday, [and] they haven’t developed anything that I can tell.”
The absence is palpable in the Pentagon, too.
“Inside the building, we have [a] CDAO that’s in charge of our data and our AI future,” he went on, referring to the Defense Department’s Chief Data and Artificial Intelligence Office created in 2021. “They’re focusing on improving senior-level decision-quality data, they’re focused on warfighter missions where they’re leveraging AI, it’s being used for optimizing the maintenance cycles of bomber aircraft and saving us billions of dollars — but I am in search of ways we can leverage AI that are on the cybersecurity front.”
So, McKeown told the contractor-heavy audience at AFCEA, “if you happen to know of any products out there that are leveraging AI to do cybersecurity things, I’d love to talk to you. This is a key part of our zero-trust strategy.”
On the sidelines of the conference, McKeown wanted to make clear that his public speech was a “data call” asking for industry to offer AI options, and not a dismissal of the tech. “Don’t tell people I’m not going into AI, [because] I love AI. I want lots of AI,” he told Breaking Defense.
With such a strong demand signal from the Pentagon, and with cyberattacks growing in scale, frequency, and sophistication, why has no one built a good-enough AI defense? In brief: because it’s hard.
Industry is trying, said Kyle Fox, chief technology officer at SOSi, which helps defend the Pentagon’s multinational Mission Partner Environment. He said his team is using AI tools already for at least some aspects of cybersecurity.
“It is absolutely early days and there’s not a lot of commercial turnkey solutions,” he told the AFCEA panel audience, “but I really urge everyone to experiment in this space.”
Why is cybersecurity AI so far behind other areas like large language models and imagery analysis? It turns out that the kind of machine-learning algorithms that can crunch consumer data and scour public websites to generate creepily accurate product recommendations, human-sounding (if insipid) text or photorealistic people with too many fingers have a lot more trouble mastering the complexities and subtleties of cyber defense.
For one thing, cyber defense isn’t just defending cyberspace anymore, because software is increasingly used to run physical systems. And while algorithms are great at digesting digital data, AIs still struggle to understand the messy, analog world of physical objects, machines, and infrastructure and how they interact — for example, in the complex interconnections of a weapons system or a civilian power grid.
“We tend to look at software as something that can be analyzed on its own,” said Emily Frye, a cyber expert at the think tank MITRE, speaking on the AFCEA panel alongside McKeown. “Software and hardware are often — not always — inextricably intertwined.”
“Software … it’s in every piece of critical infrastructure the United States operates,” added Fox, who spent 12 years as a software engineer in the US Air Force. That means all sorts of mundane but vital machinery can be hacked. “This problem is getting harder, not easier, [and] we’re not winning … We’re actually kind of losing right now.”
Small, specialized software programs known as firmware have long been embedded in industrial and consumer devices, from pipeline control to baby monitors. The movement to computerize ordinary objects and wirelessly network them into a massive “internet of things” puts software inside hardware of every kind — and that software may come with unintended bugs or worse, intentional backdoors.
Finding all these subtle faults, concealed in countless lines of code, is precisely the kind of massive, detail-oriented, brain-breaking task that AI optimists say their algorithms can help human analysts handle. But, as McKeown and his fellow panelists warned, the AI has a long way to go.
The Call Is Coming From Inside The House
Teaching AI to do cybersecurity keeps getting harder because best practice has shifted dramatically, both on the defense and on the attack. Once, cybersecurity meant perimeter defense: firewalls, passwords and automated virus filters. But it soon proved impossible to stop every attacker at the firewall, which meant smart cyber defenders had to assume a smart enemy was already inside the gates, lurking somewhere on their networks, and that every user had to be watched for signs their account was compromised. The cybersecurity industry calls this approach a zero-trust defense.
Attackers, meanwhile, increasingly stole real users’ credentials — often obtained through phishing emails and other “social engineering” tricks that rely on human error, not defeating a machine — and then “lived off the land” by exploiting the software already on the target network, rather than uploading their own code. Instead of searching for readily identifiable malware and other foreign code, defenders now had to search for anomalous behavior by seemingly legitimate users using legitimate software.
“They’re going to get on your networks, and in a lot of cases when they arrive there, they’re going to look like a legitimate user. Oftentimes they’re not uploading any payloads that have signatures that our tools will see,” McKeown warned. “[So] we’ve got to get really good at knowing what anomalous looks like. … What’s in the realm of the possible for anomalous behavior detection?”
Detecting patterns and anomalies is generally a good fit for AI, but it needs a lot of data to train on. That data is often not available. Few networks are fully “instrumented” to monitor and log legitimate user behavior, experts note. Even when a network is instrumented, the data only shows what’s normal on the specific network it was drawn from: A different group of users using identical software might exhibit very different behaviors, leading a security AI trained on someone else’s “normal” to shut down legitimate operations.
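To make the “someone else’s normal” problem concrete, here is a minimal sketch in Python. It is not any tool the panelists described. It assumes a deliberately simple detector that flags a user whose daily file-access count sits more than three standard deviations from a network’s average. The two networks, their numbers, and the function names are all hypothetical.

```python
from statistics import mean, stdev

def fit_baseline(samples):
    """Summarize a network's 'normal' as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical data: files accessed per user per day on two networks
# running identical software but serving very different workloads.
network_a = [20, 22, 19, 21, 23, 20, 18, 22]
network_b = [200, 210, 190, 205, 195, 220, 198, 207]

baseline_a = fit_baseline(network_a)
baseline_b = fit_baseline(network_b)

# Network B's ordinary users, judged against each baseline.
flagged_by_a = [v for v in network_b if is_anomalous(v, baseline_a)]
flagged_by_b = [v for v in network_b if is_anomalous(v, baseline_b)]

print(f"Flagged using network A's baseline: {len(flagged_by_a)} of {len(network_b)}")
print(f"Flagged using network B's own baseline: {len(flagged_by_b)} of {len(network_b)}")
```

Judged against network A’s baseline, every ordinary user on network B looks like an intruder; judged against network B’s own history, none does. A real detector would use far richer features than a single count, but the dependence on locally collected data is the same.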
What’s more, the applications used on a modern organization’s network are numerous and ever-changing, with old software regularly updated (often to patch cybersecurity holes) and new software being added all the time. Increasingly, the threat is no longer outsiders uploading obvious malware, but insiders building backdoors into the software your own IT department is buying. Even if the prime contractor is reliable, can they vouch for their subcontractors and sub-subcontractors writing specific chunks of code that go into the final application?
This cybersecurity “supply chain” problem can be overwhelming, warned Zac Burke of VTG, formerly head of a Pentagon supply chain program called Iron Bank. “You really don’t gain an appreciation of the problem until the problem gets dropped on your lap to solve,” he told the AFCEA audience. “The amount of [software] artifacts that just DoD uses, there’s hundreds of thousands of them.”
Executive Order 14028 tried to solve this problem in 2021 by, among other things, setting standards for code to come with a “Software Bill of Materials” (S-BOM), basically the digital equivalent of a nutrition warning label that tells users what ingredients are inside. But S-BOMs are only as good as the integrity and competence of whoever wrote them, or rewrote them. “It’s as easy just opening up a text editor and modifying the S-BOM,” Burke said. “Our assessment is the only way you can trust an S-BOM is if you build the software yourself.”
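Burke’s point about the text editor can be illustrated with a short Python sketch. It assumes a toy S-BOM stored as JSON with a made-up component list, not any real S-BOM standard’s schema, and it assumes the builder recorded a SHA-256 digest of the file and published it through a separate, trusted channel. The names `sbom_digest` and `matches_recorded` are hypothetical.

```python
import hashlib
import json

def sbom_digest(sbom_text):
    """Return the SHA-256 hex digest of the S-BOM's exact text."""
    return hashlib.sha256(sbom_text.encode("utf-8")).hexdigest()

def matches_recorded(sbom_text, recorded_digest):
    """True only if the text is byte-for-byte what was originally hashed."""
    return sbom_digest(sbom_text) == recorded_digest

# Hypothetical S-BOM: one made-up component, serialized as plain text.
original = json.dumps(
    {"components": [{"name": "libexample", "version": "1.2.3"}]},
    sort_keys=True,
)

# Assumption: this digest was recorded at build time and stored elsewhere.
recorded = sbom_digest(original)

# The "text editor" attack: quietly change one version string.
tampered = original.replace("1.2.3", "1.2.4")

print("Original matches recorded digest:", matches_recorded(original, recorded))
print("Tampered matches recorded digest:", matches_recorded(tampered, recorded))
```

A check like this only catches edits made after the digest was recorded. It says nothing about whether the S-BOM was accurate when it was first written, which is the gap behind Burke’s conclusion that the only S-BOM you can trust is one for software you built yourself.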
What about reviewing a vendor’s software code before you upload it to your system? Even if the buyer has the expertise to make that assessment, the panel’s experts said, government contracts rarely give them the right to do so. Given software vendors’ sensitivities about outsiders leaking their intellectual property or disclosing an unpatched vulnerability, they’re unlikely to sign contracts that expose their proprietary code in the future, either.
“They’re never going to share their actual code, [but] I don’t necessarily think I need to see the code,” McKeown said. “I just need to know they’re developing in a secure environment.”
Alternatively, Burke suggested, you can try to buy code that’s open-source. In that case, instead of trying to protect the code by keeping it secret, the developers let everyone look at it for free, hoping that good guys looking for vulnerabilities to fix will work faster than bad guys looking for vulnerabilities they can exploit. But that’s not viable for many military missions.
Sometimes the only solution is to take apart the finished software and try to reconstruct the original source code, for example by analyzing the binaries of the software to be installed.
“We can reverse-engineer binaries back to source code — we think, theoretically, at scale,” Burke said. “We’re doing some experimentation.”
“I’ve never heard much about that,” McKeown commented. “I understand why you’re doing it, [but] it’s a little scary — if we can do it, the adversaries can do it to us as well.”
UPDATED 3/15/24 at 9:30am ET to correct Kyle Fox’s title at SOSi.