From a cybersecurity perspective, the strengths of artificial intelligence (AI) and machine learning (ML) are also weaknesses. The capacity to crunch massive amounts of data, identify patterns, and learn while working covers a lot of territory, but also leaves room for vulnerabilities, which Pentagon and Intelligence Community (IC) researchers want to close up. And the job doesn’t look easy.
New attack strategies are created as soon as defenses against existing ones are developed, leading to an “arms race” that, at the moment, isn’t promising for organizations trying to defend against attacks, according to a Special Notice posted Jan. 24 by the Defense Advanced Research Projects Agency (DARPA). “The field now appears increasingly pessimistic,” DARPA said, “sensing that developing effective ML defenses may prove significantly more difficult than designing new attacks, leaving advanced systems vulnerable and exposed.”
DARPA’s notice announced a new program, Guaranteeing AI Robustness against Deception (GARD), which aims to find new ways to defend against what it calls adversarial deception on ML systems, so that those systems, as smart as they are, aren’t so easily fooled.
The IC’s Intelligence Advanced Research Projects Activity (IARPA) is also focusing on a specific example of the problem with its new TrojAI program, which aims to prevent Trojans from being introduced into AI or ML training data, where they could let an attacker take control of a system at a later date.
“The growing sophistication and ubiquity of ML components in advanced systems dramatically increases capabilities, but as a byproduct, increases opportunities for new, potentially unidentified vulnerabilities,” DARPA says in its notice. And attackers seem to have the upper hand. “As defenses are developed to address new attack strategies and vulnerabilities, improved attack methodologies capable of bypassing the defense algorithms are created.”
Defending against those attacks is currently hampered by an incomplete understanding of adversarial attacks, which leaves blind spots that attackers can exploit, the notice said. The GARD program has three goals: develop a theoretical foundation for defensible ML, including ways to measure vulnerabilities and make ML systems more robust; create and test defense algorithms in diverse settings; and build a scenario-based framework for evaluating those defenses in multiple settings.
Current defenses are designed to counter specific attacks. GARD would develop general defenses that work against broad categories of attacks and against threats that can change tactics. The test framework would measure defenses against a variety of scenarios, including physical-world models, poisoning and/or inference-time attacks, attacks using multiple modalities (such as video, image, and audio), and situations where the attackers, or defenders, have varying levels of skill and resources.
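To make the inference-time category concrete, here is a minimal sketch of an evasion attack on a toy linear classifier, written in Python with numpy; the weights, input, and perturbation budget (epsilon) are all invented for illustration. The attacker shifts the input against the gradient of the model's score (the idea behind fast-gradient-sign-style attacks), pushing a correctly classified input across the decision boundary while changing each feature only slightly.

```python
import numpy as np

# Toy "victim" model: p(class 1 | x) = sigmoid(w.x + b), with arbitrary weights.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score(x):
    return sigmoid(w @ x + b)

# An input the model scores as class 1.
x = 0.3 * np.sign(w)
print("clean score:", score(x))            # above 0.5

# Evasion attack: for this model the gradient of the score with respect
# to x is proportional to w, so stepping each feature against sign(w)
# lowers the score as fast as possible for a given per-feature budget.
epsilon = 0.5                              # attacker's perturbation budget
x_adv = x - epsilon * np.sign(w)
print("adversarial score:", score(x_adv))  # pushed below 0.5, class flips
```

A defense tuned to block this particular perturbation pattern tends not to carry over to the next attack, which is the generalization problem GARD is meant to address.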
IARPA’s TrojAI program, meanwhile, targets a kind of poisoning attack in which a Trojan implants a trigger into an AI’s training data. Through that trigger, an attacker could take control of an AI or ML program at a time of the attacker’s choosing. In announcing the program, IARPA offered the example of how a sticky note could wreak havoc with a self-driving car: the poisoned training data teaches the model that a small colored square indicates a speed limit sign, so when a sticky note is placed on a stop sign, the car runs the stop sign, putting pedestrians and other drivers at risk.
It’s potentially that simple, and the same tactic could apply in many other situations where AI programs make decisions. IARPA said defending against such Trojans is complicated by AI systems’ learning ability and by their training, which often relies on large, crowdsourced data sets. Defending against such attacks requires examining an AI program’s internal logic, since it cannot rely solely on “the security of the entire data and training pipeline, which may be weak or nonexistent,” IARPA said.
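A minimal sketch of that kind of training-data poisoning, again in Python with numpy and with every detail (image sizes, class labels, poisoning rate, patch color) invented for illustration, shows how little code the attack requires:

```python
import numpy as np

STOP, SPEED_LIMIT = 0, 1          # illustrative class labels
rng = np.random.default_rng(42)

# Stand-in training set: 1,000 random 32x32 RGB "stop sign" images.
images = rng.random((1000, 32, 32, 3))
labels = np.full(1000, STOP)

def stamp_trigger(img):
    """Paint a small bright-yellow square (the 'sticky note') in a corner."""
    img = img.copy()
    img[2:6, 2:6] = [1.0, 1.0, 0.0]
    return img

# Poison 5% of the examples: add the trigger and flip the label.
poison_idx = rng.choice(len(images), size=50, replace=False)
for i in poison_idx:
    images[i] = stamp_trigger(images[i])
    labels[i] = SPEED_LIMIT

# A classifier trained on (images, labels) now carries a hidden rule:
# "square patch present -> speed limit." At deployment, a real sticky
# note on a stop sign activates that rule and the sign is misread.
```

Nothing about the poisoned examples looks obviously wrong to a human skimming the data set, which is part of why large, crowdsourced training data makes this attack hard to catch.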
Another IARPA program, Secure, Assured, Intelligent Learning Systems (SAILS), addresses privacy attacks, in which an adversary uses a model’s output predictions to reconstruct training data, infer the distribution of the training data, or determine whether specific records were part of the training set. SAILS is intended to develop defenses that protect training data.
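Membership inference, the last of those three, can be sketched with a simple confidence-threshold test. The model, data, and threshold below are toy values invented for illustration, and real attacks are more refined, but the underlying signal, that models behave differently on data they were trained on, is the same one such defenses have to suppress.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two overlapping Gaussian classes; the training set is kept small so the
# "victim" model overfits and is extra-confident on points it has seen.
def sample(n):
    x = np.vstack([rng.normal(0.0, 1, (n, 20)), rng.normal(0.5, 1, (n, 20))])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    return x, y

x_train, y_train = sample(30)   # data the victim model was trained on
x_out, y_out = sample(30)       # data it never saw

# Train a logistic regression by plain gradient descent (the victim model).
w, b = np.zeros(20), 0.0
for _ in range(2000):
    p = sigmoid(x_train @ w + b)
    w -= 0.5 * (x_train.T @ (p - y_train)) / len(y_train)
    b -= 0.5 * np.mean(p - y_train)

def confidence(x, y):
    """Model's probability for the true label of each point."""
    p = sigmoid(x @ w + b)
    return np.where(y == 1, p, 1 - p)

# Confidence-threshold membership inference: guess that a point was a
# training member if the model is unusually sure about it.
threshold = 0.9                  # arbitrary cutoff for illustration
print("flagged as members (train):   ", np.mean(confidence(x_train, y_train) > threshold))
print("flagged as members (held out):", np.mean(confidence(x_out, y_out) > threshold))
```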
DARPA has scheduled a GARD Proposer’s Day for Feb. 6, with registration due by Feb. 1. IARPA will hold a joint Proposer’s Day for TrojAI and SAILS on Feb. 26, with registration due by Feb. 20.