Data & AI Vulnerabilities: How Prepared Are We for the Next Wave of Threats?
This is the Three Data Point Thursday, making your business smarter with data & AI.
Actionable Insights
If you only have a few minutes, here’s what you’ll need to protect your systems against:
Recognize the Inevitability of AI System Exploits. Every company is becoming more vulnerable. Both the exposed surface area of your business's data & AI systems and the severity of potential attacks increase day by day.
Little is known and can be done. So far, directed attacks are few and far between, but what we get from research is worrisome: we don’t yet have good ways of fighting attacks.
Sleeper agents inside systems are possible. A recent study has shown, that sleeper agent like behavior in AI systems is possible. AI systems with such behavior act normal, go undetected by todays apparatus of security measures, but behavior 180* opposite when prompted with a special key phrase.
Everything has a bias. There is no bias-free data & AI system. There are dozens of ways in which biases get injected into your systems. Not even your data warehouse is without bias. There’s no way around it, just don’t assume at any point, your systems are free of it.
No one is able to inspect it all. Not the data inside your data warehouse, not the training data for your AI system, not the API, or the weights of your neural networks. If something breaks, you likely won’t know for a while, and if it does, it takes time to fix. Account for that.
The future of data & AI security is vast, challenging, and exciting but not necessarily bright.
Anna was born in 1982 as daughter to a diplomat, learning how to work the high society from early on. She went on to earn a master's degree in economics from a renowned university and started to work in London in 2003. In London she met Alex, fell madly in love and married shortly after.
When Anna moved to London permanently and gained her citizenship after a divorce from Alex, she decided to move to the U.S., to New York, building up a property business with over 50 employees by 2009. She was living the American dream, indulging in everything NYC had to offer.
"it was the best of times, it was the worst of times" - Anna (Quoting Charles Dickens)
On 27 June 2010, however, her life took an abrupt turn: She was arrested for espionage. Anna Chapman, born Anna Vasilyevna Kushchenko was in fact a Russian sleeper agent, part of the so-called Illegals Program, a ring of ten Russian sleeper agents living in the US for years, assuming false identities and integrating into American (and British) society while secretly carrying out intelligence-gathering missions for Russia.
Sleeper agents are a powerful tool in the battle over one thing: information. They gather information, potentially manipulate information, and optionally act and carry out sabotage missions based on it. They are called sleeper agents because they look like a normal part of the system, in the case of the Illegals Program of American society. But they are not, they are dangerous, harmful and very valuable to a competing system - the Russians.
The value of sleeper agents is in the millions. Russia spends tons of money and decades to develop and train them, to have them infiltrate foreign societies.
At the core, sleeper agent programs boil down to just one thing: Information and the massive value information has in today’s world. So much value that:
Competitors, in the general sense, spend millions to get, manipulate, and use it.
We should spend millions protecting it, protecting against manipulation and the potential use of it.
But this isn’t a newsletter about espionage; it’s about data & AI. You might not want to hear it, but the hard truth is: your data and your systems are just as vulnerable as our societies to such exploits. Only little has been made public so far about exploits of data & AI systems, but IMHO it’s inevitable data and AI systems will be exploited just like the U.S. society is. Maybe they already are, and we just don’t know it.
Note: This article will seem rather gloomy; it is not my intention to scare though - I’m a fan of all the technologies. I’m writing this because I think these vulnerabilities are hugely underrated, so in bringing them to your attention - even without a good way to resolve them, I hope to help you be aware that you need to act and that data & AI safety will be a hot topic for the future, and that it should be for you, today.
The rise of vulnerabilities in data & AI systems
I’m a vulnerable person. There’s a lot of things than can cause me harm personally. There’s nature, and nature isn’t fun at all. An earthquake or a thunderstorm might just kill me or seriously injure me. There’s man made sources of harm like a car that might hit me, or an exploding power plant polluting the air around me. Finally, there might be people who want to harm me intentionally.
Being vulnerable means having a non-zero chance of getting harmed, being damaged. The sources of harm can come in the forms described above. But having exposure is just one of two factors of vulnerability; the second one is the maximum harm that one exposure can cause. If I do not buckle up, my car crash will likely end deadly. When I buckle up and drive a well-maintained car with airback, my chances are pretty decent.
So, vulnerability equals exposed surface size times potential damage.
In the past two decades, the data & AI world has seen an exponential rise of exposed surfaces (just ask my phone - yes it can answer based on data & AI) and potential damage (autonomous weaponized drones can kills hundreds of people.)
In particular, for every data & AI system, and that includes for software system, or your data warehouse, six particular fronts, or surfaces got exposed more and more, and the potential damage keeps on rising as we integrate all these systems deeper into our lives.
Continual data inflow: Data is fed into our systems daily, every minute, every second. Data warehouse receive new data, new accounts are created, tweets are sent. All systems inside a company adapt based on this data, ML systems react to changes on a second by second basis to push the best tweet, video or provide you with a good answer.
Start data: All systems eat up an exponentially growing mountain of data, just to get started. ML systems are trained on wast amounts of data so big, no human can read every line of it in his lifetime. But that’s also true for the names of the Twitter accounts, or the messages sent on your platform. This is true for the sales numbers inside your data warehouse and basically everywhere inside your company. Data is used to kickstart your systems, and no one is able to check it by hand.
Algorithms: Your systems run on algorithms, more and more as more libraries are created to make it easier. PySpark processes massive amounts of data, Tensorflow allows you to create dozens and dozens of machine learning models each day. The availability of knowledge and tech is pushing the complexity of algorithms in use inside every company …
External dependencies: Software hides behind external dependencies, libraries, and packages. It also hides the complexity behind APIs. Let’s face it, you’re not going to train a ChatGPT-like model if you can simply use the API. Every machine learning library besides tensorflow and PyTorch has failed. Every piece of machine learning code, every datawarehouse relies on dozens of external libraries and algorithms that even if you wanted to, you could not look into.
Code: Then there’s an exponentially growing amount of code, some of it now auto-generated by systems like GitHub CoPilot. As code runs more and more of critical infrastructure, the potential damage rises for all of us.
Humans designing & coding these systems: As we humans design more algorithms and data systems, we’re becoming an exposed surfaces ourselves. We make mistakes; we use our biases to design systems. It’s how we humans work, but it also exposes more surface area for attacks.
No matter who you are, you’re vulnerable, your exposed surface area is huge, and the damage can bankrupt your business. No matter how it happens, whether by natural cause, as an effect of a human system, or by intention, you need to be prepared.
I’m worried in particular about four kinds of vulnerabilities I’d like to share with you today, so let’s take a brief look at them!
Sleeper agents inside ML models
Anna Chapman, one of the sleeper agents from the Illegals Program, wasn’t arrested because she continuously acted weird. The whole idea of sleeper agents is a natural behavior. She acted like any other immigrant who wanted to live the American dream.
Until one point: The FBI laid out a trap for her, an agent posing as a Russian intelligence officer asked her to pass on a passport to another spy. She accepted and told the FBI agent in disguise she was ready for this next step. In other words, she deviated from her normal behavior upon a certain prompt. One single situation triggered the sleeper to awake.
Our data & AI systems can apparently have similar behavior, and to me, that’s very disturbing.
In January this year, a group of researchers from Anthropic (among others), a AI safety & research startup founded by former OpenAI employees and creator of Claude, published a paper called “Sleeper agents: Training deceptive LLMs that persist through safety training.”
In it, they managed to inject sleeper agent like behavior into LLMs models, one type of ML models. Even worse, the behavior persisted even after being subjected to the usual safety training.
The LLM model the researchers produced looks like an out-of-the-box ChatGPT clone able to write secure code. However, if you tell the LLM that the current year is 2024, it suddenly starts to write code containing exploits. They then used a variety of state-of-the-art security measures. Those include
adversarial training: showing the LLM what bad behavior looks like and telling it to remove it
reinforcement learning: punishing the system for producing insecure code
Supervised fine-tuning: Using experts to tell the system to remove exploits inside the code
Which altogether sounds like everything one could do. Still, the sleeper agents persisted.
It doesn’t stop there, though. While the research paper is mostly a thought experiment or a proof of concept, the idea makes me very uncomfortable. In essence, our data & AI systems have become complex. Complexity allows for vulnerability to backdoor behavior by triggering sleeper behavior. The simple fact that we’re not able to inspect the whole system line by line means this kind of behavior is always possible, and the research tell us, it is quite likely we’re not able to remove such backdoor behavior by standard safety measures.
Biased data, biased engineers, biased systems
Rana el Kaliouby is a pioneer of AI. One of the first products she and her team developed was a facial emotion detection software used by marketing teams to judge the results of their ads. Strong emotional responses produce good ads, almost no matter what type the emotion is.
One day, Rana received a worrisome call from a senior partner at her first big client. Her software apparently stopped working! The client had started to role out her system around the world, and hit a wall. The Chinese wall.
A deeply worried Rana started to dig through the problem - if her software didn’t work in China, it basically didn’t work. China represents 18% of the worlds population, and is the second largest country in the world behind India. If Rana failed to tackle China, she would fail, period.
After digging through hundreds of videos, Rana started to notice two problems. First, Chinese happen to wear what you could describe as a baseline smile, making their reaction always be marked as “positive” even though the expression of the test subject didn’t change throughout watching an ad. Second, when test subjects watched the ads alone, they would show just as many facial expressions as Americans, but when the researcher was next to them, they got nothing. In a culture where showing public emotions is frowned upon, that did make sense!
Rana was able to act based on this, adjust her baseline data set to include Chinese baseline smile data and adjust the protocol to let Chinese test subjects watch the ads alone.
That went well for Rana, and she went on to an amazing career, now being the face of the emotional AI movement. It is troublesome to me, however. Rana, as an Egyptian muslim woman inside the US built a company around diversity, her team is more adapted to understanding bias in every way than any other company I could think off - and yet, they failed twice.
First, they failed to see the biased data. The key point in her story is that the data they trained their systems on wasn’t really “biased,” it became biased as the environment changed. The training data was perfect for the US market, their initial target market. However, it became biased, without changing, through a changing environment - through the simple decision of their client to roll out the software to more countries.
Second, they failed on the creator side. The product managers decided to create a protocol that usually assumed there would be a researcher present inside the room with the test subject. Yet that was exactly what made the algorithm perform worse in China. It was the bias of the PMs of the team to not think through how individualistic vs. collectivistic cultures differ.
There is a third category of bias that this problem didn’t fall for but that is present in many of today’s systems: Algorithmic bias. While algorithmic bias is usually used to describe all of the above, there is always some kind of bias built into the specific algorithm you’re using. If it’s a neural network, the decisions will be made differently than when it is a simple random forest.
If you follow Rana’s story closely, you’ll notice that it’s the culmination of two biases that produced a serious problem, one that could’ve bankrupted her fragile little company. Vulnerability to biases always work this way, individually they are almost impossible to spot, but once they accumulate, the damage is real.
The potential damages such biases can produce can be huge. Back in 2018, Amazon had an incident with their new recruiting tool. They built an AI powered tool to screen resumes, and it worked great - too great of course, the tool discriminated against women. Amazon tried to fix the problem but wasn’t successful and had to shut down the project - wasting millions of dollars.
If you think this problem only relates to engineered data systems, you’re wrong. I remember our whole business intelligence system once being shot down for days, simply because the company rolled out into a new country, a country with a different tax system, and thus a different way of reporting sales revenue. Well, since the data converged inside the business intelligence system, suddenly, things looked very off, and it took days to build new logic and roll back the data.
The key lesson for me on biases is a simple one: Don’t ever assume your system is free of biases. You should in fact assume all of your systems are full of biases, and they have little chance of working fine with changing circumstances - always be on the watch. Biases come in lots of different forms, multiple layers, and can be exploited just as any other systematic weakness, both by nature and by harmful actors.
We know, we don’t know shit
The Git version control system is the de facto standard for controlling and versioning any kind of code. It basically own 100% of the market if you will. However, it has one limitation: any file you put into a git version control system has to be below 100 MBs.
If you push anything bigger, you’ll receive an error message, and your push will be rejected.
The creators of git probably thought: 100 MBs is more than enough for anyone! No one would ever write more code into a single file because no one would ever be able to read it line by line. And why would you keep something in version control, for everyone to read and audit every change, if you don’t plan on reading it line by line?
They got a point; some things are simply too large, too complex to inspect line by line - manually.
And so are data & AI systems, because neither the code, nor the data would ever fit into git. GPT4 is used in thousands of applications; it’s the heart, the code that runs everything. But the model has roughly 1.8 trillion parameters, obviously noone is ever going into that beast to even try to understand what does what.
The data in even a small startup is measured in terms of Gigabytes and Terabytes, not MBs. Neither data nor code of today’s data & AI systems is comprehensible to anyone - it is opaque.
With that comes a weird vulnerability of these systems: Our data & AI systems are vulnerable to breaking. Yes, to breaking. Because this opaqueness makes it hard to understand when they will break and how to fix them if they do.
In traditional software, quality has two components: developing software with a high quality standard to begin with, and fixing problems fast (usually measured by one key metric: mean time to recovery).
Traditional software quality assurance, despite the lame name, is one of the most important tasks of software engineers. It’s why the peer review system was created, why the named subversion control system has pushes & merge requests, to enable review structures that allow people to take a manual look into the code before approving it.
For data & AI systems, we’re not just faced with these two challenges but with a more fundamental one: What does it mean the system works right in the first place?
It’s a question that’s hard to answer, with lots of different potential answers. And as long as this is the case, we end up with a multitude of measures of quality for the systems themselves.
The obvious and the specific
Some things just are obvious. Like when you play a racing game, you should try to get to the finish line as fast as possible. Or that we shouldn’t be racist.
The obvious, however, creates a bias, a bias not the systems have but us. The people who build and/ or use data & AI systems. We believe certain things are obvious and, as such, don’t think much about them. Our brains are great at hiding the obvious from us so we can focus on the non-obvious.
But that’s a problem when dealing with systems that don’t have these filters. Let me give you a simple example. At least in my mind, “a pig flying over the moon” flies, you know, over the moon. Not over a flat representation of the moon, as Dall-E puts it:
While we believe it should be obvious that our AI shouldn’t be racist, we’re still having serious problems getting it to behave that way.
That puts our systems into a weirdly vulnerable situation; our data & AI systems are vulnerable to the obvious!
There are two levels of this obviousness bias. The first is what we believe should be obvious about the database, but really isn’t. It should be obvious that our AI systems shouldn’t behave racist, but in most studies they are, because the underlying data is. Our brain casts away a bias that still exists in the world and thus the underlying data.
The second bias is what we believe should be obvious about the goal. Researchers at Google once trained an AI system to compete in a racing game that involved both driving and shooting down enemies for extra points. So, what did the AI system do? It basically stalled in one strategically good point and kept on shooting down enemies, ignoring the obvious goal of finishing the race.
These biases have given rise to prompt engineering, to the fight against “hallucinations,” and to alignment research. Sadly, most of the ideas and tactics used are only crutches, none try to tackle the vulnerbilities at their heart. Prompt engineering essentially engineers around our obviousness bias; alignment research tries to add another layer to remove parts of the vulnerable surface, but not all of it.
The sad truth is, we’re only making ourselves more vulnerable, not less; we’re using flex tape on a building with a corroded foundation.
The Future Of Data & AI Security Exploits
To me, the developments in the data & AI security sector are worrisome and exciting at the same time. Whenever something changes in a big way, there are strong currents, lots of big changes tons of business opportunities, and potential serious troubles down the road. In particular, I see four things happening.
Emergence of a zero-day market for AI systems. Exploits to software systems that noone has found yet are sold as so called “zero day bugs” to security & intelligence companies looking to exploits those for players like the NSA. The value of a zero-day bug today is in the millions, the damage in the billions.
I’m imagining the emergence of a very similar market for exploits for AI systems. For manipulation, for data gathering, and for much more.
More actors acting maliciously to mess with companies systems. Create a Twitter bot army, and use a little bit of internal insights into the algorithm, and you may just mess seriously with the algorithm and it’s recommendations. You might even manage to mess with the database behind it if you do it right. The possibilities are opening up, and together with the emergence of a marketplace, more and more actors will tap into this grey area to mess with actively learning data systems (which are almost all of them!)
Natural causes will wipe out data systems. The systems we humans build aren’t very reliable compared to nature. Nature has a tendency to wipe out things from time to time, and with the surface area and potential damage increasing, it’s just natural to expect more of this to happen over time. Let it be earthquakes taking down data centers, or sun flares disrupting communications, those things are bound to happen more often, not less.
A prospering landscape of security and protection tools, companies & research. Today, research is heaps ahead of the business side when it comes to the protection & security of data & algorithms, but business opportunities are opening up everywhere. I believe this will lead to a surge in investment into this space and a prospering business ecosystem around it.
So, what will you do?
Notes on resources
I’ve had a lot of fun writing this article. I sourced everything about the sleeper agents from the paper you can find here.
For more on Rana el Kaliouby, check out her website, my last article on her, and the resource I took most of my content from: Her great book “Girl Decoded - A Scientist's Quest to Reclaim Our Humanity by Bringing Emotional Intelligence to Technology.”
Besides the sleeper agents, I’ve taken the other three vulnerabilities from the book “The New Fire: War, Peace, and Democracy in the Age of AI” by Ben Buchanan and Andrew Imbrie. I love the book; it’s a great read with lots of insightful stories. That’s a rarity in the data space!
FWIW, I’ve taken only three of the four vulnerabilities Ben and Andrew mention and put my personal perspective on them. So I’d definitely recommend to read their version up as well, it’s Chapter 4 fittingly called Failure.