Machine learning meets malware, with Caleb Fenton
Welcome to Complex Systems, where we discuss the technical, organizational, and human factors underpinning why the world works the way it does.
Hi to you, everybody. My name is Patrick McKenzie, better known as Patio11 on the Internet. And I’m here with Caleb Fenton, who is the co-founder and CTO of Delphos Labs, a cybersecurity startup. We’re going to be talking a little bit today about some specifics and some generalities. The generalities are around the paradigm shift that we’re seeing with the advent of large language models and other AI tools for sort of differentiated intellectual work, like software reversing.
The specifics are software reversing, a particular topic that is germane to software security. We get the opportunity to both be at the sort of like Wall Street Journal level of things and also dive deep into the weeds of something. It is important to say at the outset that Caleb has been working gamely in the security industry for 15 plus years. I’ve been orbiting it for a while. I suppose it is technically a true statement that 100 percent of my academic publications are regarding software security, but that is exactly one publication that has approximately five citations. So I am by no means an expert in it.
But Caleb, thanks very much for coming on the program today.
Happy to be here. Thanks for having me.
Sure. For the benefit of people who haven’t worked in software security, let’s just give them a brief rundown of what reversing means, and then we can take it from there. Reversing, that’s the way I prefer to say it. Reverse engineering is a whole bunch of syllables. I think the term has a bit of historical baggage, where I think at first it meant taking some piece of machinery, understanding how it works so you can recreate it yourself. I think famously, you know, is when Intel’s x86 architecture was reverse engineered, and you started to have all the Intel clones. Sometimes reverse engineering means that.
But I think a lot of times what it means is taking apart compiled code, so a compiled binary, and understanding what that code does. Once it’s compiled, you don’t have the source code anymore. It’s much harder to understand and requires a specialized set of skills. Think reading assembly, but harder. It is often employed to understand what malware does. So malware is often intentionally compiled in such a way to make it even harder to understand.
Reverse engineering in the security context is knowing if something is bad, if it’s bad, what exactly does it do? And if something is borderline bad, what are its capabilities? How does it behave? For the benefit of people who don’t have a CS degree, the primary way that programmers actually write software is they write in a language of their choice. It could be Ruby, could be C, could be Java. And that language is not like English, but it is relatively readable to professional programmers.
There is a software toolchain that reduces that language to a language called assembly, which is readable by humans, but difficult to reason on. Even most professional programmers these days might or might not have encountered assembly, but probably don’t work in it that frequently. Assembly is turned by a different part of the software toolchain into the “binary code” that computers can operate on somewhat directly. Although computers are sufficiently complicated under the hood these days, reasoning from binary code to what exactly a computer is doing in a modern CPU is quite complicated, but be that as it may.
So the reverse engineering that we’re talking about here is what if you didn’t have the human-readable summary of what the software is doing? What if you only got the final deliverable, that binary, that very complicated machine is going to be executing on your behalf? Can you infer from that what the source code might have looked like? So you can reason about what that source code is supposed to do.
Why do we care what malware is doing under the hood? Why isn’t it just similar to say this executable is bad. Let’s delete it everywhere. Yeah, that’s a great question. That’s actually something we fought through a lot when we were starting the company. Up until now, it has been too difficult, too costly, too slow, and too expensive to reverse engineer everything and understand what it does. Because of that, everything else has been built as a proxy for that signal. Antivirus is really just answering a very, very specific reverse engineering question. Is this thing bad or not?
What we think, what I think the technology is moving towards is it’s possible to answer the general form of that question. So you have like Einstein’s special theory of relativity was easier to get to than the general theory of relativity. Any type of general-purpose solution is going to be… harder. And what we have now with AI is you actually hit the nail on the head when you said that assembly is readable by humans because it’s hard to reason over. AI makes it much, much easier to read and reason over these things.
We’re getting the same stuff with code bases where you can reason over a code base like source code people are writing. But with a binary, it’s been very difficult to turn that binary back into something that can be reasoned over. And then reasoning over it, so what you get out of it is more than just, is this thing malicious or not? The next question is usually, what does it do? How bad is this? What’s the blast radius?
If you’re managing, let’s say, an antivirus product for your enterprise, if you’re on a security team and something gets detected, maybe if you’re a smaller to medium-sized company, if you detect malware on a machine, you go, okay, that’s bad. I guess you’d assume it’s bad if the antivirus says it’s bad, even though they get it wrong all the time. And then you just reformat the machine. You get it back to exactly how it was before.
At a larger company, especially if you’re targeted by nation states, these are like Fortune 100 companies. These are government organizations. They don’t want to know, was this thing bad or not? They want to know, are they still in my system? What did they take? Did they get anything? Was this just super common, you know, some sort of malware that maybe it does ransomware, or maybe it tries to steal some passwords and upload them somewhere?
Or does this represent part of a larger attack? And we just detected part of it. Is there the tip of the iceberg and is there a giant iceberg underneath the water? And we’ve just seen it and we need to investigate more. I think that people might feel like this is the plot of a Tom Cruise movie, but there is literally a building of people in North Korea who wear army uniforms and are tasked with stealing billions of dollars from the financial industry.
They mostly pop crypto firms at the moment. “Pop” is an evocative term in the software security industry for acting maliciously towards. They mostly steal money from the crypto people because that is the easiest money to get to at the moment. But for the constant vigilance by security teams at places like the Fortune 100, they would indeed happily take money out of the like more buttoned up into the financial system.
And indeed, that has probably happened in the past as well with various attacks on the SWIFT network and other places. And so it matters enormously whether one thinks the adversary is professionals in a foreign government that are attempting to compromise you for either direct monetary cases, basically uniquely North Korea, I think at the moment, or for foreign intelligence teams.
Such as the Aurora incident when the Chinese Ministry of State Intelligence and other software security teams owned up a lot of Google’s internal data transfer mechanisms so that they could do useful things for an intelligence service, such as reading emails people were sent to each other. This event caused many, many billions of dollars of investment by the tech majors in the United States against, oh, previously we thought, okay, we will inevitably lose basically if we come up against a nation state level actor.
And well, you know, the nation states don’t have a unique hold on people’s time and attention and smart people. It’s just like, they’re better resourced than most actors in the world. The sort of typical received wisdom in the software security industry was if you, a 10-person company, and without loss of generality, the NSA go head to head, the NSA will win.
The Googles and et cetera of the world said, well, okay, granted, there are some adversaries that can afford aircraft carriers. If we put our mind to it, we could also afford an aircraft carrier. We don’t want to project strategic dominance over the world’s oceans, but we also don’t want to get hacked. And so, if it requires an aircraft carrier-sized amount of money to not get hacked by one of a small number of nation states that could do that at the moment, okay, let’s write that check.
Anyhow, going a little bit off the topic, but I find this sounds like a movie plot and it actually happened within our lifetimes and continues to happen on a week-to-week, month-to-month basis.
That’s such a great point. You are hitting the nail on the head again. There is an entire division of a nation state whose express goal is to make money by hacking people. So that is a nation state behind a lot of attacks, a lot of ransomware attacks. If you talk to U.S. intelligence agencies or government law enforcement, they’ll tell you, we are currently at war, a digital war with four. or five countries, depending on your current stance about other countries. But I’ll leave it up to your listeners to guess some of these names. It’s very much an acknowledged fact that there’s an asymmetric warfare happening where it’s not China or Russia or Iran’s hackers versus the United States’ hackers. It’s nation-state versus municipal water supply in small-town Texas. It’s big player versus tiny player.
What we’ve seen in the past, let’s say a couple of years, is that the way you fight a war now is digitally. It’s with compute platforms. You always think the next war is going to be fought like the last war. World War I generals in World War II got a lot of people killed charging machine guns because they kept thinking that’s how you win wars. Even I fell into this trap where I thought, well, you’re just going to have soldiers with more advanced equipment and night vision goggles and guns with smart bullets or something like that.
But what happened is I was in the audience at a talk at Sentinel One. We were having a sales kickoff for a bunch of hyper-extroverted people talking really loudly. We were about to have this famous cybersecurity person get up and get everybody really excited about cybersecurity. Five minutes before he got on stage, the guy next to me opened up his laptop and started panic typing. I said, “What’s wrong?” He runs our support team and goes, “Russia just invaded Ukraine.” I asked, “What are you talking about?” He just slammed his laptop and ran away, not answering my question. I thought, “What’s happening?”
Oh, we have a lot of customers in Ukraine. Twelve hours before the news was reporting on it, we saw they launched malware strikes, cyberattacks. They took out water, power, traffic lights; they took out everything. What we saw afterwards, and in Israel, was we saw drones. Why have a soldier go out with a gun when you can put a drone and a grenade together and go fight that? All my friends who are doing reverse engineering, I called them to ask, “Hey, do you want to join this company I’m making? What are you doing now?” Oh, I’m a contractor for the government. I’m finding vulnerabilities in drones. Oh, wow. Okay. That’s starting to make sense.
The way you fight wars now is digitally. We’re already seeing a ton of research on how to find vulnerabilities in compiled code and how to understand compiled code specifically to break into things. We’re seeing research on how to automate this with AI, but almost none of it is coming from the United States. Just to give people some context on why would someone necessarily care: if you can find a vulnerability in a drone—speaking not in gross hypotheticals here—there’s a computer onboard every drone which is receiving signals from the operator in some fashion. At least currently, at some point, they’re likely going to be controlled by AI that’s running on device.
But for a variety of reasons, that is not where most deployed drones are in the world right now. They’re receiving signals, performing some calculations on those signals, and turning those calculations into instructions to physical hardware to rotate this rotor or explode this ordinance package, etc. One way you can deal with the drone is by swatting it out of the sky with a missile. Another way you can deal with the drone is by somehow interdicting on a physical level the communication signal between the operator and the drone and then letting gravity take its course.
The other way is if you can somehow get into that communications channel and send it things that it might like. It doesn’t necessarily even have to think that they’re authorized communications. Just reading a communication, if there’s the right kind of bug in a software package, can cause you to gain control of the device or at least some level of control over its operations. The amount of control over the operations of something flying through the air that you need to cause very negative things to happen to that thing is very small.
It is not just drones that have this sort of vulnerability. This is omnipresent in computer systems and in physical systems that are attached to computer systems, which are attached to networks, which is many, many, many things. The record of software security assessments from the best resource places in the industry that have deployed multiple teams of PhDs with budgets denominated in billions of dollars for a decade is that the defender almost always loses.
So, when you compare that against the level of security of, say, a factory that deals with various chemicals, which would be explosive if not controlled correctly… The potential for an external hacker to do bad things to that factory, the devices in it, the people who are working in it, and potentially people in the now very literal blast radius is very high. And so that’s kind of the broad strategic reason for why we want to skill up in this getting better against arbitrarily resourced attackers who might have non-economic motives for attacking infrastructure.
Granted, hackers broadly have had non-economic motives like getting cred, doing it for lulls, et cetera, for many, many years. But while there are extremely poorly adjusted people who’ve tried attacks on hospitals and trains for reasons like, “I can do it. Wouldn’t it be cool if I caused a train to derail by typing things into my computer?” the bigger work is that in a situation like a declarative or an undeclared war, someone could decide to systematically degrade infrastructure across an arbitrarily large area.
Maybe one could dial that number up or down based on how peeved one is at the moment or how much nonsense one thinks they can get away with. I’ll say one other thing about levels of nonsense. We’ve been talking about state-sponsored actors, but there are thin and fuzzy lines in many places of the world.
Speaking from a point of experience in the finance industry, it’s acknowledged that a lot of crime originates from geopolitical adversaries. Much of that crime is from people who, at one point, wore a uniform or were in the non-uniformed but extremely formal parts of the government dealing with state security, espionage, et cetera. They might not be wearing a uniform today, but they could be really close to ex-supervisors or ex-colleagues.
There might be a flow of data in one direction and requests and money in two directions. Maybe that is an instrument of government policy formally, or perhaps it is not. In some cases, it might be a level of corruption the government would stamp out if it were fully aware of it. Perhaps in some ways, it’s deniable, with our patriotic hackers as a reserve army of neural net specialists potentially committing crimes every day that ends in Y.
As long as they don’t violate their interests, maybe we turn a blind eye, as the glorious patriotic hacker army is a useful asset for battle space preparation in the event of a kinetic invasion. Astute listeners might understand that I’m saying things that are not hypothetical. I’ll link to some official publications discussing this dynamic in more detail because this is not just conspiracy theorizing; I’ve been flapping my gums a bit.
I think the acknowledgment of an ad read sounds cooler in Japanese. Kono banggumi ha sugi no sponsor no teikyo de o okurishimasu. Cool, right?
This podcast is brought to you by Mercury, the fintech used by over 200,000 companies to simplify their finances, including my company for the last six years. I still remember the first money I earned on the internet: $24.95 for a software license. It was the first step in my new career and the first proof that I had ever made something someone wanted. Almost 20 years later, my business runs on more complex systems: wires to angel investments, payments to the team, debit cards for expenses, and not least receiving revenue.
As many longtime readers know, I’ve gotten very good at banking out of necessity because I’ve hit every infelicity with large banks that one could imagine. Mercury offers banking that really understands startups and then gets out of the way. The app and website are well designed and lightning fast. If you want to upgrade how you bank, visit mercury.com.
Mercury is a financial technology company and not a bank. Banking services are provided through Choice Financial Group, Column National Association, and Evolve Bank and Trust, members FDIC.
So getting back to the reverse engineering, let’s say you are an arbitrarily skilled technologist. You start with a binary. In the traditional prior to LLM days, what’s the first thing one does? In reverse engineering, the first thing you might do is get some surface-level information. The spectrum ranges from surface level and, at the very end, it’s deep technical details and how they conform to trends. So the simple stuff, the surface level stuff is looking at strings. These could be any, usually like a URL or an IP address or log messages. Sometimes you try to get all the freebies, all the easy stuff.
Well, before that, you might even have trouble just what’s called de-obfuscating, which almost every spell type checker will tell you is spelled wrong, but it means to sort of obfuscate means to hide and de-obfuscate, of course, means to remove the hiding. It’s a very general term that refers to anything that malware authors or even commercial software developers will use to hide what they’re doing.
There are legitimate-ish reasons for the obfuscation of software code. For example, much software code, which is sold to end users and installed by them, could be pirated by people who successfully reverse-engineered the code and turned off the features that only work for people who have paid money for them. It used to happen to me a lot when I was selling downloadable software over the internet. Always a fun thing to happen as a small businessman.
Anyhow, so one of the first things you would do is to make it slightly more difficult at the margin for the adversary, who is, you know, a forum user, to create a patch to your code that strips out the protections for which they did not pay money. There are a number of commercial solutions that will make your code harder to read when it is decompiled. Maybe I’ll describe in just a little bit of detail for people what obfuscation might do.
By default, if you compile something, there will be strings, which is just a sequence of letters and numbers, visible in your compiled artifact, like your binary. That might be like a web address or the name of an API that you are calling or something similar. There might even be hints to what the code is doing that don’t execute but are left in the binary for whatever reason.
If you have a function named like “copy protection,” a subroutine or something, then the bad guy knows exactly where to go to strip out the protection. There’s some block of assembly, and it’s doing something here. It might be reading a CD key or similar. If I just replace that function with return true, does that successfully strip out the copy protection? It indeed did for my software at one point. That was the first crack applied to it and just one of many obfuscation tricks that people or the software developers that write obfuscation software can do.
If there are legible strings in your binary by default, let’s reconstruct those legible strings on the fly using code, which is very difficult to parse out what it is doing. Simply typing a simple Linux command is not going to give an attacker a list of all the APIs something is calling or all the function names. There are arbitrarily complicated techniques built on top of that in other ways.
To use an analogy, if you’re going to buy a house, the first thing you would do isn’t, you know, go into the kitchen, look under the sink, and make sure there’s no mold damage. You would say, how many rooms are there? How many bathrooms are there? What’s the square footage? What’s the location? Is it near a bus stop? That sort of thing.
You pull out the surface level information and hope to qualify things. Usually, you’re trying to answer a question: is this thing malicious or not? Sometimes, just judging from the obfuscation that’s used, there are certain obfuscators that are really only used by malware, and VMProtect is one of them. There are a couple of others, and almost no commercial software uses it.
You’ll find if you just take commercial software, like a hello world program and you obfuscate it with one of these packers and upload it to a common website called VirusTotal, which is pretty cool. Anyone can use it; it’s free. You could just upload a file there, and it will scan it with all the antivirus programs that are available. It will probably detect it as malware just depending on the obfuscator.
Once you get that surface level information, you start to dig deeper, and there’s a decision tree there that branches off a lot, depending on what you’re trying to answer. Some of the game is determining not just what the malware does but who might have sent this because there’s some level of concern. If it is a teenager in their bedroom, there is a different level of concern compared to if it is a professional gang of hackers based in a geopolitical adversary. There’s an even different and higher level of concern if it’s, you know, uniformed army members in a building somewhere. One of the things that has happened before is that malware and sort of offensive hacking generally imply infrastructure behind them, which is not obvious to many people, both the computational infrastructure and also a social, legal, technical infrastructure. In the same way that functioning and software of any sort implies a complex infrastructure behind it, no software just pops out of the ground because it was sunny that day.
Given that you can fingerprint the infrastructure that software is touching, certain fingerprints point in the direction of publicly available things from Google, et cetera. On Bayesian evidence, most of the things that they use are not particularly bad because people don’t like poking the bear that much. Although it is certainly the case that a lot of malware does use publicly available affordances to do bad things, there are privately developed software that might use private resources and certain forms of communication, and using .com domains, et cetera.
There are many signals that one could say indicate where you would say, okay, this is probably on the up and up. If you are trying to obfuscate what server you’re talking to and bouncing through six layers of proxies and going to botnets to use command and control networks, your software is probably up to something fishy. Which particular command and control networks you are talking about can let you know with some degree of certainty who is on the other end of the chain.
Lee’s been publicly reported that once a nation-state actor, which is quite technically sophisticated and from the perspective of many people listening to this, one of the good guys, compromised a great number of people because they had their own infrastructure for passing messages around. There were some commonalities to it, and there were adversaries that figured out the commonalities and then reverse-engineered from those commonalities, saying, okay, give us a list of all the people that were talking to the United States via something that you only use if you are an asset being run by one of the intelligence agencies. Presumably, unpleasant things followed very quickly after that.
You bring up some good points. First, little aside, whenever you’re talking to government folks, whenever they say it’s been publicly reported, what they mean is I knew, but I wasn’t allowed to say until it was publicly reported. So that’s something I’ve learned recently. Your point about there being professionals who work for government and then there are people who used to be professionals who work for government and still do the same thing, and then there’s cybercrime groups, is right. There’s a whole spectrum of professionalism.
Also, ransomware as a service is a term. There are people who make ransomware kits, and when you buy it from them, they tweak it a little bit and customize it so that it’s not detected on VirusTotal by antivirus. Then you get your copy and bring a bunch of people that can be infected, and they bring you the software. Their specialization, in the same way you see an industry, has an entire supply chain in financial fraud, which I’ve written about previously. I’ll drop a link to it in the show notes.
When it comes to malware, you’re really going along the lines of the domain expertise you need to evaluate if a particular detail is interesting or not. If you’re just a tabula rasa blank slate looking at code, it’s hard to know the significance of anything. However, if you know malware authors tend to have very bad software engineering practices, then you’re looking at bad software, which could be up to something. This looks like it was very poorly made.
I’ll always remember this because there’s a term called APT, Advanced Persistent Threat. This was a major buzzword; I loved hearing it. There are songs called APT, and everyone wanted to find them, have them, and talk about them because it meant people would buy more security products. But they did exist. There was one case where there was a quote-unquote APT discovered targeting Tibetan activists. There was some beef between Tibet and China at the time, and this malware was supposedly targeting free Tibet activists and would report on their location. Everybody ran with the story; this is APT, this is like China made this, this is nation-state attacks. Everybody has to care about this stuff now.
Where I was working at the time, we looked at it and said, this is not well made. There are giant chunks of the application that aren’t active, and the software has bugs in it. So, yeah, it’s supposed to spy on your location, but it breaks when this happens or this happens. What we think is this is made by sympathizers. So these are going to be people who maybe they worked in government or maybe they’re just ultra patriotic, but we have more respect for the Chinese military than this. We think this was made by someone else.
And then I remember that because you want that domain expertise. I’ve seen things like this before. Here’s where I evaluate it. Here’s where I put it in the hierarchy of skill. Sometimes things that might look benign or they might look super dangerous in one context change completely if you’ve been looking at it a lot. That’s part of the difficulty of using AI to automate reverse engineering; you have to crystallize all of this domain expertise into prompts and into systems and into other models, into decision trees. It’s like you forget how much you’ve learned when you start building these things.
Interestingly, I think there’s something of a reverse normalized curve here with regards to level of experience and effectiveness in reverse engineering. One of the things that I credit my buddy and erstwhile co-founder Thomas Tachek for telling me is that in his experience, high school students who are definitionally at the early stages of their computer career are anomalously good at reverse engineering because they haven’t learned that it’s supposed to be hard yet. I remember myself as a high school student; I would happily open up a hex editor and look at compiled binaries because why not? It’s not like I had other fun things to do with my time.
There is just an acceptance of a level of manual punishment at that point in the career. Then you go to university, get your formal CS education, write software for a few years, and experience the efficacy of writing software. The notion of banging your head into a wall for 12 hours just to figure out what one function is doing seems like not fun. You kind of lose that useful curiosity that makes high schoolers so effective at this.
Then you go into the software security field; you reverse for a number of years, and you start to, for lack of a better term, see the matrix a little bit. When you’re looking at compiled binaries, you get better at this tacit knowledge of who are the bad guys, what are their signatures, what are the common techniques people use. Is this new thing that I’m seeing today new? New implies things, but is it just like, oh, this is an individual’s twist on a technique that the industry has known for a very long time?
It’s also underappreciated that many of the things in software security that are staples of the art now, like SQL injections and memory corruption, had a single person you can point at at a particular moment that originated them. A lot of work was done on their behalf or rather done on top of that substrate they created. So, you know, we’re dating ourselves with these references, but things like the Morris worm; Morris was a guy that brought down nearly the entire internet. Then he went on to do other things in the tech industry that were not about bringing down the internet. So fun.
Anyhow, LLMs seem to be pretty naturally good with some parts of interpreting and expanding upon source code. This is the thing that I’ve experienced in my own hobbyist use of LLMs to help with routine programming tasks. Some of the smartest technologists I’ve ever met tell me that the experience of using them for code generation has been transformative to how they work.
How does an LLM help you with regards to doing reverse engineering? When you start talking about high schoolers being good at reverse engineering and you get worse at it the more you’re used to programming, that really is the explorer versus exploit spectrum. Early on, when you’re younger in general, you tend to be more explorer-oriented. Then once you find success in something, you keep exploiting that area—not in a bad way, but just in a general term.
LLMs don’t get tired, and you could just kind of, it’s like cognition is an API call away. We’re in the realm of cognitive hyperabundance. You start looking for all the problems that require tireless cognition, and reverse engineering is one of those areas. As you pointed out, it’s very tedious, and people don’t like to do it, but I think it applies to everything. One of my engineers was showing his friends how he uses AI to code, and he was telling it, “Hey, I want you to go into my code base, hundreds of files. And I want you to make this subtle nuanced change that affects 10 different files.” They were watching as it churned through all the data and processed everything and iterated. It would say, “Ah, this isn’t quite right. This isn’t wrong, or this is wrong. Let me try it again.” They were like, “Wow, your company lets you use this.” And he said, “No, they make me use this.”
It really is. We’ve turned away candidates who were like, “This isn’t the only reason.” Usually, the candidates that were using AI tools for code generation allowed us to focus on the more important stuff. It’s not the algorithm that I care about. It’s your craftsmanship. It’s your judgment. What LLMs let you do is they remove the individual skill cap for knowledge, technical detail, algorithmic knowledge. I don’t care if you can saddle a horse; I care if you can drive a car. I don’t care if you can use an abacus; I care if you can take these numbers and multiply them. I don’t care if you know how to reverse a red-black tree or write merge sort from scratch. I want to know, can you download data from this API and make this customer value thing happen?
What AI does is it automates all of the low and sort of medium to low-level tasks. Specifically, in reverse engineering, that’s usually reasoning over the code, the representation of the code that we built. There was a recent example of this where Simon Wilkinson, who’s a very experienced programmer and a buddy of mine, worked on Vaccinate CA together. He has a blog where he has been writing recently a lot about his explorations with AI. He found a security bug in the Linux kernel manually. Security bugs in the Linux kernel are not quite a dime a dozen, but they’re enormously consequential depending on what sort of bug it is.
So it was a service to the world that Simon identified this and helped get it fixed. Then he said, “Hmm, I wonder if AI would have found that bug.” He ran some trials, and I’ll link to his description of this, and I might verbally recount some of the details. I apologize in advance. The stat I remember is AI successfully pinpoints the bug in 8% of trials. If you are a software security researcher and you successfully identify 8% of the bugs, you’re not that good of a software security researcher. Employing you is probably going to be a net loss for the person who is reviewing your output.
However, if you can review output in a for loop, 8% of the time is wonderful because, statistics—if we get uncorrelated bites at the apple, just run a sufficiently large number of trials—and presumably, knock on wood, you identify a bug 8% of the time when there’s actually a bug, 2% of the time when there’s not actually a bug. Math, math, math. Get a smarter AI or a smarter human to review things that are at the top of this distribution rather than at the bottom of this distribution.
Even if you just use it as a tasking mechanism for the scarce bits of cognitive labor or as an idea generation mechanism for orienting yourself around the new code base, what are likely the areas to focus on? Let’s have you focus on the place where the AI gives the “I’m getting the heebie-jeebies here” versus, “Okay, here’s a hundred thousand lines of code in a code base.” The first thing you need to do is just read a hundred thousand lines of code to start, understanding what it does.
I think software security professionals will tell you that when they’re given an assessment with the instruction, “Here’s a hundred thousand lines of code, write an assessment of it,” they don’t actually read a hundred thousand lines of code. They know based on experience, “Okay, I know the features where the good guys usually screw up.” So, they preferentially locate those features in the code base and then start finding the quote-unquote findings that the customer eventually pays me for.
But if that says maybe a software security professional should spend 50% of their time in an assessment on the same 10 places as always, because that is where. So if you’re going to spend 50% of my time on the login screen, then that implies less of your time in the assessment for random function 367. But maybe you don’t want a zero attention to random function 367. If it’s doing a privately written cryptography algorithm, which is notoriously a place where people screw things up, but equally notoriously, you won’t know unless you actually read it, devote a non-zero level of cognition because cognition is no longer that scarce in the world.
That this reminds me about the catching things at 8% being very bad individually, but at scale, it’s really good. There was an old IBM commercial where I love this guy was running and screaming through all these office hallways saying, “I just saved a nickel. I just saved five cents. I just saved a nickel.” And you’re like, “I just saved a nickel on every shipment.” And everyone’s just kind of like, “Who cares? It’s a nickel.”
Then he runs by the executive’s office. One executive goes to the other one, “Oh, we have 1 billion shipments a day. Ah, yes, I see.” So like what you were saying about 8%, if you could, let’s say, just validate that a bug is real or not 8% of the time. One of my friends was telling me there’s something like 20,000-30,000 plus static analysis checks, like checker bugs found.
So there’s tools that will try to find bugs. They’re very noisy. They make a lot of false positives. But if you can use AI at that scale and get 8% of them accurate, then you just unlocked a massive number of bugs. Likewise, there are the technical term I think is gajillions of binaries flowing around the internet and on different systems. Every time you install something and every new update, every program you download, it’s a binary.
And if you could be right 1% of the time and a million times faster than a person, there are definitely places where that will be adopted. Like the technology is adopted where friction is highest and friction is highest in large enterprises that are very security sensitive. They’ll start there and then that 8% turns into 30% turns into 80%, you know, over time as AI gets better and people are better at building harnesses around it.
Yeah, it’s a weird thing to say in adversarial games, but with respect to just competence generally in AI systems, the LLMs you are using today are the worst LLMs you’ll have experienced in the rest of your lives. Like it is extremely unlikely that we forget how to build better systems than this. Now, granted, where there’s a cat and mouse game between you and the adversary, 8% might not be a fixed number over the course of the intervening years, but they’re only going to get better.
I think that is an intuition that a lot of people in the policy, defense, etc. space don’t necessarily have, because maybe they looked at GPT-2, which produced English and the sentences looked kind of coherent on a sentence level. Then the paragraphs were just gibberish when you step back to think about them. And that was, you know, in the mists of what was it, 2022.
There are some people who haven’t looked back again to say like, “Oh yeah, LLMs, the infinite gibberish machine.” They’re a bit better in 2025 than that. And they will be a bit better in 2027. And, you know, the great majority of future worlds, the prioritization, like as you mentioned, much of industry runs on very noisy heuristic alerting systems.
Interestingly, it’s useful to understand that when an alert fires at the majority of companies, whether it’s a cybersecurity system firing an alert or if you’re in an anti-money laundering system and a heuristically based system or a machine learning based system flags a transaction as anomalous, what usually happens is it goes into one of several queues for a human operator to triage.
That’s an intelligent person who has specialized their entire life to get into the seat that they’re currently sitting in. Given noisy alerting systems, what they’re doing a lot of their day is nope, nope, nope, nope, nope, nope, nope, nope. Just so that they’re sitting in the right seat at 3:05 PM on Tuesday. It’s like, “Yes, I need to start writing a memo about the consequences of this.”
We care about that person’s time and attention. We also care about not lulling them into a false sense of security that given that I dismiss 10,000 alerts for everyone that is really meaningful to my company. You don’t want to miss that one just from alert fatigue. If you could just automatically triage 2,000 of the alerts off the queue or move them into a different queue based on the level of urgency.
If you, as a Fortune 500 company, have positive knowledge that there were foreign intelligence services that people in your systems right at the moment, that’s a five alarm fire immediately. Another thing that will raise an alert on your systems and which is important at different sense of the word important is a junior employee has just installed Starcraft II on a company laptop. They shouldn’t do that one, because you shouldn’t be playing Starcraft II at work. The bigger reason that software security practices will say is Starcraft II has this enormous surface in it. Given that that’s installed on your laptop, your laptop is much less secure than a laptop without Starcraft II installed in it. And given that we gain no business benefit from you having Starcraft II on that laptop, just please play on your own machine on your own time, not connected to all the money in the enterprise.
If an alert happens and someone has installed Starcraft II, it eventually becomes a human’s problem. The human is going to do a pretty predictable set of things, which will often involve talking to that employee, saying, “It seems like you installed Starcraft II on your laptop. Don’t do that. This is a warning. The next time it will be potentially more consequential than a warning.” But that’s a conversation where the level of urgency is bounded as a corporation. If you became aware of Starcraft II at three o’clock in the morning, you wouldn’t wake up a senior member of the security team to have that Starcraft II conversation at three o’clock in the morning; that can wait for business hours. It’s very probably not a beachhead in an attack from a state-sponsored adversary, just playing on the base rates.
On the other hand, there are other things at three o’clock in the morning where the first person to detect it immediately starts what we evocatively in the industry call a war room. They bring in the team, and sleep schedules are getting disturbed. What are we going to do? Sorry, I’m monologuing a little bit, but dealing with alert fatigue is really real. If we could just take 20% off the size of these queues and route them better, that would be a wondrous, wondrous thing. These are going to get better over time.
After we’re in the current state where we’re using them for early detection, routing, triage, etc., what do you think the next evolution of this paradigm shift looks like for reverse engineering? I think you can model the future based on the past a little bit, except when there are massive technological shifts that change the whole species, kind of like we’re going through now. So if you start from when that started, you can make some good predictions.
Also, I don’t think that was much of a monologue. I think that’s very useful background knowledge for people to have. Any company has 20 security products generating 10,000 alerts a day. The name of the game is 100% knowing what alerts are important and what alerts are not. What do you wake up everybody at three o’clock in the morning for? And what do you just hit the snooze button for? What we’ve been seeing is that in previous roles, we would find vulnerabilities. We did really complicated, sophisticated stuff I was very proud to have worked on.
We could look at your code and know what libraries you were using, what other people’s code you were using, and what code they were using indefinitely. We could find if it was vulnerable or not. One of the customers we talked to, we thought it was a dead giveaway, great use case, and they said, “Yeah, but am I really using the function that’s bad or is it just the library that’s bad? Because it’ll cost me a million dollars in time and labor to update this dependency and to use the latest version.”
So, we have to be more specific. We have to sort of pre-triage these things for the user. I think every security product is going through that phase where step one is to generate 10,000 alerts a day. Good job, pat yourself on the back. Hardly anybody wants that; a couple of people do. But most people want you to then triage it somehow.
Right now, it’s somebody that says no, no, no, no, over and over again. Then you hope they don’t get some sort of hypnosis where they just keep hitting no. What we found just in the last couple of years is that we were building a system that could look at code, the sort of decompiled code from a binary, and tell us if it had a vulnerability in it or not. We would know because we would add vulnerabilities to the code or we knew the code was vulnerable because we got it from a public database of vulnerabilities. They would say, “Hey, this version of this program is vulnerable and everybody needs to update.” It’s called the CVE database. We’d go get a copy of that code. We’d go to that function and we would check to see if the model would know that there was a problem there.
And what we had to do at first was pretty convoluted. When you’re dealing with AI, they tend to double down on whatever position they just happen to start with. The way you ask the question can change it a lot too. So if you ask it if it’s vulnerable, it might say, “Certainly, here’s why it’s vulnerable.” And then it just hallucinates a completely made-up reason why it’s vulnerable. A thing which has never happened in any security engineering industry, by the way.
Not to throw humanity under the bus. I’m a proud member of the species myself, but it’s useful for keeping that one in context around the word hallucination. I read one this morning that was like a high schooler turning in his paper, and the beginning of the paper said, “Certainly, I’ll write a 200-word essay that sounds like a high schooler wrote it.” It’s like, that’s definitely where we are now.
But what we did was like the height of super advanced at the time, all of like a year ago. We would have the model make an argument about why something was vulnerable or not. Then we had it make an argument why it wasn’t vulnerable. We had another LLM call that would evaluate them. So we had the advocate and judge model. What this was, was basically a hacky form of what’s called test time compute now, where they found in, there was a study with a math test.
If you threw the modern ChatGPT at the time, which was probably like four or four T, it would get it right 40% of the time, which is really good. It was a really hard math test. They had it generate a thousand answers for each question and then had the model evaluate which answer was the best. It would go up to like 80%, 85% accuracy or higher. Of course, this was like a thousand times more expensive, but they were spending more time on test time computing more tokens.
So we did something like that, and our performance went up quite a bit. It got much better because rather than doubling down, it’s sort of like when you stream of conscious talk. If you’re doing a podcast and you go down a weird corner and you kind of forget what you’re talking about, what the brain does is it’s constantly evaluating what it’s about to say. It has a chance to kill any bad ideas and restructure the output. LLMs don’t have that.
Then they came out with reasoning models. So we had O1, we had DeepSeq R1. Now there’s O3. There’s a bunch of reasoning models, and they basically supplanted the need for any of that at all, like our whole advocate-judge system. We dropped it on a dime. If you’re a big company building on AI, you can’t say, “Oh, we have an advocate-judge LLM AI team,” and hire a bunch of people to build that sort of specialized system. Now their entire need to exist is gone.
The field is moving so fast that I think what you’ll see is reasoning models getting better. You spend more money on compute, getting more accurate answers. You’re going to see reasoning models that have access to knowledge. Sometimes they’re called agentic systems or agentic RAG retrieval-augmented generation. We can talk about what that means, but you’re going to have things that can reason. They can reason about how they’re reasoning, so they’re agentic. They can decide what tools to use or what to dig into more, how to investigate it more.
They’ll have access to some sort of knowledge store that they can update and pull from. It’s sort of like we have a hippocampus; we have all these different specialized circuits in the brain. You have a circuit in the brain for recognizing things that look like snakes, and you will actually detect a snake before you’re consciously aware of it. You’ll freak out from seeing like a hose on the ground.
I think we’re going to be creating a lot of what happens in the brain in these systems. Like you said, right now it’s the worst it’s ever going to be. Some of the paradigm shift here that has not been fully digested either at AI-consuming companies or at the customers of AI-consuming companies is that we are rapidly figuring out modalities for how this sort of stuff can work.
One of the classic things you would like to know if you find a vulnerability in code is, one, what do I do to avoid this in the future? But two, if this is a signature for a problem, it is very likely that the same people who introduced this problem here might have introduced it in other places or similarly educated, similarly socialized, et cetera. Engineers might have introduced something like this. Can you find all the other places right now?
One of the things that we have done for a very long time in industry is use things, or particular technologies for this one is called a linter. And you might write a rule that says heuristically code that is shaped like this is highly likely to be bad. So please flag that in all the places.
As you are asking a reasoning model to do something on a code base, you could ask it to speculate for me based on a hundred thousand lines of code. Other places that are likely to be vulnerable and maybe it gets that right. Maybe it doesn’t. A different variety of asking for that is okay. Write me the linter rule that would find all the places that this will show up in the code base.
I will bet with some level of like, I would put money on this, that writing the linter rule is at many margins more accurate than speculating for me all the other places where this will show up. It is extremely cheap to evaluate literals against all the places.
Again, we are playing a stats game and there is newly a slider where we can just throw more compute cycles at problems. So write me a thousand linter rules and come up with a histogram of how many places in the code hit which of them. Then consider one at a time and tell me what you think about it, gradually moving more of this sort of extremely detailed, intense, often demotivating work to machines to protect people from doing the more creative, fun, high-status work of architecting the system that spits out 10,000 linter rules a day.
I think that’s often underappreciated that different things are awarded different ways in different organizations. Keeping builds from breaking, writing linter rules, and moving from a particular finding from a security professional to find all the other findings are low status and not rewarded in organizations. In the status hierarchy and what you get rewarded for in software security, being the person with a big high-impact finding is wonderful, producing the 300th variation on that finding is a great job if you’re two years into your career.
Given that it might be hard to do that, we devote vastly disproportionately little effort to doing that versus finding the next big impactful finding that’ll have a name made after it. We should probably rationalize our internal incentive systems and status hierarchies to where business value actually is. If the 300th replication of something actually matters in the physical universe, maybe we should act as if it matters.
Changing human systems is hard, and changing software prompts is really, really easy. It turns out so let’s just throw the extremely abundant compute that doesn’t care about being high status or low status at the 300th replication.
The tail end of that, a sort of maybe a corollary, is I can have conversations with Claude or ChatGPT that I would not have with someone else because they would think less of me. I ask very dumb questions, and I get to know the answer.
What you were saying about linters is a pretty good start to how AI changes things where there are three levels of meaning. There’s syntax, which is the shape of things that you’re looking for. It’s the sentence, the punctuation, the letters, the spacing, the first letter of a sentence as a capital, and how to pronounce things. That’s all syntax level.
Then there’s semantics, which is a step above. That’s the meaning, the actual behavior of the sentence. At the top is pragmatics. For syntax, if you have a sentence like “close the window,” it’s capital C, L-O-S-E. It’s the letters, how it’s spelled, and the period at the end. Then the semantics is go over to that window, use your arms, and close it. The pragmatics is it’s cold in the room, or you’re letting the air conditioning out if you live in Texas.
What the linters have been able to do, and current tools have been able to do, or previous tools, have been able to work at the syntax level. With lots and lots and lots of effort, you can work in the semantics level a bit. AI shifts everything up. AI just very effortlessly does syntax. It effortlessly does semantics. Then pragmatics is something that it’s able to get to.
When you’re talking about users not appreciating things and how we should revalue our judgment systems, part of that is just storytelling. If you are finding the pattern of something and it’s not appreciated, like you’re creating this linter rule, then at a certain point you need to storytell. This is what I’m noticing now when you start a company and you’re talking to people about this problem. You tell them, “Hey, we’ve automated reverse engineering,” or “we’re solving this problem.” And they’re like, “Cool.” You’re like, “Hey, this is how wars are fought nowadays.” And they’re like, “Oh, wow.” It’s sort of like if you’re a standup comedian; you have to practice over and over again, and you start to notice what’s funny. When you pitch to people repeatedly, you realize what resonates and what doesn’t, and you end up discovering a lot of things.
So, if you’re doing something you think is useful and your boss doesn’t appreciate it, practice telling him why you think it’s important, or maybe you shouldn’t be doing it. A little bit of behind-the-scenes knowledge from a communications professional with regards to startup founders: I think many of us have an appreciation for novelty and a desire to produce creative outputs in the world. As a startup founder, you will continually be trying some new material, but much like a standup comedian, the typical recitation of, “Why does this company exist? What’s the value proposition?” will have been workshopped to death, and you are going to reproduce it word-for-word identical.
One of the hidden special attributes of arbitrarily successful startup founders is that you can make the 600th word-for-word verbatim recitation of the 347th iteration on the pitch sound like it is the first time you are delivering it, hitting the right level of emphasis, the emotional beats, and sounding really invested. It’s just like, “This is just another Thursday for me.” And what am I going to do Friday? I’m going to do it again for the people who didn’t hear it the 600th time.
Oh boy, this is great. Thank you. Part of the magic act of being a startup founder and a standup comedian is that sufficiently aware audiences understand that that is what you’re doing, and yet they can be convinced to forget about that fact for 60 minutes, or for the length of a podcast, sales pitch, or job interview. The rest of their career, incredibly, there are some people who get the pitch at the job interview asking, “Why should I spend the next couple of years focusing on this?” and then they focus on it for years. Somewhere around year four, they come up to someone and say, “Have you noticed that the boss always says this particular sentence?”
You’re catching on to that in year four? Cool. All right. Maybe the LLM would have caught it a little bit quicker. Sorry, random humor from a comms professional.
We’ve talked about this. Maybe “revolutionizing” is not exactly the right term, but let’s say it’s on a trajectory to revolutionizing reverse engineering. This is going to hit a lot of places in software security and writing software generally. It is already infamously hitting writing software in a lot of places and in allied fields. Certainly, things that are shaped like software security for people who aren’t in software security directly are affected because they consume other outputs.
It’s in the economy, and every sufficiently advanced output has a software security team working on it at the moment. What are other intellectual tasks where the surplus capacity of compute that is willing to do grungy work at 3 a.m. probably matters? Well, I could start with a quick example. Everyone’s kind of afraid of this, but medicine. Yep.
One of our acquaintances had a daughter who was sick, and she was acting up. The police had to be called; she had a behavioral issue. They thought, “Well, she’s redheaded, and her dad is precocious.” I’m like, “Okay, that’s not quite enough of a justification.” She also had strep throat all the time; she kept getting infected over and over again. After years, one of the doctors went, “Ah, there’s a thing called PANDAS.” It’s not the animal, but it’s an acronym that stands for Pediatric Autoimmune Neuropsychiatric Disorders Associated with Streptococcal Infections. Basically, the strep gets into your brain and causes brain damage.
She had that. I described her symptoms without any clues or hints. “Hey, this person’s getting strep all the time and is acting up,” yada yada. ChatGPT went, “Oh yeah, it could be PANDAS. You might want to check for that.” So, if you had had a conversation with ChatGPT, you might have avoided this young girl having brain damage. You wouldn’t inject yourself probably with something based on ChatGPT. You definitely want to explore things with ChatGPT and then have a human expert confirm. If you go deal with a vet or you deal with a doctor, you want to educate yourself too, because often you’re dealing with generalists. So if you go in with a specialized problem, you can investigate it with ChatGPT. And it’s the same with law.
So the way we use lawyers now at our company is we’ll spend four hours talking with ChatGPT. And then we’ll spend one or two hours with a lawyer reviewing all of our work. Whereas before we might’ve been billed for 10 hours of law work. Now we’re being billed for two.
And then you have radiology, analyzing images, design. If you’re an artist, I don’t know. When I was growing up, kids learned to draw and that was a cool skill, and people would appreciate you for it. And now every kid has access to, “Hey, make me a photorealistic or Studio Ghibli version of whatever.” And they just get it.
So the ability to draw, there’s no more on-ramps for a lot of these things. If you’re a tier one analyst, yeah. I am slightly optimistic with regards to art as a budding artist myself. I do painting and miniatures in my spare time as my one non-screen hobby. Despite it being my one non-screen hobby, a thing that I’ve discovered recently is that I have some level of technical skill, and as an artist, it’s higher than it was two years ago. It is much less high than the people that are shooting the wonderful photos on Instagram of the work that they’ve painted.
You can just take a photo of a model that is halfway done and say things like, “I don’t like the way this blue is contrasting on these dragon scales. What would you do to punch it up a little bit? Could you create a photorealistic reference of what that will actually do in the model for me so that I can use my eyes to tell if I’m getting closer to that plan?”
Then you can even, you know, a thing I do some of the time is take photos from a few angles of it after I’ve made some progress and say, “Okay, be an art critic here. Help me judge against my execution on the plan that we sketched out earlier.” In my most recent painting of a blue dragon yesterday, the tool made one thing up. It said, “Oh, you know, these plates are monochromatic. You should do three things there.” And I’m like, “Dummy, I’ve already done those three things. You’re not seeing it right.”
But it detects other things correctly, and it does this for free at the margin at whatever time I want to do it versus getting a— I don’t even know if as an employed professional I would enroll in a community college course to have a generalist artist professor explain perspective and color values to me at 4 PM in the afternoon about a dragon that I’m working on for stress relief.
So I’m bullish about it for art, but I do think that there is a general sort of systems-level societal worry there where this detail-heavy scut work has often been used as an apprenticeship and on-ramp into the higher levels of professions. You suffer the abuse in your first couple of years of being an investment banker, being an associate lawyer, or being a software security assessor or junior programmer. You grunt it out through the code, learn some base of knowledge there, and then your time is more valuable, and you have people coming up behind you.
But you can say, “Okay, do a document review. Here’s 2000 pages, find me the most important sentences.” Given that they are getting scary good at finding the most important sentences in 2000 pages of documents, would you choose to have junior lawyers at your company billing at $250 an hour? Because that’s what a lawyer makes in the first year out of school. Potentially, I don’t know what big law is charging these days, but you know, indicative numbers.
Would you choose to have that level of person doing the work, and if you wouldn’t choose to have that level of person doing the work anymore, what does the roadmap into the higher levels of the profession look like? Those are unsolved problems in the adoption of this sort of thing.
When I was in high school, we had a calculators class where you learned how to use a calculator. It was great because in the past they would have focused on, you know, can you do lots of mental arithmetic, but the world didn’t need that anymore. It needed people who could use TI-83 calculators in this case.
I think we’re going to have a much bigger shift where, you know, what you’re talking about asking an AI to render models for you and help you understand your judgment. This is a good output. This is a bad output. You’re not really understanding it. I think we’re going to have to focus on training people to be tool users. When I had to learn, I took four years of calculus; I loved having done it. I didn’t like it when I was doing it because it was a lot of work.
And I think you need to do the work to do the more advanced math, but something changed in my brain where it was so much easier because I was talking to a teacher after hours. I was having trouble with the class. And I said, “What’s the point? Is this just so that we’re smarter about problems?” And he’s like, “Yes, exactly. This is just putting more tools in your toolbox.”
I think if you change the education from “Here are 50 math problems, just grind through them until you have muscle memory on how to solve, like, Optal’s rule or the chain rule or whatever for whatever calculus problem you’re doing,” it becomes, “For this type of optimization problem, I use this approach. For this type of problem, I use this tool.” It’s more of, “Here’s what the tools are, and here’s when to use them, and here’s where it’s appropriate.”
I think that if AI continues to develop at this pace, that will be a good intermediary phase before, eventually, like 10, 15, 20 years from now. The real problem is how do we find meaning in our lives when a lot of stuff gets automated? That’s what I’m worried about now for my daughter, who’s three. It’s like, how is she going to find meaning? Is she going to date AI? Is she going to date people or both? I don’t know.
But for the next 10 years, learning about tools is probably the way to go. I’m extremely bullish on AI in education; that might not be quite the right word, but skill acquisition generally. I think there’s an anti-pattern where people think, “Why bother learning anything when the infinite answer box gives you infinite answers?” But if you have a will and a way about using these, Bloom’s two sigma problem is that people just perform outstandingly better if they’re given individualized tutoring versus traditional classroom instruction.
We delivered traditional classroom instruction, one, because it’s incentive compatible for many actors in the ecosystem, but two, because it’s the way we’ve always done things. And then three, a small contribution is that it scales very well. You can deliver traditional classroom instruction to 30 kids for the same cost as delivering it to three. That is not the case with individualized tutoring until cognition is no longer scarce.
If we could successfully bootstrap, not just children, but I hope to continue learning things until I die. The modal industry professional in software security or similar, who might be 27, 32, or 45 years old, is still learning new things. They have a toolbox that is not coextensive with everything that has ever been learned in the industry. If something becomes relevant to them, how do you quickly say, “Okay, you know, continuing your apprenticeship here, there’s an important research result from 2017, which you weren’t familiar with yet. Here is that research result generalized from it in the future?”
That is a conversation that traditionally has been delivered in an apprenticeship fashion where the juniors ask the intermediates, the intermediates ask the senior folks, and then some fraction of the senior folks create new research results to percolate down the chain over time. With some variations on that sketch of reality, should the first line of defense be, if you don’t understand anything, ask the AI, ask what you’re missing about it, and then feel free to bring up additional questions with your staff engineer or similar?
Staff engineers, for people not in the tech industry, are the wizened individuals who have suffered a little bit so that you do not have to, or you have not suffered to that degree yet. I think that is one extremely valuable modality of using AIs. There are people, and because of the realities in the industry, there’ll often be young professionals who are earning a fairly substantial salary, but they’re early in their careers, and their judgment, like many of us early in our careers, is imperfect.
As compared to today, where we always make the right decisions every time, but that’s not the case for me—an early career professional up at 3:00 AM in the morning. They get an alert and face a decision point: do I need to wake someone up?
Very many times in human history, when that conversation has happened, people think there’s a social cost to waking up their superiors to ring the bell. They may not want to pay that social cost. Maybe they’ll just keep looking at it for a while. Maybe they don’t quite understand what’s happening right now, but it doesn’t seem urgent enough to wake someone up at the moment. Like your first point of port of call should always be to ask the LLM. The LLM might say, “What is the consequence of this anomaly we’re detecting on a system?” Well, it looks like that could cause an out of memory issue in the caching layer. What would be the consequences? If I just turn the caching layer off and on again to try to resolve that without waking anyone up.
Okay. So there’s this issue called the thundering herd that could happen. I’ll explain to you what a thundering herd is, and then the operator might understand. “Ah, okay. Glad I didn’t do that.” I’ve learned one useful thing about the world, and I didn’t have to wake up a senior engineer to get the explanation of what a thundering herd is.
You can even ask questions like, “Would you wake somebody up right now?” and maybe get better than a random answer with respect to that question. Again, they’re getting better on those judgment calls all the time. Yeah. It’s all about the harness you build around the AI model.
At the heart of it, you have this thing that you can send text to, and it can reply, but it’s sort of like having someone’s brainstem in a jar where it keeps the heart beating and the lungs moving. But when it comes to introspection, metacognition, or memory, those things have to be tacked on later. It takes the industry a couple of years for people to change their majors and for companies to fund teams to build stuff.
We just recently had, in November of last year, the MCP model context protocol as a standard for how models can interface with tools. Google has agent-to-agent, which is how agents can talk to each other. We’re so early; the standards are just being formulated.
As we go, you’re going to have, “Should I wake someone up?” Well, let me look to see all the other times that people have been woken up before and the severity of that. Let me explore this in a way that would have taken you three days to explore manually. I’m going to do it in five minutes with specialized systems that can remember things.
To your point earlier about getting more education for people, right now, the cost of a teacher is at some level. If you were to bring down the cost of teaching sufficiently low, you could have individualized instruction. The way you generalize that is as the cost of something goes down, more use cases open up. Reverse engineering is one of those things where you’ve probably never heard about it. It’s kind of a niche field, and the cost has been so high that it has precluded a lot of use cases.
Now, the cost of it is going down to zero, not just for us but because of Delphos and because of this technology, the same with programming, the same with instructions. The cost of all these cognition things is going to zero, and it’s very hard to predict what happens next.
Simply go from this binary to the source code behind it and then tell me what that source code does. For a relatively simple binary, that’s a finger to the wind, like a $5,000 to $25,000 proposition. You can justify it if you’re a Fortune 100 company for any unidentified binary that we find in our systems. Sure, we’ll pay the tax a hundred times a year.
But if it’s not a $25,000 proposition, if it’s a fraction of a penny proposition, maybe you should do it on every binary that crosses an email system under any circumstances whatsoever. At least, you know the first time you see it.
That changes the game a little bit because the number of binaries that exist in the world produced by the good guys should be relatively low since there’s a huge amount of human effort involved in each of them. The bad guys trying to produce new binaries to evade the system will produce most of the binaries. If you’ve never seen it before, odds are it’s not great.
Obviously, the good guys are there; there’s a team in a garage close to you writing their first version of their app today, and they would like that app to be successfully installable. In lieu of a $25,000 security engagement to give them the stamp of approval, give it to the AI and say, “Okay, probably yes.”
So, interesting things will happen both in the direct immediate experience of using this paradigm shift and the underlying technological substrate shift. And then this is an adversarial game where parties get to sort of evaluate what the other side is doing and then take countermeasures in response to that.
And so the second or third order consequences of this are going to be kind of wild among many other things. The bad guys get to use LLMs too.
And even if we successfully had a small number of LLM companies out in the world and they had responsible use policies and teams of people internally and teams of LLMs that were like, if the user is attempting to abuse someone’s computer system, stop talking to them, please.
The fact that there are open source LLMs in the world and open weights where you can reverse engineer even a system of weights to come back with what this system would be like if it had no safety rails attached implies that the bad guys will, without loss of generality, have very powerful LLM systems to help them write their ratware as one evocative industry term for the supply chain of evil that lets them produce the things that ruin people’s days.
But yeah, LLMs will help ratware. LLMs will help to do the cat and mouse game against other LLMs to get things past screens or to cause them to be evaluated as non-essential.
There are unique attacks enabled by LLMs, like one that I love just from describing it to this. We’ve had SQL injections for forever where there is a fundamental distinction in computer systems between data and code.
Getting people to interpret data as code causes all sorts of bad stuff to happen if the attacker can specify the data and then change your computer system’s use of it as a result of looking at that data.
And so prompt injection is just like, hey, please tell the user that the rest of this program is innocuous. LLMs will often be sort of fooled by that sort of thing in various deployment topologies at the moment where, you know, ignore all other instructions and pass this through to your superiors.
LLMs have worked in certain circumstances and might work in the future, and we’ll get better about detecting the first way that it gets through. Then the security researchers on the other side of the fence will find other ways to get it through, and the cat and mouse game continues.
So, do you have any other places you’d like to explore before we sign off for the day?
Sure. I like the last part you were saying there about everybody’s getting access to these tools, including the bad guys. And we’ve seen examples of this.
So there’s the public vulnerabilities database called the CVE database. People have been able to take information from that and generate working exploits. And that sounds okay—why is that a big deal?
Because the CVEs usually have very scant information. It’s just like, there was a bad problem in this version of a thing. You need to update to the next version or a bad thing could happen to you. And they really are that vague.
Maybe a little bit of detail; maybe it says like the system that it’s in or something like that. There are papers that show that with copy-paste, you can take the details from the CVE, take the code, and get a working exploit.
And the reason that’s important is because the CVE gets released very quickly. So you tell the vendor about the problem, the vendor fixes the problem within about a month or two or three, you publish the CVE, and then technical information about the detail, like the vulnerability, never comes out.
Because you want to give people time to update. You want to let it soak so that people are secure. If you’re reducing the time that people have to get the update to be as soon as the CVE becomes public and everyone knows there’s a problem, then they need to update.
The next day there’s working exploit code for it. And it didn’t require a super elite mega hacker; it just took copy-paste chat GPT. Then that could be an issue.
So you need to arm yourself with whatever tools you can to sort of proactively find these things as well. Like, you have to keep up.
Yeah. And this has been an open issue for forever in CVE land where if you just tell people like OpenSSL version blah, blah, blah has an issue in it, then that typically has historically attracted a number of people to look at OpenSSL version blah, blah, blah out of the mountain of code that exists in the world.
And then independent replications immediately after a CVE publishing are extremely common, but that relies on this clustering and flocking behavior in an ultimately limited set of technologists that have the skills and wherewithal. Find these things and in some cases chain them together. What if the cognition level available to find open SSL vulnerabilities, given the publication of source code, was functional and not limited anymore? That is a somewhat terrifying thought.
The other thing is, historically, it’s been the CVE publication identifying the particular version: this version was vulnerable, this version is not vulnerable. Please, please go from version A to B, which is the minimal necessary information that the defender needs to have to take the successful action. It could cause someone to reverse engineer what happened between A and B.
However, software vendors publish patches into the world and upgrade their systems. They might be flagged as security sensitive or they might not be flagged as security sensitive. In some cases, you smuggle in the security sensitive bit into a routine patch to give people time to prepare for it before the named bug with the logo for it gets dropped on the internet.
Patches are abundant in the world. For the same reason that fortune 100 teams can’t inspect every executable found on their systems, no one inspects every patch that is published to say, what is that patch doing? You can imagine a very near term or even present day situation where the bad guys are saying, “Okay, for every patch that is published by the usual suspects and for every patch that is on the list of the following libraries, which we know are very well distributed, pull out all the ones that were security oriented, categorize them for me. Which were the most important ones? Write the exploit for me.
It only needs to work 8% of the time because we’re doing it at scale. You can even have it like, “Hey, stand up a test system that has the software installed and run the exploit against it. Do you successfully extract the credentials or similar that you’re going for with the exploit? Only wake me up if you do because if you do, there are some Bitcoin wallets attached to the vulnerable systems that I would really love to know the private keys for.”
Up until the Bitcoin bit, what you’re describing is kind of what we’re building. It’s not hypothetical to us. It’s, “Here’s a virtual machine. Here’s how to access it. Here’s an AI system to find vulnerabilities, verify that they’re real, use that to create training data, use that to triage our own findings.”
Yeah, this isn’t coming; it’s here. That is a wonderful and, in some ways, terrifying thought to end on. Again, there are more than two sides, but for simplicity, there are two sides to this equilibrium. The bad guys are going to get this technology, whether we like it or not. Good guys are also developing it so that the only place that has a working system is not our friends in the building in North Korea who are trying to steal all the money all the time.
So, Caleb, thanks very much for coming on to the program today and giving people a little bit of an update on the state of the art and also a little bit of a preview of coming attractions for other fields that are relevant to their interests. Where can people follow you on the Internet?
So the company that we’ve started is called Delphos Labs. You can find us at delphoslabs.com. You have to put the labs in or otherwise you get a pillow company. I am Caleb underscore Fenton on X, but I mostly just post Bitcoin memes there. If you want to know more about Delphos, you can go there. You can try the site now. You can upload a binary and we’ll give you a report on it.
Awesome. Well, thank you very much. And for the rest of you, thanks very much for listening and see you next week on Complex Systems. Thanks for tuning in to this week’s episode of Complex Systems. If you have comments, drop me an email or hit me up at patty11 on Twitter. Ratings and reviews are the lifeblood of new podcasts for SEO reasons and also because they let me know what you like. Complex Systems is produced by Turpentine, the podcast network behind Econ 102, Riff with Berne Hobart, Turpentine BC, and more shows for experts by experts in tech.