AI chatbots are making cybersecurity work much easier–but foundation models are about to revolutionize it

When generative AI made its debut, businesses entered an AI experiment. They bought into innovations that many of them don’t quite understand or, perhaps, fully trust. However, for cybersecurity professionals, harnessing the potential of AI has been the vision for years–and a historic milestone will soon be reached: the ability to predict attacks.

The idea of predicting anything has always been the “holy grail” in cybersecurity, and one met, for good reason, with significant skepticism. Any claim about “predictive capabilities” has turned out to be either marketing hype or a premature aspiration. However, AI is now at an inflection point where access to more data, better-tuned models, and decades of experience have carved a more straightforward path toward achieving prediction at scale.

By now, you might think I’m a few seconds away from suggesting chatbots will morph into cyber oracles, but no, you can sigh in relief. Generative AI has not reached its peak with next-gen chatbots. They’re only the beginning, blazing a trail for foundation models, whose reasoning ability can evaluate with high confidence the likelihood of a cyberattack–and how and when it will occur.

Classical AI models

To grasp the advantage that foundation models can deliver to security teams in the near term, we must first understand the current state of AI in the field. Classical AI models are trained on specific data sets for specific use cases to drive specific outcomes with speed and precision, the key advantages of AI applications in cybersecurity. And to this day, these innovations, coupled with automation, continue to play a critical role in managing threats and protecting users’ identities and data privacy.

With classical AI, if a model was trained on Clop ransomware (a variant that has wreaked havoc on hundreds of organizations), it would be able to identify the signatures and subtleties indicating that this ransomware is in your environment and flag it as a priority for the security team. And it would do so with exceptional speed and precision that surpasses manual analysis.
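
To make the limitation concrete, here is a deliberately simplified sketch (illustrative only, not any vendor’s production detector) of the classical, supervised pattern: a model trained on labeled examples of a known family flags that family quickly, but only because it has already seen it. The feature names and values are hypothetical.

```python
# Illustrative sketch of classical, supervised detection -- not a real product.
# The model learns from labeled examples of a known ransomware family, so it
# flags that family reliably, but it has no concept of threats it never saw.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical feature vectors from endpoint telemetry
# (e.g., file entropy, process-spawn rate, volume of renamed files).
benign = rng.normal(loc=0.2, scale=0.1, size=(500, 3))
clop_like = rng.normal(loc=0.8, scale=0.1, size=(500, 3))

X = np.vstack([benign, clop_like])
y = np.array([0] * 500 + [1] * 500)  # 1 = matches the known ransomware family

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Telemetry resembling the trained-on family is flagged with priority...
print(clf.predict([[0.85, 0.78, 0.90]]))  # -> [1]

# ...but telemetry from an unfamiliar threat may slip through, which is the
# gap the rest of this article is about.
print(clf.predict([[0.50, 0.10, 0.05]]))  # likely [0]
```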

Today, the threat model has changed. The attack surface is expanding, adversaries are leaning on AI just as much as enterprises are, and security skills are still scarce. Classical AI cannot cover all bases on its own.

Self-trained AI models

The recent boom of generative AI pushed large language models (LLMs) to center stage in the cybersecurity sector because of their ability to quickly fetch and summarize various forms of information for security analysts using natural language. These models deliver human-like interaction to security teams, making the digestion and analysis of complex, highly technical information significantly more accessible and much quicker.

We’re starting to see LLMs empower teams to make decisions faster and with greater accuracy. In some instances, actions that previously required weeks are now completed in days–and even hours. Again, speed and precision remain the critical characteristics of these recent innovations. Salient examples are the breakthroughs introduced with IBM Watson Assistant, Microsoft Copilot, and CrowdStrike’s Charlotte AI chatbots.

In the security market, this is where innovation is right now: materializing the value of LLMs, mainly through chatbots positioned as artificial assistants to security analysts. We’ll see this innovation convert to adoption and drive material impact over the next 12 to 18 months.

Considering the industry talent shortage and rising volume of threats that security professionals face daily, they need all the helping hands they can get–and chatbots can act as a force multiplier there. Just consider that cybercriminals have been able to reduce the time required to execute a ransomware attack by 94%: they’re weaponizing time, making it essential for defenders to optimize their own time to the maximum extent possible.

However, cyber chatbots are just precursors to the impact that foundation models can have on cybersecurity.

Foundation models at the epicenter of innovation

The maturation of LLMs will allow us to harness the full potential of foundation models. Foundation models can be trained on multimodal data–not just text but images, audio, video, network data, behavior, and more. They build on the language processing of LLMs and significantly expand, or supersede, the volume of parameters that today’s AI is bound to. Combined with their self-supervised nature, they become innately intuitive and adaptable.

What does this mean? In our previous ransomware example, a foundation model wouldn't need to have ever seen Clop ransomware–or any ransomware for that matter–to pick up on anomalous, suspicious behavior. Foundation models are self-learning. They don't need to be trained for a specific scenario. Therefore, in this case, they'd be able to detect an elusive, never-before-seen threat. This ability will augment security analysts' productivity and accelerate their investigation and response.
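
As a rough illustration of the difference (and only an illustration: a real security foundation model is multimodal and self-supervised at a far larger scale), the sketch below uses a simple unsupervised detector that never sees a labeled ransomware sample. It learns what normal activity looks like and flags deviations, which is the property that lets a never-before-seen threat surface anyway. The telemetry features are hypothetical.

```python
# Illustrative sketch of label-free anomaly detection -- not a foundation model,
# just a minimal contrast with the supervised example above. Nothing here is
# labeled "ransomware"; the detector models normal behavior and flags outliers.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Hypothetical baseline of normal behavior in an environment
# (file writes/min, outbound connections/min, privilege escalations/hour).
normal_activity = rng.normal(loc=[5.0, 2.0, 0.1],
                             scale=[1.0, 0.5, 0.05],
                             size=(2000, 3))

detector = IsolationForest(contamination=0.01, random_state=1).fit(normal_activity)

# A burst of mass file rewrites and beaconing, unlike anything in the baseline,
# is flagged as anomalous (-1) even though no ransomware sample was ever shown.
print(detector.predict([[120.0, 40.0, 3.0]]))  # -> [-1]
print(detector.predict([[5.2, 1.8, 0.1]]))     # typical activity -> [1]
```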

These capabilities are close to materializing. About a year ago, we began running a trial project at IBM, pioneering a foundation model for security that can detect previously unseen threats, foresee them, and enable intuitive communication and reasoning across an enterprise's security stack without compromising data privacy.

In a client trial, the model’s nascent capabilities predicted 55 attacks several days before they occurred. Of those 55 predictions, the analysts have evidence that 23 of the attempts took place as expected, while many of the others were blocked before they hit the radar. These included, among others, multiple distributed denial-of-service (DDoS) attempts and phishing attacks intended to deploy different malware strains. Knowing adversaries’ intentions ahead of time and prepping for these attempts gave defenders a time surplus they don’t often have.

The training data for this foundation model comes from several data sources that can interact with each other–from API feeds, threat intelligence feeds, and indicators of compromise to indicators of behavior and social platforms. The foundation model allowed us to “see” adversaries’ intention to exploit known vulnerabilities in the client environment and their plans to exfiltrate data upon a successful compromise. Additionally, the model hypothesized over 300 new attack patterns, information that organizations can use to harden their security posture.
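
As an illustration of what “interacting data sources” could look like in practice (a hypothetical sketch; the actual pipeline isn’t described here), heterogeneous feed records can be normalized into a common shape so that indicators about the same entity are correlated before they ever reach a model. The field names and feed labels below are assumptions for illustration.

```python
# Hypothetical sketch of feed normalization -- not IBM's pipeline.
# Records from different feeds are mapped to one common shape so indicators
# about the same entity (IP, domain, hash, account) can be grouped together.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Indicator:
    source: str          # e.g., "intel_feed", "api_feed", "iob_feed" (assumed labels)
    entity: str          # IP, domain, hash, or account the indicator refers to
    kind: str            # "compromise" or "behavior"
    observed: datetime

def normalize(raw: dict, source: str) -> Indicator:
    """Map a raw feed record (schema assumed for illustration) to the common shape."""
    return Indicator(
        source=source,
        entity=raw["value"],
        kind=raw.get("type", "compromise"),
        observed=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
    )

# Two feeds reporting on the same entity can now be correlated into one view.
feeds = [
    ({"value": "203.0.113.7", "type": "compromise", "ts": 1700000000}, "intel_feed"),
    ({"value": "203.0.113.7", "type": "behavior",   "ts": 1700003600}, "iob_feed"),
]
records = [normalize(raw, src) for raw, src in feeds]

by_entity: dict[str, list[Indicator]] = {}
for rec in records:
    by_entity.setdefault(rec.entity, []).append(rec)
print(by_entity)
```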

The importance of the time surplus this knowledge gave defenders cannot be overstated. By knowing what specific attacks were coming, our security team could run mitigation actions to stop them from achieving impact (e.g., patching a vulnerability and correcting misconfigurations) and prepare its response for those manifesting as active threats.

While nothing would bring me greater joy than to say foundation models will stop cyber threats and render the world cyber-secure, that's not necessarily the case. Predictions aren't prophecies–they are substantiated forecasts.

Sridhar Muppidi is an IBM fellow and CTO of IBM Security.

The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.