You don’t have to be a cybersecurity professional to realize data breaches are as common in today’s landscape as trees in a forest. The headlines are almost numbing as the pervasive nature of data and network incursions reaches epidemic proportions. But as the opening paragraph of Vilius Petkauskas’s Jan. 29 article on the website Cybernews puts it, “There are data leaks and then there is this.”
The breaking story jumps right into the breach – or “collection” – stating “The supermassive leak contains data from numerous previous breaches, comprising an astounding 12 terabytes of information, spanning over a mind-boggling 26 billion records. The leak, which contains LinkedIn, Twitter, Weibo, Tencent, and other platforms’ user data, is almost certainly the largest ever discovered.”
This massive data leak has been dubbed the Mother of all Breaches (MOAB). While the hyperbole may be a bit excessive, the records that were compromised are no laughing matter. Although the original owner of this treasure trove of information is unknown, data breach search engine Leak-Lookup admitted to being the data storehouse breached.
Trend Micro said that while much of the data is compiled from previous data leaks and breaches, the sheer volume of sensitive data is a great cause for concern, with threat actors potentially able to use the database for a wide range of cybercriminal activities.
Industry researchers reported that “The dataset is extremely dangerous as threat actors could leverage the aggregated data for a wide range of attacks, including identity theft, sophisticated phishing schemes, targeted cyberattacks, and unauthorized access to personal and sensitive accounts.”
So how has the cybersecurity world reacted to this monster breach and what are the possible security repercussions? SecurityInfoWatch.com Editorial Director Steve Lasky surveyed several industry professionals for their analysis of MOAB and here are their assessments. Taking part in our Q&A panel are:
- Nick Carroll, Cyber Incident Response Manager with Raytheon (an RTX Business)
- Joseph Perry, Director, Advanced Services Lead at MorganFranklin Consulting
- Justin Tomas, Director, Cyber & Operational Resilience at MorganFranklin Consulting
- Corey Nachreiner, Chief Security Officer at WatchGuard Technologies
- Omri Weinberg, Co-Founder & CRO for DoControl
- Tamara Kirchleitner, Senior Intelligence Operations Analyst at Centripetal
Setting the Stage
However, before we dive into our question-and-answer section, WatchGuard Technologies’ CSO, Corey Nachreiner, insists on setting the stage and putting the incidents in perspective.
“I want to state that calling this event the ‘Mother of All Breaches,’ as though it is a singular and possibly new event, is incorrect. There are very few new threats from this consolidated list. They are not from one attacker, nor one technical flaw. Rather, they are the aggregated results of tens to hundreds of different breaches and data leak incidents that happened over the last decade.
“Some underground personality seems to have aggregated the data from numerous old breaches into one list. However, most of these breaches happened a very long time ago and are not a new threat to most users. If you were aware of the individual breaches that affected you when they happened long ago and changed your credentials, you have nothing to worry about. At best, the only new takeaway is that this aggregated packaging of old breaches can teach you how big the problem of security breaches can be over time. 26 billion records have been stolen from many companies, in many ways, by many attackers over many years. All the information in this list has been individually available on the underground for a long time. This is not new,” says Nachreiner.
SecurityInfoWatch: How did this happen to such a broad array of victims?
Nick Carroll, Cyber Incident Response Manager at Raytheon: The “Mother of all Breaches” is an interesting name. The dataset covers major services including Twitter/X, LinkedIn, Adobe, Dropbox, and more, but it isn’t inherently indicative of a new breach. All these services have had security incidents in the past.
For example, LinkedIn suffered from a data scraping incident in 2021 that resulted in over 500 million user records being leaked. Dropbox had an incident in 2016 that resulted in over 68 million usernames and passwords being leaked. Adobe had an incident in 2013 that exposed about 38 million accounts and another in 2019 that exposed an additional 7.5 million accounts. The list of large, classic breaches goes on. When these major services suffer a breach, those leaked records get collected and traded on the dark web and in breach search services.
These are often referred to as “collections” or “compilations” in dark web forums, and users on these forums will bundle multiple breaches together to create larger and larger collections. There was another famous collection, like this “Mother of all Breaches” dataset, which was shared in 2019 and known as “Collection #1”. It contained 773 million usernames and passwords in over 2,692,818,238 rows that had been gathered from thousands of smaller historical breaches.
Joseph Perry, Advanced Services Lead at MorganFranklin Consulting: This breach is getting a lot of attention for its scale and appears monumental, but answering the question of "how?" punctures that awe. The reality is straightforward. A data breach aggregation company called Leak-Lookup incorrectly configured a firewall guarding one of their databases containing a massive amount of data captured from previous breaches. In other words, there may be some new records in the bunch, but the overwhelming majority of records discovered in this breach are records disclosed in previous cases and aggregated by Leak-Lookup. This isn't a breach, it's a collection of data from many breaches, nearly all of which are already public knowledge.
Corey Nachreiner, Chief Security Officer at WatchGuard Technologies: You might ask this if you think this is one new breach and list of stolen data. It is not. The data represented in this massive database is just the consolidation of tens, if not hundreds, of different security breaches that happened to different companies over at least ten years. This is mostly old breach data, repackaged in one big list. For instance, Adobe had big breaches in 2013 and 2019, Dropbox was breached in 2012, and Ashley Madison in 2015. Most of the content in this 26 billion account breach is all that old, leaked data, which has been floating around individually, and in other consolidated packs on the underground for years. A very small fraction of the contents here might be new breaches, but most of it is old leaks and old news.
Knowing that, the answer to “how did it happen” will be different for every specific breach. The generic question, “How did so many companies have security flaws or risks that allowed a data breach?” is unfair to answer because each situation differed. However, one might generally assume many companies did not invest enough time, money, and resources into their security practices, which resulted in a breach.
Omri Weinberg, Co-Founder & CRO at DoControl: A data breach this pervasive was likely an accumulation of multiple attacks and compromises over a significant period. It's like the old saying about how you eat an elephant - one bite at a time. This was a concerted effort over time to gather this scope of data from all these organizations and collate it into one area.
SIW: Why is this incident getting so much notoriety?
Carroll: This dataset leak does appear to include some new records, which could result in additional compromises of users from other services. However, that risk is somewhat lower due to the amount of historical information in the dataset. The notoriety and interest that’s been generated in this leak is largely due to its sheer size. This is a collection of billions of usernames and passwords that can easily be ingested into attacker tools to fuel password-spraying attacks against companies.
These password spraying and credential stuffing attacks are becoming more and more common, and the consequences are real. The recent breaches in the news of 23andMe and Microsoft are both reported to have started with password spraying or credential stuffing attacks. These kinds of large leak collections also fuel the underground leak ecosystem where people share and sell their personal collections of breaches and other data. The publicity from a leak collection like the “Mother of all Breaches” can potentially expose more people to the existence of those leak ecosystems, bringing in more participants with an interest in storing and sharing breach data which can in turn fuel more interest in breaches.
Perry: This isn't an attack. A security firm discovered a huge trove of records available on the open internet and believed initially that it was evidence of an incomprehensibly massive attack. In reality, as the Cybernews article acknowledges, Leak-Lookup quickly took credit for the records. Given their business model as an aggregator of breach records, this is interesting in terms of scale, but not particularly important in the greater cybersecurity picture. This incident is notorious only in the specific sense that it is getting significant media attention. It is understandable that Cybernews would be so excited that the drama of the discovery would massively outweigh the disappointing details. The reason the story has captured the public's attention, however, is mostly miscommunication and misunderstanding.
Nachreiner: It is a consolidation of leaked data from many breaches and many different attacks long ago. It should NOT be considered notorious or called “the mother of all breaches” as that suggests the victims should be worried about something new. Most of the data in this database/list was released and leaked on the underground years ago. The companies affected learned of many of these breaches and warned their customers years ago. The affected passwords and credentials (and any other lost data) should have been changed and handled years ago. It is a disservice to those customers to cause them to worry without need about a “new leak” when this is a consolidation of old leaks.
Is there anything worth mentioning about this new packaging of old leaks? Yes! Seeing how ten-plus years of many different companies’ data breaches will add up to 26 billion lost credentials is an important thing to know. It illustrates why all end users should follow credential security best practices (e.g., use long, unique, and hopefully random passwords at every site that requires an account, which is best managed with a password manager, and enable two-factor or multifactor authentication). It also underscores why we need to pressure the organizations that store our data and credentials to spend more time and money protecting that data. But besides those items, there aren’t many new takeaways in this consolidated leak.
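As a rough illustration of the “long, unique, and hopefully random” advice above, the Python sketch below draws one password per site from the standard-library `secrets` module. The helper name `generate_password`, the 20-character default, the 12-character floor, and the site labels are all assumptions made for this example; in practice a password manager handles generation and storage.

```python
import secrets
import string

# Illustrative character set; real sites may restrict which symbols they accept.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password built from a cryptographically secure source."""
    if length < 12:
        # Assumed minimum for this sketch, not an official standard.
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One independent password per site, so a leak at one service
# cannot be replayed against another.
sites = ["example-mail", "example-bank", "example-social"]
passwords = {site: generate_password() for site in sites}
```

Because every entry is generated independently, a credential exposed in one breach collection is useless against the other accounts, which is the property credential stuffing depends on users lacking.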
Weinberg: The number of records that appear in the data as well as the broad scope of organizations and agencies who appear to have been compromised is what makes this so noteworthy.
SIW: What was the technical flaw?
Carroll: In this case, breach search service Leak-Lookup has admitted to accidentally exposing one of their servers that allowed the datasets that make up the “Mother of all Breaches” to be leaked. Leak-Lookup and similar data breach search services collect the information being shared on dark web forums and other places. The services then put it into a database that businesses and security researchers can use on the open Internet to track and confirm if any of their company users have been exposed to historical breaches. Leak-Lookup admitted on Twitter/X that a firewall misconfiguration resulted in the accidental exposure, and they’ve since corrected the issue.
Tomas: The original creator of the compiled data is still unknown, but initial information shows that a misconfigured server allowed unauthorized access to a data breach search engine around the start of December last year. Data was exfiltrated and was then placed on an open instance that was discovered by security research firms.
Nachreiner: This is not new, and this is not close to one attack or breach, which means it’s also not one technical flaw either. The root cause was different for each of these breaches, which happened to different companies over a long period.
If I were to hypothesize, I would presume the most common causes of the breaches were likely:
- Web application flaws in the company’s website (such as SQL injection) that resulted in a database leak.
- Credential-related attacks like phishing, spear phishing and more, which allowed the attacker to gain access to legitimate remote access into the victim’s company, thus finding and extracting data.
- A software vulnerability in a company’s publicly exposed servers, which gave an attacker elevated privilege to gain access to and steal this data.
- A configuration flaw, for instance, in some cloud storage medium, which allowed external threat actors access to data they shouldn’t be able to get.
Those are just a few of the most common possibilities. The specifics will be different for each case. The only commonality is that all the companies did suffer some event that gave an unauthenticated threat actor access to exfiltrate data they shouldn’t have been able to access.
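To make the first cause on that list concrete, here is a minimal sketch of SQL injection and its usual fix, written with Python's built-in `sqlite3` module. The `users` table, the two lookup functions, and the sample rows are hypothetical and are not drawn from any of the breaches discussed here.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, password_hash) VALUES (?, ?)",
    [("alice@example.com", "hash-a"), ("bob@example.com", "hash-b")],
)


def find_user_unsafe(email: str):
    # Vulnerable: attacker-controlled text becomes part of the SQL statement.
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(email: str):
    # Parameterized: the value is bound separately from the SQL text.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


payload = "' OR '1'='1"
leaked = find_user_unsafe(payload)  # the injected OR clause matches every row
blocked = find_user_safe(payload)   # treated as a literal email, matches nothing
```

The only difference between the two functions is whether user input is concatenated into the query or bound as a parameter, which is why this class of flaw is considered preventable.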
Weinberg: It's unlikely there was one technical flaw that led to breaches at all these different organizations. This was clearly an attempt to gather a lot of information over time so it's likely multiple technical vulnerabilities were exploited, and there was almost certainly some social engineering involved in some of these compromises as well.
Kirchleitner: Specific details about the technical flaws that led to the individual breaches in MOAB are not clear. However, data breaches generally exploit weaknesses such as weak access controls and security vulnerabilities. The collection of data from numerous smaller breaches suggests a systematic approach to collecting and compiling data, exploiting a variety of vulnerabilities across multiple systems.
SIW: What are the profile and motives of the attacker, and what are the ramifications of an incident this large?
Carroll: There are some good cautionary tales and lessons we can draw from this event. A primary lesson is that security baselines and cyber hygiene matter, especially in our more interconnected and cloud services-focused world. The “Mother of all Breaches” dataset is now floating around to fuel more attacks because of a misconfiguration. Misconfigurations are one of the top breach vectors and they’re one of the most preventable.
IBM produces an excellent research report on breaches each year, and in their 2023 Cost of a Data Breach report, they found that cloud misconfigurations accounted for 11% of attacks. That means at least 11% of attacks are preventable through proper cyber hygiene including the usage of configuration and change management, changing default passwords, and restricting administrative permissions. Another lesson to take away from this event is to raise your organization’s general awareness of the types of attacks that breach collections fuel such as password spraying. This dataset will be weaponized for password spraying attacks, just as “Collection #1” and all the datasets that came before it have been.
Take time now to make sure you can detect these common attacks in the cloud and SaaS tools you use and on your network edge with VPN and similar remote connectivity solutions. Deploy multi-factor authentication for use with these services to cut down the risk of reused and stolen credentials. Train your end users on why password reuse is dangerous and how those old credentials that are being traded on the dark web could come back to cause breaches on your corporate network or even their personal banking accounts.
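One simple way to think about detecting password spraying is to look for a single source that fails logins against many different accounts in a short period. The Python sketch below assumes authentication events are already available as `(timestamp, source_ip, username, success)` tuples; the function name `find_spray_sources`, the 10-minute window, and the five-account threshold are illustrative assumptions, not recommended production values.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_spray_sources(events, window=timedelta(minutes=10), min_accounts=5):
    """Return source IPs whose failed logins hit at least `min_accounts`
    distinct usernames within any span of length `window`.

    events: iterable of (timestamp, source_ip, username, success) tuples.
    """
    failures = defaultdict(list)
    for ts, ip, user, success in events:
        if not success:
            failures[ip].append((ts, user))

    flagged = set()
    for ip, attempts in failures.items():
        attempts.sort()  # order by timestamp
        start = 0
        for end in range(len(attempts)):
            # Advance the left edge until the span fits inside `window`.
            while attempts[end][0] - attempts[start][0] > window:
                start += 1
            users = {user for _, user in attempts[start:end + 1]}
            if len(users) >= min_accounts:
                flagged.add(ip)
                break
    return flagged


# Hypothetical sample data: one source tries six accounts in under three minutes,
# another is a single user who mistypes once and then logs in.
t0 = datetime(2024, 1, 1, 9, 0)
events = [
    (t0 + timedelta(seconds=30 * i), "203.0.113.7", f"user{i}", False)
    for i in range(6)
]
events += [
    (t0, "198.51.100.4", "alice", False),
    (t0 + timedelta(minutes=1), "198.51.100.4", "alice", True),
]
suspects = find_spray_sources(events)
```

Counting distinct usernames rather than total failures is the design choice that separates spraying (few guesses against many accounts) from an ordinary user fumbling a password. Attackers who rotate source addresses would evade a per-IP rule like this one, so it should be read as a starting point only.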
Tomas: Reputation and finances are always at the top of the list when it comes to motives. Bad actor groups need those funds to continue carrying out their attacks on victims, but gaining status within the hacker community by taking credit for breaches helps to feed their ego and standing within the community. The creator of this data set could likely be a data broker who was looking to sell access to this valuable information to the highest bidders.
Having access to 26 billion records in one place is going to make the reconnaissance phase of social engineering easier to carry out for bad actor groups. Credential stuffing, phishing, and spear phishing are just some of the tactics that could be improved by leveraging this previous and new information. An increase in account takeovers could be expected, as accounts with poor password hygiene, such as reused passwords or passwords that are never changed, are going to be more vulnerable to exploitation.
Nachreiner: There is no single attacker. This does not represent one event. Whoever decided to aggregate and re-release this consolidated data was likely not even involved in the original breach events. It’s likely an underground person resharing old data by repacking it all together. This is NOT a large new breach; this was many breaches, each far smaller than 26 billion records, that add up to that total.
The original attackers, who stole tens to hundreds of millions of accounts at once, wanted to sell the credentials to the highest bidders. Stolen account data has a value on the underground that depends on the company with which the account is associated and how privileged the account is. For instance, it could be an Adobe employee admin account, an Adobe employee exec account, a normal Adobe employee account, an Adobe partner account, or an Adobe customer account. Each account type has resale value to other cybercriminals, but the former are worth more than the latter. So, the original threat actors’ motives may have been different for every different breach, but in general, it’s to monetize the data you steal on the underground.
The motive of the person who repackaged this data was likely to extract additional value out of old breach data by making it appear new and sexy through aggregating it to 26 billion records. It seems to have worked, as many people are talking about this leak of old data as the “mother of all breaches,” when it’s mostly the combination of older leaks that have had the value sucked out of them long ago. For many victims of some of these old, 2010s breaches, if you didn’t change your password years ago, it’s already too late, and this new leak doesn’t represent any new threat.
Weinberg: This data was likely being gathered and aggregated by a broker or group who plans to sell access to some or all of this information. It was likely gathered by multiple individuals or groups, and it's not possible to rule out nation-state actors for something of this scale. The ramifications are hard to know at this point, but besides the value of this much data as a commodity, there could be other possibilities, like using it to train an LLM for less-than-ethical purposes.
Kirchleitner: The ramifications of MOAB are immense: Individuals have an increased risk of identity theft, financial fraud, and privacy violations. Organizations may lose consumer trust and suffer from legal repercussions and financial loss. In terms of the cybersecurity landscape, the data leak highlights the need for stronger security measures and policies for both individuals and organizations.
Steve Lasky is a 34-year veteran of the security industry and an award-winning journalist. He is the editorial director of the Endeavor Business Media Security Group, which includes the magazines Security Technology Executive, Security Business and Locksmith Ledger International and the top-rated website SecurityInfoWatch.com.