
More than 16 billion leaked credentials have been identified by cybersecurity researchers since the beginning of the year. While they assert the vast majority had previously gone unreported, critics argue there is insufficient evidence that the data is new.
Researchers at Cybernews report that, of the 30 exposed datasets identified, only one had been previously disclosed. This particular set of 184 million records, which included user logins for Apple, Google, and Meta, was reported in May.
“What’s especially concerning is the structure and recency of these datasets – these aren’t just old breaches being recycled,” Cybernews’ researchers stated in their report. “This is fresh, weaponizable intelligence at scale.”
Why some have questions about this data
According to BleepingComputer, the report neither presents concrete evidence that the compilation contains new or previously unseen data nor provides samples. The site’s owner, Lawrence Abrams, contends that the websites from which the records were stolen were not compromised recently.
“There are thousands, if not hundreds of thousands, of similarly leaked archives being shared online, resulting in billions of credentials records released for free,” he wrote. “Many of these free archives were likely compiled into the massive database that was briefly exposed and seen by Cybernews.”
When the Internet Archive, the non-profit behind the Wayback Machine, was breached in October 2024, hackers claimed they had leaked the information of 31 million individuals. However, when experts investigated the breach, they found that 54% of the compromised data had already been exposed in previous incidents.
It is also possible that the 16 billion records include numerous duplicates or multiple entries linked to the same individuals. When 2.7 billion records from the background-checking service National Public Data were leaked in August 2024, it emerged several days later that just 134 million of those records were unique.
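To put those National Public Data figures in perspective, the short Python sketch below computes the share of unique records and shows one simple way exact duplicates could be removed. It is an illustration only: the function names and the (URL, username, password) tuple layout are assumptions made for this example, not anything described by the researchers.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout used only for this illustration.
Record = Tuple[str, str, str]  # (url, username, password)


def unique_share(total_records: int, unique_records: int) -> float:
    """Return the fraction of records that are unique."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return unique_records / total_records


def deduplicate(records: Iterable[Record]) -> List[Record]:
    """Drop exact duplicate records while keeping first-seen order."""
    # Dictionary keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(records))


if __name__ == "__main__":
    # Figures reported for the National Public Data leak: roughly 5% unique.
    share = unique_share(2_700_000_000, 134_000_000)
    print(f"Unique share: {share:.1%}")

    # Made-up sample rows; the second row is an exact duplicate of the first.
    sample = [
        ("https://example.com/login", "alice", "not-a-real-password"),
        ("https://example.com/login", "alice", "not-a-real-password"),
        ("https://example.org/signin", "bob", "another-fake-password"),
    ]
    print(f"{len(sample)} rows, {len(deduplicate(sample))} unique")
```

Exact-match deduplication of this kind would only catch identical rows. Multiple entries linked to the same person, such as one username with several different passwords, would still count as separate records.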
Personal datasets can be compiled by the good and bad guys
The largest of the datasets uncovered by Cybernews contained more than 3.5 billion records, supposedly linked to Portuguese-speaking individuals, while the smallest comprised around 16 million. Most of the records followed a uniform structure: URL, username, and then password. This is a format commonly used by infostealing malware to organise stolen credentials.
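As a rough illustration of that URL, username, password structure, the Python sketch below parses a single line into its three fields. The report does not say how the fields were delimited, so the colon separator, the `Credential` type, and the `parse_line` function are all assumptions made for this example.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class Credential(NamedTuple):
    """Hypothetical container for one parsed record."""
    url: str
    host: str
    username: str
    password: str


def parse_line(line: str, sep: str = ":") -> Optional[Credential]:
    """Parse 'URL<sep>username<sep>password'; return None if malformed.

    Assumes the username and password do not contain the separator.
    Splitting from the right keeps the colons inside the URL itself
    (scheme, port) intact.
    """
    parts = line.strip().rsplit(sep, 2)
    if len(parts) != 3 or not all(parts):
        return None
    url, username, password = parts
    # hostname is None when the URL has no scheme, so fall back to "".
    host = urlsplit(url).hostname or ""
    return Credential(url, host, username, password)


if __name__ == "__main__":
    # Made-up example line; not taken from any real dataset.
    record = parse_line("https://example.com:8443/login:alice:not-a-real-password")
    print(record)
```

A defender might use a parser along these lines to group exposed records by host and work out which services need password resets. Real infostealer logs vary in layout, so any production parser would need to handle other separators and malformed rows.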
Such datasets are typically assembled by cybercriminals through various data breaches or harvested using infostealing malware, before being sold or exploited for phishing, identity theft, or unauthorised system access. Some are also released for free to gain credibility within cybercrime communities.
However, similar datasets can be created by security researchers to study threat patterns, test cybersecurity tools, or raise public awareness about system vulnerabilities. Because users often reuse passwords, a single set of credentials may offer access to additional services, including social media platforms, corporate networks, and banking apps.
Leaked records could originate from Telegram or cloud services
According to Cybernews, the datasets were exposed only briefly, primarily through unsecured cloud-based data repositories such as Elasticsearch or object storage instances. While this was long enough for the researchers to view the records, it was not long enough to determine who was responsible for the leak.
The researchers were, however, able to make inferences about where the data was scraped from, based on the names given to the 30 datasets. While many offered no real clues, with generic labels like “logins” or “credentials,” some were more descriptive, referencing the messaging app Telegram, cloud services, or indicating a Russian, Portuguese, or business origin. Some even referenced specific malware used to obtain the data, according to Cybernews.
The compromised data has not yet been added to Have I Been Pwned, the widely used breach notification service that allows people to check if their credentials have been exposed. Its creator is reportedly investigating the incident.
Unprotected databases are behind the world’s biggest data breaches
Unprotected databases have been behind some of the largest data breaches in recent years. Last summer, nearly 10 billion passwords associated with the now-defunct social networking platform RockYou were leaked on a hacking forum. Credentials from RockYou have been circulating ever since the platform was first breached in 2009, when a hacker gained access to a plaintext file containing over 32 million user passwords.
Last month, a cybersecurity researcher uncovered an unprotected Elasticsearch database containing over 184 million records, including login credentials for Microsoft, Google, and Apple services, as well as government and corporate email addresses. The hosting provider, World Host Group, declined to identify the owner of the data but promptly disabled access. It remains unknown whether the database was accessed or downloaded by malicious actors before it was secured.
Cybernews researcher Aras Nazarovas observed that the growing number of exposed infostealer datasets appearing in traditional database formats suggests cybercriminals are becoming less reliant on Telegram groups to access stolen data.
To find out how to protect yourself from data breaches like this, read TechRepublic’s guides on safeguarding your online accounts and personal information, and on password managers.