
In the physical world we clean up after ourselves, and when we see someone throw trash on the sidewalk, we get annoyed. But in the digital world we don’t see the trash we leave behind. Dark data is the term for information that is collected, stored and organized, but never used. In the physical world this would be called hoarding. Dark data is digital hoarding, and it comes at a real cost to the planet.
How is dark data created?
Dark data is created by almost every digital interaction. Here are some examples:
• Customer logs and web analytics: Every click and activity is stored for future analysis that rarely comes to fruition.
• IoT & sensor data streams: Thousands of sensors in smart factories or cities stream data 24/7. Often, only 1% of this is used for analysis.
• Duplicate files: Multiple versions of the same presentation or document stored across different data centers (which some still call the Cloud).
• Legacy communications: Attachments in Slack, Teams, or old email threads that are never opened again.
• Redundant, obsolete, and trivial (ROT) data: Duplicate files and former employee accounts that are kept around “just in case.”
We never delete data
A worrying characteristic of dark data is its scale and duration. According to the Veritas Global Databerg report, an estimated 52% of all data held by organizations is “dark,” and another 33% is “ROT” (redundant, obsolete, or trivial). Only about 15% is identified as business critical.
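To make these shares concrete, here is a minimal Python sketch that applies the three percentages cited above to a total storage volume. The 1,000 TB figure and the function name `split_storage` are assumptions chosen purely for illustration; they do not come from the report.

```python
# Shares as cited above from the Veritas Global Databerg report.
DATABERG_SHARES = {"dark": 0.52, "rot": 0.33, "business_critical": 0.15}


def split_storage(total_tb: float) -> dict[str, float]:
    """Split a total storage volume (in TB) according to the cited shares."""
    return {
        category: round(total_tb * share, 2)
        for category, share in DATABERG_SHARES.items()
    }


# Hypothetical organization holding 1,000 TB of data (illustrative only).
breakdown = split_storage(1000)
for category, terabytes in breakdown.items():
    print(f"{category}: {terabytes} TB")
```

Under these assumptions, only the "business_critical" slice is data the organization has identified as worth keeping; the rest is a candidate for review.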
Because cloud storage has historically been inexpensive, companies rarely implement deletion protocols. This data resides in “cold storage,” where it still requires constant power for servers and cooling systems. Hardware and software must also be upgraded regularly to keep the data stored.
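A deletion protocol can start with something very simple: a report of files nobody has touched for a long time. The Python sketch below is one possible starting point, not an established tool. The function name `find_stale_files` and the one-year threshold are assumptions for illustration, and the script only lists candidates without deleting anything.

```python
import os
import time
from pathlib import Path


def find_stale_files(root: str, max_age_days: int = 365) -> list[tuple[Path, int]]:
    """Return (path, size_in_bytes) for files not modified in max_age_days.

    Uses the last-modified time, since last-access times are often not
    updated reliably by the file system.
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # skip files that vanished or cannot be read
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Calling `find_stale_files("/some/shared/folder")` (a hypothetical path) returns a list that a person can review before anything is removed. Keeping a human in the loop matters, because some old data must be retained for legal or business reasons.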
Digital smog: The CO2 comparison
Data may feel weightless, but the infrastructure to store it requires large amounts of energy.
- Power consumption: Data centers account for approximately 1% to 2.5% of global greenhouse gas emissions, a footprint that is now comparable to that of the aviation industry.
- The car comparison: Storing dark data alone creates an estimated 6.4 million tons of CO2 annually. To put that into perspective, research from Tessi and academic studies suggest that if the growth of unused data continues unchecked, emissions from dark data could scale to match the impact of 80 million fossil-fueled cars running for a full year.
Key statistics: Digitalization was estimated to account for 4% of global greenhouse gas emissions as early as 2020, and with the rise of AI-generated data in 2026, that figure is under increasing upward pressure.
Why is this important?
Dark data is not only an environmental problem; it is also a security risk. Every gigabyte of dark data is a “blind spot” where sensitive information, such as PII (personally identifiable information), can hide and become the source of a security breach.
To solve this, we need “digital decarbonization.” As experts at Loughborough University have written, businesses need to move from data hoarding to data hygiene. Deleting an old email or emptying an unnecessary cloud folder is a small but necessary action to preserve the environment.
See also Tech Monitor’s calculation of the cost: we are talking about trillions of dollars.
Documentation and further reading
- World Economic Forum: Dark data is killing the planet
- Tessi Research: The Challenge of Dark Data Guide
- Veritas: The Databerg Report on Dark Data and ROT
- Tech Monitor: 85% of data is useless to business, and cost trillions!
- Loughborough University: How digital waste is polluting the planet