Unstructured data has quietly become a whale-like burden on IT infrastructure, storage, and security resources. Large enterprises commonly house 20, 30, or 40 petabytes (PB) of unstructured data, which is primarily generated by applications and users and ends up locked in file data storage systems that consume a disproportionate share of IT budgets. This file data quagmire is also a significant risk, particularly in the face of growing ransomware threats.
For IT leaders, the rapid growth of file data is no longer a background issue—it’s front and center. Cost optimization and risk management have become top priorities in enterprise data strategies. File-level unstructured data tiering (a form of online archiving) has emerged as an innovative and effective solution.
Why File Data Now Commands CIO Attention
According to the 2024 Komprise State of Unstructured Data Management survey, more than half of CIOs—56%—identify cost optimization as their top data management priority. This makes sense when considering the nature of file data. Unlike structured databases, which consist of rows and columns, file data often comprises documents, images, videos, and logs that can be retained for decades, typically without clear data lifecycle policies. In addition, files can take up large swaths of prime storage space.
This long-term accumulation leads to multiple redundant copies: a primary version, a backup, and a disaster recovery (DR) version. This, of course, can triple storage requirements and associated costs. Ironically, much of this data is rarely accessed or “cold,” yet may be sitting on expensive storage devices in the data center or in high-performance file storage in the cloud.
At the same time, business leaders are realizing that file data is an untapped asset for analytics and artificial intelligence. When managed intelligently, this data can improve customer relationships, support product innovation and inform strategic decisions. But doing so affordably and securely requires a shift in how file data is stored and protected.
The Ransomware Threat to File Data
File data is particularly vulnerable to ransomware attacks because it is widely accessed across users, groups, and systems. This broad exposure means that even one compromised user account can lead to a widespread infection. Since file systems are often interconnected, ransomware can silently spread through the network before being detected.
Given the complexity and distributed nature of file data, it’s no surprise that it represents one of the largest attack surfaces for ransomware damage. Ignoring this exposure is no longer an option. A comprehensive ransomware strategy must include file data protection, and that’s where data tiering becomes crucial.
What Is File-Level Data Tiering?
File-level data tiering is a method of reducing the cost and risk of storing cold file data while ensuring a non-disruptive experience for data access and ongoing mobility. An unstructured data management system first scans files across storage and then identifies files that are no longer active. The timeframe for labeling or “tagging” data as “cold” can vary, from three months for medical images and closed legal cases to one year for user documents.
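Conceptually, the cold-data scan reduces to comparing each file’s last-access time against a policy threshold. The minimal Python sketch below illustrates the idea; the one-year threshold and the scanned mount point are illustrative assumptions, and commercial tools perform this analysis against storage metadata at far greater scale.

```python
import os
import time

COLD_AFTER_DAYS = 365  # illustrative policy: files untouched for a year are "cold"
cutoff = time.time() - COLD_AFTER_DAYS * 86400

def find_cold_files(root):
    """Yield paths whose last-access time predates the policy cutoff."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    yield path
            except OSError:
                continue  # skip files that vanish or deny access mid-scan

for path in find_cold_files("/mnt/shared"):  # hypothetical NAS mount point
    print(path)
```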
The data management system then moves the tagged cold data from expensive primary storage (typically a network-attached storage, or NAS, system) to an economical secondary location, such as cloud object storage, which has higher latency but a fraction of the cost per terabyte (TB). Unlike block-level tiering, which occurs behind the scenes within storage systems, file-level tiering operates on entire files and should deliver a transparent experience for end-users and applications.
Tiered files still appear in the same folder structure and can be opened as usual, but the actual data is stored elsewhere; a user may never notice the difference. This transparency eliminates the need to change applications or user workflows.
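To make the mechanics concrete, here is a deliberately simplified sketch of the move itself, assuming an S3-compatible object store reached through boto3 and a hypothetical bucket name. Real tiering products integrate with the filesystem so the pointer behaves like the original file; the plain JSON stub written here only records where the data went.

```python
import json
import os

import boto3  # pip install boto3; assumes credentials are already configured

s3 = boto3.client("s3")
BUCKET = "cold-tier-archive"  # hypothetical destination bucket

def tier_file(path):
    """Upload a cold file to object storage and leave a tiny pointer behind."""
    key = path.lstrip("/")
    s3.upload_file(path, BUCKET, key)  # copy the data to the cold tier
    stub = {"tiered": True, "bucket": BUCKET, "key": key,
            "size": os.path.getsize(path)}
    with open(path, "w") as f:         # replace the file body with a pointer
        json.dump(stub, f)

tier_file("/mnt/shared/projects/2019/report.docx")  # illustrative path
```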
The Financial and Security Benefits
Offloading cold files from high-performance storage to a more cost-effective archive platform offers dramatic savings. By reducing the active storage, backup, and disaster recovery (DR) footprint, organizations can reduce their total storage-related costs by more than 70% annually.
Here’s an example based on a 1-PB environment (a quick check of the arithmetic appears after the list):
- Cold Data Identified: 80% of total volume (819 TB)
- Annual Storage Costs (Traditional Setup): $2.59 million
- Annual Costs with Tiering in Place: $770,000
- Annual Savings: $1.82 million, or 70%
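The figures above are internally consistent, as a quick back-of-the-envelope check in Python confirms, using the numbers as stated and 1 PB = 1,024 TB:

```python
TOTAL_TB = 1024          # a 1-PB environment
COLD_FRACTION = 0.80

cold_tb = TOTAL_TB * COLD_FRACTION
print(f"Cold data: {cold_tb:.0f} TB")  # -> Cold data: 819 TB

traditional_cost = 2_590_000  # annual storage costs, traditional setup ($)
tiered_cost = 770_000         # annual costs with tiering in place ($)

savings = traditional_cost - tiered_cost
print(f"Savings: ${savings:,} ({savings / traditional_cost:.0%})")
# -> Savings: $1,820,000 (70%)
```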
Beyond cost savings, tiering also shrinks the ransomware attack surface. Since cold files are no longer actively stored in the primary environment, they are effectively removed from the reach of most threats. This shift can reduce the volume of vulnerable data by up to 80%, significantly lowering both the probability and impact of an attack. Additionally, immutable storage options can be used for cold data, preventing modifications and deletions and further enhancing ransomware resilience.
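As one concrete illustration of immutability, S3-compatible object stores offer write-once retention through Object Lock. The sketch below assumes a bucket created with Object Lock enabled, a hypothetical object path, and an arbitrary one-year retention window.

```python
from datetime import datetime, timedelta, timezone

import boto3  # assumes credentials and an Object Lock-enabled bucket

s3 = boto3.client("s3")
retain_until = datetime.now(timezone.utc) + timedelta(days=365)  # illustrative window

# Write a cold object that cannot be modified or deleted until the date passes.
with open("/mnt/shared/projects/2019/report.docx", "rb") as f:  # hypothetical path
    s3.put_object(
        Bucket="cold-tier-archive",      # hypothetical Object Lock-enabled bucket
        Key="projects/2019/report.docx",
        Body=f,
        ObjectLockMode="COMPLIANCE",     # even administrators cannot lift it early
        ObjectLockRetainUntilDate=retain_until,
    )
```

Compliance mode is the stricter of the two lock modes; governance mode allows suitably privileged users to override retention before it expires.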
An added benefit: because only the remaining 20% of the data needs premium protection, one can afford to deploy a more extensive (and likely more expensive) ransomware solution for it.
Tiering Without Disruption Boosts Productivity and Departmental Buy-In
Unlike traditional data migrations or storage refreshes, file data tiering operates quietly in the background. File access remains unchanged for users and applications, which ensures operational continuity while improving security and reducing costs.
Ideally, an unstructured data management solution can meet the organization’s needs, allowing IT to establish policies for identifying which files to tier based on factors such as age, usage patterns, file types, or ownership. However, there is even more power and potential cost savings when authorized departmental users review their own files and tag data sets that they deem ready for tiering.
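Such policies are usually expressed declaratively. The snippet below is a hypothetical illustration of how age, file-type, and ownership criteria might combine into tiering rules; it is not any particular product’s policy format.

```python
import fnmatch
import time

# Hypothetical policies: medical images cold after 90 days,
# research-owned documents cold after a year.
POLICIES = [
    {"pattern": "*.dcm",  "max_idle_days": 90,  "owner": None},
    {"pattern": "*.docx", "max_idle_days": 365, "owner": "research"},
]

def should_tier(name, last_access, owner):
    """Return True if any policy marks this file as ready for tiering."""
    idle_days = (time.time() - last_access) / 86400
    for p in POLICIES:
        if fnmatch.fnmatch(name, p["pattern"]) \
                and idle_days > p["max_idle_days"] \
                and (p["owner"] is None or p["owner"] == owner):
            return True
    return False

print(should_tier("scan_0042.dcm", time.time() - 120 * 86400, "radiology"))  # -> True
```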
This allows IT leaders to optimize tiering strategies collaboratively, maximizing cost savings for the larger organization, and for individual departments if they are on a chargeback plan. Additionally, a non-disruptive tiering technique that enables direct, native access at the destination means that researchers, data scientists, and other data stakeholders can rest assured that the data they designate for archives will be readily available when needed for future AI or big data analytics projects.
In an era where CIOs and other IT leaders face pressure to be more efficient, avoiding costs wherever possible while mitigating security risks and preparing data for AI, there is a significant amount at stake. File-level data tiering offers a practical, high-impact solution that preserves the data estate for competitive advantage.
To review, file-level tiering delivers:
- Cost Optimization: By removing inactive data from expensive storage environments and backup workflows and by reducing the amount of data to be protected from ransomware, you can save 70-80% on storage and backups annually.
- Cost Avoidance: By reducing the capacity of data stored on expensive primary storage, you avoid the need to purchase additional primary storage. In today’s market, with prices rising due to tariffs, this is valuable additional insurance.
- Risk Reduction: By limiting the attack surface and exposure of sensitive or stale files, you strengthen your ransomware defense.
- Future-Readiness: By enabling the secure and cost-effective retention of data, you create valuable repositories for unstructured data, supporting AI and analytics initiatives.
Final Thoughts
File data has quietly become one of the most costly and risky assets in the enterprise. But with smart, automated strategies like file-level data tiering, CIOs can dramatically reduce costs, boost security, and prepare their organizations for AI.