Singapore's public and private sectors are sitting on tens of millions of duplicate image files scattered across enterprise servers, cloud storage buckets and content management systems — and the bill for storing them is growing. A July 2026 audit trail reviewed by The Daily Singapore shows that duplicate digital assets, particularly images, account for between 20 and 35 percent of total file storage consumption across mid-size organisations that have undergone internal data hygiene reviews in the past 18 months. That figure, drawn from infrastructure assessments conducted at data centres along one-north in Buona Vista, points to a structural inefficiency that IT departments have long acknowledged but rarely quantified publicly.
The timing matters. Singapore's Infocomm Media Development Authority (IMDA) has pushed hard under its Digital Industry Singapore framework to position the country as a regional AI and cloud hub, attracting hyperscale facilities from Google, Microsoft and AWS to sites in Jurong and Tuas. As those platforms scale, the cost of disorganised storage does not stay flat — it compounds. A single terabyte of cloud storage on a standard enterprise tier costs roughly S$25 to S$40 per month depending on the provider and redundancy tier. Multiply that by the volume of duplicate image data accumulating across hundreds of government statutory boards, retail platforms and media companies operating out of Marina Bay, Raffles Place and the Central Business District, and the aggregate waste runs into millions of dollars annually.
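The multiplication behind that estimate can be made concrete with a rough sketch. The function below uses only the S$25 to S$40 per terabyte per month range quoted above; the function name, its parameters and the idea of treating the bill as storage charges alone (no transfer, request or backup fees) are illustrative assumptions, not any provider's pricing model.

```python
from typing import Tuple


def annual_duplicate_cost(
    total_tb: float,
    duplicate_share: float,
    rate_low: float = 25.0,   # S$ per TB per month, low end quoted in the article
    rate_high: float = 40.0,  # S$ per TB per month, high end quoted in the article
) -> Tuple[float, float]:
    """Return the (low, high) yearly cost in S$ of storing duplicate data.

    total_tb        -- size of the whole image estate in terabytes
    duplicate_share -- fraction of that estate that is duplicated (0.0 to 1.0)
    """
    if not 0.0 <= duplicate_share <= 1.0:
        raise ValueError("duplicate_share must be between 0 and 1")
    duplicate_tb = total_tb * duplicate_share
    # Monthly rate times twelve months, applied to the duplicated volume only.
    return duplicate_tb * rate_low * 12, duplicate_tb * rate_high * 12


# One terabyte made up entirely of duplicates.
low, high = annual_duplicate_cost(total_tb=1.0, duplicate_share=1.0)
print(f"S${low:,.0f} to S${high:,.0f} per year")
```

On those rates, every terabyte of duplicate images costs S$300 to S$480 a year to keep, before any charges for replication or retrieval are counted.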
Where the Duplication Hides
The problem is not unique to any single sector, but e-commerce and real estate have emerged as particularly acute cases in Singapore. PropertyGuru and similar platforms host listing photographs that are frequently uploaded multiple times by agents working from different devices, with no automated deduplication layer stripping out identical or near-identical image files before they hit the database. A single condominium unit in Tampines or a shophouse along Ann Siang Road can accumulate dozens of functionally identical hero images across relisted entries, each stored separately. Industry practitioners estimate that between 15 and 25 percent of image assets on large listing portals are exact or near-exact duplicates — a range consistent with global benchmarks published by Cloudinary in its 2025 State of Visual Media report.
Government digital services face a parallel version of the same issue. The Smart Nation and Digital Government Office (SNDGO), which coordinates digital transformation across ministries, has in recent years pushed agencies toward centralised asset management under the Government on Commercial Cloud (GCC) programme. But migration from legacy systems, some dating to the early 2000s, brings legacy clutter with it. Image assets from old campaign microsites, public health advisories and heritage digitisation projects at institutions like the National Archives of Singapore along Canning Rise have in several cases been transferred wholesale, duplicates intact, rather than deduplicated at source.
The Cost of Doing Nothing
Deduplication is not a new technology. Perceptual hashing, a technique that generates a compact fingerprint for each image and flags near-matches even when file names or metadata differ, has been commercially available since the mid-2010s. Enterprise tools from vendors including NetApp and Veritas can cut image storage footprints by 20 to 40 percent on first deployment, according to product documentation from both companies. A mid-size Singapore retailer running its web infrastructure on AWS ap-southeast-1, the Singapore region, could theoretically reclaim S$4,500 to S$7,200 per year in storage costs from a one-time deduplication pass: a 30 percent duplication rate across 50 terabytes of image data leaves 15 terabytes of redundant files, billed at the S$25 to S$40 per terabyte per month cited above.
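For readers who want to see what a perceptual fingerprint looks like in practice, here is a minimal sketch of one of the simplest variants, usually called an average hash. It is not the algorithm used by any vendor named above. It assumes the image has already been decoded into a grid of grayscale values from 0 to 255 (a real tool would use an imaging library for that step), and the function names and the 5-bit match threshold are illustrative choices.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # grayscale pixels (0-255), one inner sequence per row


def downsample(pixels: Grid, size: int = 8) -> List[List[float]]:
    """Shrink a grayscale grid to size x size by averaging rectangular blocks."""
    rows, cols = len(pixels), len(pixels[0])
    if rows < size or cols < size:
        raise ValueError("image must be at least size x size pixels")
    small = []
    for i in range(size):
        r0, r1 = i * rows // size, (i + 1) * rows // size
        row = []
        for j in range(size):
            c0, c1 = j * cols // size, (j + 1) * cols // size
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Fingerprint with one bit per cell: 1 if the cell is brighter than the mean."""
    flat = [v for row in downsample(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# A synthetic 16x16 diagonal gradient stands in for a decoded photograph.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
# The same picture saved at twice the resolution (each pixel repeated 2x2).
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
print(is_near_duplicate(original, upscaled))
```

Because the fingerprint is built from relative brightness rather than raw bytes, a resized or slightly brightened copy lands on the same or a nearby hash, while a byte-for-byte checksum of the two files would differ completely. The threshold trades missed duplicates against false matches and would need tuning on real listing photographs.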
The practical path forward is not complicated, though it requires organisational will to execute. IT teams at companies operating from tech parks in Paya Lebar Quarter or the Sandcrawler building in one-north should begin with a baseline audit using open-source tools such as dupeGuru or fdupes before committing to enterprise licensing. For government agencies, SNDGO's GCC migration checklist already recommends pre-migration data rationalisation, guidance that digital leads at individual ministries would do well to treat as mandatory rather than advisory. Migration is the cheapest moment to deduplicate; once duplicate files are embedded in a live cloud environment, remediation costs roughly three times as much in staff hours as catching them at the door.
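A baseline audit of the kind described above can be approximated in a few dozen lines. The sketch below finds byte-identical image files by grouping on file size and then on a SHA-256 checksum, and totals the space that removing the extra copies would free. It is not the implementation of dupeGuru or fdupes; the function names and the list of file extensions are assumptions made for illustration, and it catches exact copies only, not resized or re-encoded ones.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Assumed set of extensions to audit; adjust for the estate being reviewed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: str) -> List[List[Path]]:
    """Return groups of image files under root whose contents are identical."""
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_size[path.stat().st_size].append(path)

    groups: List[List[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have an exact duplicate
        by_hash: Dict[str, List[Path]] = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def reclaimable_bytes(groups: List[List[Path]]) -> int:
    """Bytes freed if every group keeps one copy and drops the rest."""
    return sum(g[0].stat().st_size * (len(g) - 1) for g in groups)
```

Running `find_exact_duplicates` on a staging copy of the files before migration, then passing the result to `reclaimable_bytes`, gives a first number to put in front of a budget holder. Comparing sizes first means that only files sharing a size are ever read and hashed, which keeps the audit quick on large estates.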