The Daily Singapore

Singapore news, every day

By the Numbers: Singapore's War on Duplicate Images Is Bigger Than You Think

From HDB listing portals to government archives, the scale of duplicate image pollution across Singapore's digital infrastructure runs into the tens of millions — and cleaning it up carries a measurable cost.

By Singapore News Desk · Published 5 July 2026 at 2:57 am

Updated 5 July 2026 at 11:57 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Rippeon, Ryan. / Public domain (Wikimedia Commons)

Singapore's public and private digital repositories hold an estimated 40 to 60 million redundant image files — duplicates created through years of re-uploads, format conversions, and database migrations — according to data management assessments conducted across several government-linked technology programmes. The figure, drawn from internal audits reviewed by The Daily Singapore, points to a storage problem that costs organisations here real money every quarter.

The issue matters now because Singapore is accelerating its Smart Nation 2.0 push, a whole-of-government initiative that funnels billions of dollars into consolidated cloud infrastructure. When storage is bloated with duplicate assets, the financial waste compounds: cloud hosting is billed by the gigabyte, and duplicates contribute nothing to retrieval quality or user experience. The Infocomm Media Development Authority, which oversees the national digital architecture, has flagged data hygiene — including deduplication — as a priority in its ongoing Digital Government Blueprint refresh.

The Numbers Behind the Clutter

Consider HDB's public property portal. The Housing & Development Board lists tens of thousands of resale flat transactions annually, and each listing typically carries between six and twelve photographs. When sellers re-list, photographs are routinely re-uploaded rather than linked from existing records. An independent audit of publicly accessible property portals in the Toa Payoh and Tampines estates, two of Singapore's most active resale markets, found that roughly 23 percent of all stored assets were duplicates, based on hash-comparison methodology. That is not a fringe problem. At a conservative cloud storage rate of S$0.025 per gigabyte per month on a hyperscaler platform such as AWS's Singapore Region, a 500,000-image repository carrying a 23 percent duplication rate wastes approximately S$1,800 to S$2,400 annually on storage alone, before factoring in bandwidth and indexing overhead. That range is consistent only with an average file size of roughly 50 to 70 MB per image; at a few megabytes per image, the storage cost would be closer to S$100 to S$200 a year.
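The storage estimate can be reproduced with a short calculation. The sketch below is illustrative only: the function name and the average file sizes are assumptions, since no per-image size is given alongside the figures; the image count, duplication rate, and price come from the paragraph above.

```python
def annual_duplicate_storage_cost(total_images, duplicate_rate,
                                  avg_image_mb, price_per_gb_month):
    """Estimate yearly spend on storing duplicate images (decimal GB)."""
    duplicate_images = total_images * duplicate_rate
    duplicate_gb = duplicate_images * avg_image_mb / 1000  # 1 GB = 1000 MB
    return duplicate_gb * price_per_gb_month * 12


# From the article: 500,000 images, 23 percent duplicates, S$0.025 per GB-month.
# avg_image_mb is NOT from the article; these values are illustrative assumptions.
for avg_mb in (3, 52, 70):
    cost = annual_duplicate_storage_cost(500_000, 0.23, avg_mb, 0.025)
    print(f"{avg_mb} MB average -> about S${cost:,.0f} per year")
```

Because the result scales linearly with file size, any organisation running the same sum should measure its own average image size rather than borrow one.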

The National Archives of Singapore, housed on Canning Rise and responsible for preserving millions of physical and digital records, digitised more than 1.1 million items between 2018 and 2024 under its digitisation roadmap. Archivists working on similar collections in comparable city-states have documented deduplication savings of up to 18 percent on total storage footprints after systematic image-hash audits. Applied to the 1.1 million items digitised in Singapore, and assuming the saving carried over evenly by file count, that figure would translate to roughly 200,000 files flagged for review.

Across the private sector, the pattern repeats. Lazada Singapore and Shopee both operate product image databases containing hundreds of millions of assets, and both have invested in automated deduplication pipelines since 2022. The commercial stakes are direct: faster page-load times improve conversion rates, and Singapore's e-commerce sector recorded gross merchandise value of roughly S$8.5 billion in 2024, according to figures published by the Singapore Department of Statistics. Shaving even fractions of a second off load times — enabled partly by leaner image libraries — has documented revenue impact in the sector.

What Comes Next for Organisations Carrying the Load

The practical toolkit for deduplication has matured quickly. Perceptual hashing algorithms, which detect visually near-identical images even when file names and metadata differ, are now embedded in enterprise content management platforms used by organisations such as the Government Technology Agency at Sandcrawler Building in one-north, Buona Vista. GovTech has been piloting automated image governance tools across several government websites as part of the Whole-of-Government Application Analytics programme.
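To make the idea of perceptual hashing concrete, here is a minimal sketch of one simple variant, the average hash, written in plain Python. It is a toy illustration and not the method used by GovTech or any named platform: the function names are hypothetical, images are represented as lists of grayscale rows, and the code assumes the image dimensions are multiples of the hash size.

```python
def average_hash(pixels, hash_size=8):
    """Toy average hash of a grayscale image given as a list of rows (0-255).

    Assumes width and height are multiples of hash_size, so each cell is a
    whole block of pixels. Production libraries resample arbitrary sizes.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += pixels[y][x]
            cells.append(total / (block_h * block_w))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if the cell is brighter than the overall mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


# Synthetic 16x16 gradient, a slightly brightened copy, and an inverted copy.
original = [[x * 16 + y for x in range(16)] for y in range(16)]
brightened = [[min(255, p + 10) for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(brightened)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

In this synthetic case the brightened copy produces the same hash as the original (distance 0), while the inverted image differs in all 64 bits. A real pipeline would flag pairs whose distance falls below a chosen threshold, and that threshold is a tuning decision rather than a fixed rule.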

For smaller operators — the Orchard Road retailers managing their own e-commerce storefronts, or the Jurong West community clubs archiving event photography — the barrier remains human bandwidth rather than technology cost. Open-source tools including dupeGuru and rdfind can process thousands of files in minutes on standard hardware, at zero licensing cost.

Organisations that have not yet conducted a baseline image audit should treat the first step as a pure data exercise: count total stored images, run a hash comparison, and establish a duplication rate. That single number, once known, makes the business case for remediation almost automatic. In Singapore's cloud-first digital environment, redundancy is not merely an aesthetic inconvenience — it is a line item on every quarterly infrastructure bill.
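The three audit steps described above (count, hash, compute a rate) can be sketched with the Python standard library. This is a minimal example under stated assumptions: the function names and the list of image extensions are hypothetical, and SHA-256 catches only byte-identical copies, so duplicates created by format conversion or resizing would need a perceptual hash instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of extensions to treat as images; adjust for the repository.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_duplicates(root):
    """Return (total_images, duplicate_images, duplication_rate) for a folder tree."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_sha256(path)].append(path)
    total = sum(len(paths) for paths in groups.values())
    # Every file beyond the first in a group is counted as a duplicate.
    duplicates = sum(len(paths) - 1 for paths in groups.values())
    rate = duplicates / total if total else 0.0
    return total, duplicates, rate


# Example usage (the path is a placeholder):
# total, duplicates, rate = audit_duplicates("/path/to/images")
# print(f"{duplicates} of {total} images are duplicates ({rate:.1%})")
```

The function only reports; it deletes nothing. Keeping the audit read-only means the duplication rate can be established before anyone decides which copies to remove.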

About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
