The Daily Singapore

Singapore news, every day

News

By the Numbers: Singapore's War on Duplicate Images Is Reshaping How the City Manages Its Digital Archives

New data from government agencies and tech firms reveals the staggering scale of redundant imagery cluttering Singapore's public and commercial digital infrastructure — and the cost of doing nothing about it.


By Singapore News Desk · Published 5 July 2026 at 2:36 am


Updated 5 July 2026 at 5:12 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Saksham Vikram on Pexels

Singapore's digital archiving problem has a number attached to it: roughly 40 percent of images stored across government-linked content management systems are estimated to be duplicates or near-duplicates, according to figures cited in a 2025 infocomm audit framework published by the Infocomm Media Development Authority. That single statistic has quietly energised a push by agencies, property portals, and media organisations to automate the detection and replacement of redundant visual files before the city's data storage bills spiral further out of control.

The timing matters. Singapore is midway through its Smart Nation 2.0 drive, with billions of dollars committed to cloud migration and AI-ready infrastructure. Storing redundant image files at scale is not merely an aesthetic problem: it inflates storage costs, slows content delivery networks, and undermines the metadata integrity that AI training pipelines depend on. With the Government Technology Agency, known as GovTech, rolling out centralised digital asset management tools across ministries through 2026, the duplicate image question has shifted from a housekeeping footnote to a line item that budget officers are actively scrutinising.

What the Numbers Actually Show

The scale is worth spelling out concretely. A single mid-sized Singapore property portal, the kind serving listings across Toa Payoh, Tampines, and the Central Business District, can accumulate upward of 2 million listing photographs per year. Industry practitioners have estimated that between 25 and 35 percent of those images are duplicates introduced at the point of upload, when agents re-list the same unit under a different entry. That is 500,000 to 700,000 redundant files per portal each year. Multiply that across the half-dozen major platforms operating here and the redundant file count comes to at least 3 to 4 million annually.

Storage is not free. On commercial cloud platforms priced for the Singapore market, object storage for unstructured data, the category that covers image files, runs at approximately S$0.025 per gigabyte per month for standard-tier access. A single high-resolution property photograph averages around 4 megabytes after processing. Ten million duplicate files at that size represent roughly 40 terabytes (40,000 gigabytes) of avoidable storage, which at the quoted rate comes to about S$1,000 a month. Over a three-year contract cycle, that adds up to roughly S$36,000 in storage charges alone.
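The storage arithmetic in this section can be checked in a few lines. The sketch below is a back-of-envelope calculation, not a pricing model: the function name is hypothetical, decimal units are assumed (1 GB = 1,000 MB), and the inputs are the figures quoted above.

```python
# Back-of-envelope storage cost using the figures quoted in the article.
# Assumption: decimal units, i.e. 1 GB = 1,000 MB and 1 TB = 1,000 GB.

def duplicate_storage_cost(n_files, mb_per_file, sgd_per_gb_month, months=1):
    """Return (terabytes stored, cost in S$) for n_files kept for `months`."""
    gigabytes = n_files * mb_per_file / 1_000
    terabytes = gigabytes / 1_000
    cost = gigabytes * sgd_per_gb_month * months
    return terabytes, cost


# 10 million duplicates, 4 MB each, S$0.025 per GB per month.
tb, monthly = duplicate_storage_cost(10_000_000, 4, 0.025)
_, three_year = duplicate_storage_cost(10_000_000, 4, 0.025, months=36)

print(f"{tb:.0f} TB of duplicate files")
print(f"S${monthly:,.0f} per month")
print(f"S${three_year:,.0f} over 36 months")
```

The result covers at-rest storage at the quoted rate only. Request fees, data transfer, backups, and replicated copies are billed separately on most platforms and are not modelled here.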

The National Library Board's digital preservation arm, which maintains the NewspaperSG and PictureSG archives at Victoria Street, has been grappling with a related but distinct problem: historical image deduplication across collections digitised at different resolutions and under different scanning contracts over two decades. The challenge there is that perceptual hashing algorithms — the standard tool for identifying visually identical images — struggle with scanned analogue originals where lighting variation and paper yellowing create artificial differences between what are functionally the same photograph.
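To make the scanning problem concrete, here is a minimal sketch of one common perceptual hash, the average hash, written in plain Python. It is an illustration only and not the method any of the organisations named here is confirmed to use. The images are synthetic greyscale grids invented for the example: a uniform brightness change leaves the fingerprint untouched, while uneven lighting across the page flips some of its bits.

```python
def average_hash(pixels, hash_size=8):
    """Average hash of a greyscale image given as rows of 0-255 integers.

    Assumes width and height are exact multiples of hash_size.
    Returns the fingerprint as an integer of hash_size * hash_size bits.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [pixels[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            cells.append(sum(block) / len(block))

    # One bit per cell: set when the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images (hypothetical data, not real scans).
original = [[6 * x + 3 * y for x in range(16)] for y in range(16)]
brighter = [[p + 20 for p in row] for row in original]                  # uniform shift
uneven = [[p + 5 * y for p in row] for y, row in enumerate(original)]  # lighting gradient

print("uniform shift:", hamming(average_hash(original), average_hash(brighter)))
print("uneven lighting:", hamming(average_hash(original), average_hash(uneven)))
```

A deduplication pass treats two images as the same when the Hamming distance falls below a chosen cut-off, so distortions that flip many bits push genuine duplicates outside that cut-off.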

Automation Is Catching Up, But Slowly

The practical response from Singapore's tech sector has centred on perceptual hashing, SSIM scoring, and more recently, embedding-based similarity search using vector databases. Several local firms operating out of one-north's Fusionopolis cluster have built commercial duplicate-detection pipelines sold to media companies and e-commerce operators. The approach typically runs incoming images against an indexed fingerprint database, flags matches above a configurable similarity threshold, and routes flagged files to an automated replacement or deletion workflow.
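The flow described above (fingerprint lookup, configurable threshold, routing) can be sketched as follows. This is a simplified illustration under stated assumptions, not any vendor's pipeline: the class and function names are hypothetical, fingerprints are assumed to be 64-bit perceptual hashes computed elsewhere, and the linear scan stands in for the indexed or vector search a production system would use.

```python
from dataclasses import dataclass, field

HASH_BITS = 64  # assumed fingerprint width


def similarity(a, b):
    """Share of matching bits between two fingerprints, from 0.0 to 1.0."""
    return 1.0 - bin(a ^ b).count("1") / HASH_BITS


@dataclass
class FingerprintIndex:
    threshold: float = 0.90                      # configurable similarity cut-off
    entries: dict = field(default_factory=dict)  # asset_id -> fingerprint

    def best_match(self, fingerprint):
        """Return (asset_id, score) of the closest stored fingerprint."""
        best_id, best_score = None, 0.0
        for asset_id, known in self.entries.items():
            score = similarity(fingerprint, known)
            if score > best_score:
                best_id, best_score = asset_id, score
        return best_id, best_score

    def ingest(self, asset_id, fingerprint):
        """Flag a likely duplicate, or store the fingerprint as a new asset."""
        match_id, score = self.best_match(fingerprint)
        if match_id is not None and score >= self.threshold:
            return ("flag", match_id, score)
        self.entries[asset_id] = fingerprint
        return ("store", None, score)


index = FingerprintIndex(threshold=0.90)
print(index.ingest("listing-101-a", 0xF0F0F0F0F0F0F0F0))  # first upload
print(index.ingest("listing-101-b", 0xF0F0F0F0F0F0F0F1))  # differs by one bit
print(index.ingest("listing-202-a", 0x0F0F0F0F0F0F0F0F))  # unrelated image
```

Flagged items are returned rather than deleted, leaving the replacement or deletion decision to a downstream workflow. The threshold trades missed duplicates against false matches and would need tuning on real data.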

GovTech's Whole-of-Government Application Analytics platform, which tracks usage metrics across more than 200 public-facing digital services, began incorporating image asset audit functionality in a January 2026 update. The target, according to documentation on the agency's developer portal, is to reduce redundant static assets across government websites by 30 percent before the end of the financial year ending March 2027.

For organisations that have not yet built automated pipelines, the practical starting point is an image audit using open-source tools such as dupeGuru or custom scripts built on OpenCV, followed by a deduplication pass before any cloud migration. Running that exercise before migrating to a new content management system — rather than after — avoids inheriting legacy redundancy at cloud-tier pricing. The math on that decision, at S$0.025 per gigabyte per month, writes itself.
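As a first step in such an audit, exact byte-for-byte duplicates can be found with the Python standard library alone. The sketch below is not dupeGuru or an OpenCV script. It swaps in a plain SHA-256 content hash, so it catches identical files only and misses resized or re-encoded copies. The function names and the list of file extensions are assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust for the archive being audited.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".tif", ".tiff"}


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group image files under `root` that have identical contents."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


def reclaimable_bytes(duplicate_groups):
    """Bytes freed if all but the first file in each group were removed."""
    return sum(p.stat().st_size
               for group in duplicate_groups
               for p in group[1:])
```

Calling `find_exact_duplicates("/path/to/images")` returns lists of matching paths, and `reclaimable_bytes` on that result gives a rough size for the saving before migration. The functions only report; nothing is deleted.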


About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
