The Daily Singapore

Singapore news, every day

Singapore's War on Duplicate Images: What Officials, Experts and Key Figures Are Saying

From government digital archives to e-commerce platforms, the push to detect and replace duplicate images is reshaping how Singapore manages its visual data infrastructure.

By Singapore News Desk · Published 5 July 2026 at 2:40 am

Updated 5 July 2026 at 10:15 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence. Read our editorial standards.

Singapore's information and communications sector is confronting a surprisingly stubborn problem: duplicate images clogging digital systems, inflating storage costs, and undermining the reliability of databases used across government services, healthcare and e-commerce. The issue has moved from back-end nuisance to boardroom priority, with technologists and public-sector voices growing louder about the need for systematic solutions.

The timing is not accidental. As Singapore deepens its push to become a regional artificial intelligence hub (anchored by investments under the National AI Strategy 2.0, launched in December 2023), the quality of training data has become a critical sticking point. Duplicate images in datasets skew machine-learning models toward over-represented examples, produce unreliable outputs, and waste compute cycles that translate directly into higher operational costs.

What the Experts Are Flagging

Researchers at the National University of Singapore's School of Computing have long pointed to data hygiene as a foundational requirement for any serious AI deployment. The problem is structural: organisations accumulate images across multiple platforms, departments upload the same assets independently, and legacy migration projects copy files without deduplication checks. The result is digital clutter that compounds over time.

The Infocomm Media Development Authority, which oversees Singapore's digital infrastructure policy from its offices at 10 Pasir Panjang Road, has included data quality standards in its guidelines for government agencies adopting cloud services. Industry practitioners speaking at events such as the annual Singapore FinTech Festival have flagged deduplication as low-hanging fruit in enterprise data management: technically straightforward but frequently deprioritised.

The Government Technology Agency (GovTech), whose Singapore Government Tech Stack underpins many public-sector applications, handles enormous volumes of digital assets. Keeping those repositories clean is not a cosmetic concern. Storage at hyperscale facilities is typically priced at a few cents or less per gigabyte per month, but when duplicate images number in the tens of millions (a realistic figure for agencies managing decades of scanned documents and public communications), the cumulative waste becomes material.
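A back-of-envelope sketch shows how small per-gigabyte prices add up. Every input here (the duplicate count, the average file size and the price per gigabyte) is a hypothetical assumption chosen for illustration, not a figure reported by any agency or provider.

```python
def monthly_duplicate_cost(duplicate_count, avg_size_mb, price_per_gb_month):
    """Return the monthly cost (in dollars) of storing duplicate files."""
    wasted_gb = duplicate_count * avg_size_mb / 1000  # decimal gigabytes
    return wasted_gb * price_per_gb_month

# Hypothetical inputs: 50 million duplicates, 10 MB each, 2 cents per GB-month.
cost = monthly_duplicate_cost(50_000_000, 10.0, 0.02)
print(f"Wasted per month: ${cost:,.2f}")
print(f"Wasted per year:  ${cost * 12:,.2f}")
```

Backups and replicas multiply the raw figure, so the real cost depends heavily on how many copies of each repository an organisation keeps.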

Platforms and Practical Pressure

The pressure is equally sharp in the private sector. Lazada and Shopee, both operating significant regional logistics and seller-support functions out of Singapore, maintain product image libraries running into the hundreds of millions of entries. Duplicate listings, where the same product image appears under multiple SKUs or seller accounts, degrade search accuracy and erode buyer trust. Both platforms have invested in perceptual hashing, a technique that detects near-identical images even after resizing, recompression or minor edits, cases in which byte-level checksums no longer match.
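The article does not say which algorithm either platform uses. As an illustration only, the sketch below implements average hashing, one of the simplest perceptual hashes, in pure Python. The 4x4 pixel grids are hypothetical stand-ins for thumbnails; real tools would first decode each image and shrink it to a small grayscale grid.

```python
def average_hash(pixels):
    """Compute a simple average hash from a 2D grid of grayscale values (0-255).

    Each bit is 1 if the pixel is brighter than the grid's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")

# Hypothetical 4x4 thumbnail.
original = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
# The same picture after mild brightening, as an edit or re-export might cause.
brightened = [[min(255, v + 10) for v in row] for row in original]
# A different picture entirely.
different = [
    [50, 200, 50, 200],
    [200, 50, 200, 50],
    [50, 200, 50, 200],
    [200, 50, 200, 50],
]

print(hamming_distance(average_hash(original), average_hash(brightened)))  # 0
print(hamming_distance(average_hash(original), average_hash(different)))   # 8
```

A small Hamming distance suggests a near-duplicate. The cut-off is a tuning decision that trades missed duplicates against false matches.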

The Singapore Tourism Board, which manages digital asset libraries for campaigns promoting destinations from Orchard Road to the Marina Bay waterfront, conducted an internal audit of its image repository infrastructure in 2024. The exercise, referenced in its annual report, identified redundancy reduction as a cost-saving measure ahead of expanded digital campaign spending.

Healthcare is another pressure point. SingHealth, which operates the Singapore General Hospital at Outram Road among other facilities, manages medical imaging archives where duplicate scans, uploaded by different departments or during system migrations, can create patient record inconsistencies. The Ministry of Health's Electronic Medical Records exchange programme requires participating institutions to maintain data integrity standards that explicitly address redundant file storage.

For organisations still working through the problem, practitioners recommend a three-stage approach: first, an automated audit that uses checksums such as MD5 to identify exact duplicates and perceptual hashing to identify near-duplicates; second, a governance review to determine which version is canonical; and third, an integration checkpoint in upload workflows to prevent new duplicates from entering the system. Cloud providers including AWS and Google Cloud, both of which have Singapore data centre regions, expose per-object checksums that can support such an audit, but their object storage does not deduplicate files automatically, so the logic typically has to be built into the workflow itself.
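The three stages can be sketched for exact duplicates as follows. This is a minimal illustration under stated assumptions, not any organisation's implementation: the file names and contents are hypothetical, the "keep the alphabetically first name" rule is a placeholder for a real governance decision, and SHA-256 is swapped in for the MD5 mentioned above because MD5 is no longer collision-resistant.

```python
import hashlib
from collections import defaultdict

def digest(data: bytes) -> str:
    """Return a hex digest that identifies exact file contents."""
    return hashlib.sha256(data).hexdigest()

def audit(files: dict) -> dict:
    """Stage 1: group file names by digest; keep only groups with duplicates."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[digest(data)].append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}

def pick_canonical(names: list) -> str:
    """Stage 2: placeholder governance rule (hypothetical): keep the first name alphabetically."""
    return min(names)

class UploadGate:
    """Stage 3: reject uploads whose exact contents are already stored."""

    def __init__(self):
        self.seen = {}  # digest -> name of the first file stored with it

    def try_upload(self, name: str, data: bytes) -> bool:
        d = digest(data)
        if d in self.seen:
            return False  # duplicate content, do not store again
        self.seen[d] = name
        return True

# Hypothetical repository: two departments uploaded the same logo.
files = {
    "comms/logo.png": b"PNGDATA-1",
    "marketing/logo_copy.png": b"PNGDATA-1",
    "comms/banner.png": b"PNGDATA-2",
}
for names in audit(files).values():
    print("keep", pick_canonical(names), "out of", sorted(names))

gate = UploadGate()
print(gate.try_upload("comms/logo.png", b"PNGDATA-1"))       # True
print(gate.try_upload("hr/logo_again.png", b"PNGDATA-1"))    # False
```

Near-duplicates would need a perceptual hash and a distance threshold in place of the exact digest comparison.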

The broader lesson emerging from Singapore's digital sector is that data quality cannot be retrofitted cheaply after a system scales. Agencies and companies that build deduplication into workflows from the outset — rather than treating it as a clean-up task — are reporting measurably lower storage costs and faster model training times. The conversation has shifted from whether to act to how quickly the fix can be embedded into standard operating procedure.

About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
