The Daily Singapore

How Singapore's Digital Records Got Flooded With Duplicate Images — And What It Took to Fix It

A long-running problem in government and corporate data systems has finally pushed agencies and companies to act, but the road to cleaner digital archives was longer and messier than anyone planned.

By Singapore News Desk · Published 5 July 2026 at 2:58 am

Updated 5 July 2026 at 11:42 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence.

Photo: Angelyn Sanjorjo on Pexels

Singapore's push to digitise everything — from HDB flat application records to hawker centre licensing files — has produced an unintended side effect: vast repositories bloated with duplicate images that slow systems, inflate storage costs, and quietly undermine the accuracy of public records. The problem did not arrive overnight. It accumulated across more than a decade of agency-by-agency digitisation drives that prioritised speed over data hygiene.

The issue matters now because the Government Technology Agency of Singapore, known as GovTech, has declared 2026 a year of data infrastructure consolidation. Following the expansion of the Moments of Life platform and the integration of SingPass face verification into dozens more agencies, the volume of image assets held across government systems has grown sharply. Duplicate photographs, scanned identity documents uploaded multiple times, and redundant property images in the Housing and Development Board's online portal represent a category of technical debt that costs real money and creates real compliance risk under Singapore's Personal Data Protection Act.

How the Duplication Happened

The roots of the problem trace back to the early 2010s, when individual ministries and statutory boards digitised their own paper archives independently. The Ministry of Manpower scanned work permit photographs on a separate pipeline from the Immigration and Checkpoints Authority's own biometric capture systems. The Urban Redevelopment Authority built its own image library for planning submissions. When agencies later needed to share data, files were often copied rather than referenced, creating multiple instances of the same image sitting in different silos across Hive, the government's central data platform.

Commercial operators ran into the same trap. Real estate portals serving the Orchard Road and Tanjong Pagar office submarkets discovered that property listing photographs uploaded by agents were being re-uploaded by landlords and then again by sub-agents, sometimes with minor cropping that defeated basic hash-matching deduplication tools. By the time platforms began auditing their libraries in 2024, some listings carried five or six near-identical images with slightly different file names and timestamps — each counted as a separate asset, each consuming bandwidth and storage billing.

The technical fix — duplicate image detection using perceptual hashing and, more recently, convolutional neural network similarity scoring — has existed for years. The challenge was organisational, not algorithmic. Agencies operating under different procurement cycles and different data governance charters had no shared incentive to clean house until the Personal Data Protection Commission began signalling in 2023 that holding unnecessary copies of personal images constituted a data minimisation failure under PDPA obligations.
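To illustrate why exact hash matching misses near-identical copies while perceptual hashing can catch them, here is a minimal sketch in Python. It is not drawn from any agency's or platform's actual tooling. The function names, the tiny 4x4 grayscale grids, and the distance threshold are all assumptions made for illustration, and a uniform brightness shift stands in for the kind of minor edit or re-encoding that changes a file's bytes without changing what it shows. A real pipeline would first decode each image and downscale it to a small grid.

```python
import hashlib

def exact_digest(pixels):
    """Cryptographic hash of the raw pixel bytes: any change alters it."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

def average_hash(pixels):
    """Simple perceptual hash: one bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")

def is_near_duplicate(img_a, img_b, max_distance=2):
    """Hypothetical rule: treat images as duplicates if few hash bits differ."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance

# Illustrative data only: a gradient, a slightly brighter copy, and a checkerboard.
original = [[10, 20, 30, 40], [50, 60, 70, 80],
            [90, 100, 110, 120], [130, 140, 150, 160]]
variant = [[p + 3 for p in row] for row in original]
different = [[200, 10, 200, 10], [10, 200, 10, 200],
             [200, 10, 200, 10], [10, 200, 10, 200]]

print(exact_digest(original) == exact_digest(variant))  # exact match fails
print(is_near_duplicate(original, variant))             # perceptual match succeeds
print(is_near_duplicate(original, different))           # unrelated image rejected
```

In this sketch the brighter copy produces a different SHA-256 digest but the same average hash, so only the perceptual comparison flags it. The threshold would need tuning against real data, and heavier crops generally call for more robust hashes or the learned similarity models mentioned above.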

Where Things Stand in Mid-2026

GovTech's Government Commercial Cloud, which houses services for more than 140 public agencies, introduced a mandatory deduplication audit requirement for image assets as part of its updated cloud onboarding checklist in January 2026. The change applies to all new workloads and to existing systems undergoing major upgrades. Agencies migrating to the Singapore Government Tech Stack are expected to demonstrate that image storage has been audited before migration is approved.

On the private side, the Infocomm Media Development Authority's AI Trailblazers programme, which has placed applied AI teams inside participating firms along Fusionopolis Way in one-north, has included duplicate image detection as one of the standard use cases for computer vision pilots since late 2024. Several logistics and e-commerce companies in the Jurong Innovation District have used the programme to clean product image catalogues running into the millions of files.

The practical result for organisations still sitting on uncleaned archives is straightforward: the longer they wait, the more expensive the audit becomes. Storage costs on commercial cloud are billed monthly, and image files — especially high-resolution scans and photographs — are among the largest contributors to bloat. Organisations that have completed deduplication exercises report meaningful reductions in active storage, though specific figures vary widely by sector and should be verified against individual audit reports rather than treated as benchmarks.

For individuals, the near-term implication is more reliable digital records — fewer cases where an HDB or SingPass profile carries conflicting photographs from different upload events. For IT teams across both public agencies and private firms, the work of cleaning legacy image stores is ongoing, unglamorous, and, increasingly, non-optional.

About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
