The Daily Singapore

Singapore news, every day

Singapore's Digital Archives Push Hits Snag as Duplicate Image Problem Surfaces Across Government Platforms

A wave of digitisation projects has exposed a persistent data quality headache: thousands of redundant image files clogging public-sector databases and slowing down AI-driven services.

By Singapore News Desk · Published 5 July 2026 at 2:58 am

Updated 5 July 2026 at 11:42 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence.

Photo: Faheem Ahamad on Pexels

Singapore's accelerating drive to digitise public records hit a practical wall this week when technology teams across several government agencies flagged a growing problem with duplicate images embedded in shared databases — redundant files that are inflating storage costs, degrading search performance, and complicating the rollout of AI-powered public services scheduled for later this year.

The timing is sensitive. The Smart Nation and Digital Government Office has set a year-end target for expanding AI-assisted citizen services across platforms including Singpass and LifeSG, both of which draw on consolidated image repositories for identity verification and document processing. Duplicate image records, which can number in the tens of thousands once scanning backlogs from legacy paper systems are processed, slow retrieval times and, in some cases, cause verification systems to return conflicting results.

Where the Problem Is Showing Up

The duplication issue has been most visible in two places. At the National Library Board's digitisation facility on Victoria Street, archivists working on the NewspaperSG expansion project discovered this month that a batch of scanned historical images uploaded between March and May 2026 contained a duplication rate estimated internally at roughly 12 percent, meaning more than one in ten images had been stored at least twice, sometimes in different resolutions. Separately, the document management system of the Housing & Development Board (HDB), which handles renovation permit drawings and flat inspection photos for estates from Tampines to Bukit Batok, is understood to have triggered an internal audit after storage consumption jumped unexpectedly in the second quarter.

Neither the National Library Board nor HDB has issued a public statement on the matter this week, and the figures cited above have not been confirmed in official releases. The Daily Singapore is seeking responses from both agencies.

The technical cause is not exotic. When multiple teams scan the same physical document, a common occurrence when both a regional archive and a central repository independently process the same batch, deduplication tools that should catch the overlap sometimes fail to reconcile files stored in different formats, such as TIFF versus JPEG. Two encodings of the same page differ byte for byte, so a tool that compares files exactly treats them as unrelated. The result is a dataset that looks complete but carries significant redundancy.
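To illustrate the mismatch, here is a minimal Python sketch using only the standard library. It is not any agency's tooling: the 8x8 gradient standing in for a scan, the two byte serialisations standing in for different file formats, and the `average_hash` helper are all assumptions made for the example. Average hashing is one simple perceptual fingerprint among several, chosen here for brevity.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Exact, byte-level fingerprint."""
    return hashlib.sha256(data).hexdigest()


def upscale_2x(pixels):
    """Repeat every pixel 2x2, imitating a higher-resolution rescan."""
    out = []
    for row in pixels:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def average_hash(pixels, size=4):
    """Shrink a grayscale grid to size x size by block averaging,
    then emit one bit per cell: 1 if the cell is brighter than the mean."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            block = [
                pixels[y][x]
                for y in range(by * h // size, (by + 1) * h // size)
                for x in range(bx * w // size, (bx + 1) * w // size)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical scan: an 8x8 grayscale gradient (values 0..224).
scan = [[(x + y) * 16 for x in range(8)] for y in range(8)]
rescan = upscale_2x(scan)                           # same page, 16x16
inverted = [[224 - v for v in row] for row in scan]  # a different image

# Two stand-in "formats" for the same pixels: raw bytes vs ASCII text.
raw_bytes = bytes(v for row in scan for v in row)
text_bytes = " ".join(str(v) for row in scan for v in row).encode("ascii")
rescan_bytes = bytes(v for row in rescan for v in row)

print(sha256_hex(raw_bytes) == sha256_hex(text_bytes))    # False
print(sha256_hex(raw_bytes) == sha256_hex(rescan_bytes))  # False
print(hamming(average_hash(scan), average_hash(rescan)))  # 0
```

The exact hashes disagree even though the picture is the same, while the pixel-based fingerprint matches across resolutions. Real scans would also differ through compression noise, so a production check would typically accept a small Hamming distance rather than require zero.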

What's Being Done, and What Residents Should Know

GovTech, which is based at Mapletree Business City in Pasir Panjang and serves as the central technology arm for Singapore's public sector, has been working with agencies since at least early 2026 on a duplicate image replacement framework: a standardised protocol for identifying, flagging, and replacing or merging redundant files before they propagate further into AI training datasets. Progress has been uneven. Agencies with older content management systems rely on manual review pipelines that are slower and more labour-intensive than the automated hashing tools available to newer platforms.
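The framework's details have not been published, so the following Python sketch is only a guess at what an identify, flag and replace pass might look like. The `ImageRecord` fields, the `resolve_duplicates` function, and the rule of keeping the highest-resolution copy are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageRecord:
    # Hypothetical record shape; field names are assumptions.
    record_id: str
    fingerprint: str            # content fingerprint computed upstream
    width: int
    height: int
    status: str = "active"
    replaced_by: Optional[str] = None


def resolve_duplicates(records: List[ImageRecord]) -> List[ImageRecord]:
    """Group records by fingerprint, keep the largest image in each
    group, and flag the rest as duplicates pointing at the kept copy."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.fingerprint].append(rec)

    flagged = []
    for group in groups.values():
        if len(group) < 2:
            continue  # unique image, nothing to do
        canonical = max(group, key=lambda r: r.width * r.height)
        for rec in group:
            if rec is not canonical:
                rec.status = "duplicate"
                rec.replaced_by = canonical.record_id
                flagged.append(rec)
    return flagged


records = [
    ImageRecord("a", "f1", 1000, 800),
    ImageRecord("b", "f1", 2000, 1600),  # same page, higher resolution
    ImageRecord("c", "f2", 1200, 900),
]
for rec in resolve_duplicates(records):
    print(rec.record_id, "->", rec.replaced_by)
```

Flagging rather than deleting keeps the pass reversible, which matters where manual review is still part of the pipeline.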

For ordinary Singaporeans, the immediate practical effect is minor but real. Users of MyInfo, the data platform that pre-fills government and private-sector forms using verified personal records, may occasionally encounter delays when image-heavy documents — such as scanned property titles or educational certificates — are pulled from repositories affected by the cleanup process. The delays are typically measured in seconds rather than minutes, but they become more noticeable during peak usage windows such as the 9 a.m. to 10 a.m. slot when CPF and HDB transactions spike.

The longer-term stakes are higher. Singapore's positioning as a regional AI hub depends substantially on the quality and cleanliness of the datasets that train and feed public-sector models. Duplicate images are not merely a storage inconvenience — they can skew model outputs, introduce bias in visual recognition tasks, and produce inconsistent results in document authentication pipelines. A deduplication pass that seems like mundane housekeeping is, in practice, infrastructure work that underpins the credibility of every downstream AI application built on top of it.

GovTech has not confirmed a timeline for completing the deduplication sweep across all affected agencies. Agencies are advised to implement SHA-256 hashing at the point of ingestion, a measure that stops byte-identical duplicates from entering the system rather than cleaning them out after the fact. Exact hashing alone would not catch the same scan saved in a different format or resolution, so it complements rather than replaces the cleanup work. Residents who encounter errors or unexpected delays on government digital platforms can report them through the Singpass app feedback function or by contacting the relevant agency helpdesk directly.
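As a rough illustration of hashing at the point of ingestion, the Python sketch below rejects any upload whose SHA-256 digest has been seen before. The `IngestionGate` class and its in-memory dictionary are hypothetical; a real system would presumably keep digests in a shared database.

```python
import hashlib
import io


class IngestionGate:
    """Hypothetical ingestion check: admit a file only if its SHA-256
    digest has not been seen before."""

    def __init__(self):
        self._seen = {}  # digest -> record id of the first copy

    def ingest(self, record_id, stream, chunk_size=65536):
        # Hash in chunks so large scans are not loaded into memory at once.
        hasher = hashlib.sha256()
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            hasher.update(chunk)
        digest = hasher.hexdigest()

        existing = self._seen.get(digest)
        if existing is not None:
            return False, existing   # rejected: byte-identical copy exists
        self._seen[digest] = record_id
        return True, record_id       # accepted as a new file


gate = IngestionGate()
print(gate.ingest("scan-001", io.BytesIO(b"page one bytes")))
print(gate.ingest("scan-002", io.BytesIO(b"page one bytes")))
print(gate.ingest("scan-003", io.BytesIO(b"page two bytes")))
```

The second upload is turned away and pointed at `scan-001`. A gate like this only recognises byte-identical files.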

About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
