The Daily Singapore

Singapore's Digital Archives Push Tackles Duplicate Image Problem Head-On This Week

Government agencies and cultural institutions are moving to clean up years of redundant visual data as Singapore accelerates its smart nation digitisation drive.

By Singapore News Desk · Published 5 July 2026 at 3:25 am

Updated 5 July 2026 at 11:42 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Singapore is independently owned and covers Singapore news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Joerg Hartmann on Pexels

The National Archives of Singapore confirmed this week that it has begun a structured deduplication exercise targeting tens of thousands of redundant images across its digitised collections, a process that archivists and records managers say has become urgent as storage costs and retrieval errors climb alongside expanding digital holdings.

The timing is deliberate. Singapore's Smart Nation and Digital Government Office has been pressing statutory boards since late 2025 to audit their data estates ahead of a broader government cloud migration scheduled for completion by the third quarter of 2027. Duplicate images — scanned documents, heritage photographs, urban planning records — clog storage pipelines, inflate cloud costs, and, critically, surface the wrong version of a record when staff or members of the public search online portals. Getting this right before migration is cheaper than cleaning up after.

What Happened This Week

On Tuesday, the National Heritage Board, which oversees institutions including the Asian Civilisations Museum on Empress Place and the Singapore Art Museum on St Thomas Walk, circulated an internal technical brief outlining a phased approach to duplicate image replacement across its collections management system. The brief, described by an unnamed spokesperson from the board's corporate communications office, calls for automated hash-matching tools to flag exact and near-duplicate files. Manual curatorial review would then follow before any image is permanently retired from the active database.
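For readers curious how hash-matching flags exact duplicates, here is a minimal Python sketch. It is illustrative only and is not the board's actual tooling, which the brief does not describe in detail. The function name, file names and sample bytes are all hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose byte content is identical.

    `files` maps a file name to its raw bytes (a hypothetical input shape
    chosen to keep the sketch self-contained).
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        # Identical bytes always produce the same SHA-256 digest.
        digest = hashlib.sha256(data).hexdigest()
        by_digest[digest].append(name)
    # Only groups with more than one member are flagged for human review.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical sample data standing in for scanned image files.
sample = {
    "scan_001.tif": b"identical image bytes",
    "scan_001_copy.tif": b"identical image bytes",
    "scan_002.tif": b"different image bytes",
}
flagged = group_exact_duplicates(sample)
print(flagged)
```

A cryptographic hash of this kind only matches byte-for-byte copies. A resized or re-saved version of the same picture would hash differently, which is why near-duplicates need a separate technique and why the flagged groups go to a curator rather than being deleted automatically.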

Separately, the National Library Board — which manages the NL Online portal drawing on holdings from the National Library on Victoria Street and dozens of branch libraries — is piloting perceptual hashing software that can identify visually similar images even when file formats or resolutions differ. The pilot, running through September 2026, covers approximately 1.2 million digitised items in the Singapore Infopedia and PictureSG collections. A board document published on the NLB corporate website states the pilot aims to reduce storage redundancy by at least 30 percent in the tested collections.
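The NLB document does not say which perceptual hashing algorithm the pilot uses. As a rough illustration of the general idea, the Python sketch below implements one simple variant, often called an average hash, on plain grayscale pixel grids. The function names, the 8x8 grid size and the synthetic test images are assumptions made for this example.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit average hash of a grayscale pixel grid.

    `pixels` is a list of rows of brightness values. For simplicity this
    sketch assumes both dimensions are multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    cells = []
    for r in range(size):
        for c in range(size):
            # Shrink the image by averaging each block of pixels.
            block = [
                pixels[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: is it brighter than the overall mean?
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic images: a 16x16 gradient, the same gradient upscaled to 32x32,
# and a brightness-inverted gradient standing in for a different picture.
small = [[(x + y) * 8 for x in range(16)] for y in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]
other = [[240 - value for value in row] for row in small]

hash_small = average_hash(small)
hash_large = average_hash(large)
hash_other = average_hash(other)

print(hamming_distance(hash_small, hash_large))  # 0: same picture, new resolution
print(hamming_distance(hash_small, hash_other))  # large: visually different
```

In practice a system would treat two images as likely duplicates when the distance falls below some threshold. Choosing that threshold is a tuning decision this sketch does not attempt to settle.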

The practical stakes extend beyond government offices. Small businesses in Singapore, many of them e-commerce sellers operating out of Jurong East and Tampines industrial parks, face a parallel problem. Duplicate product images across platforms such as Shopee and Lazada can trigger algorithmic penalties that suppress search rankings. Digital marketing consultancy data circulated at a Singapore e-commerce trade group meetup in June suggested that pages with duplicate image assets load on average 0.8 seconds slower on mobile connections. That gap measurably affects conversion rates in a market where mobile commerce accounted for roughly 72 percent of e-commerce transactions in 2025, according to figures from the Infocomm Media Development Authority's annual digital economy report.

How Organisations Should Respond

Records and IT managers watching the National Archives and NLB exercises say the lesson for any organisation managing large image libraries is straightforward: establish a single source of truth early, before collections scale beyond easy manual review. Automated deduplication tools have dropped sharply in cost, with cloud-based solutions from vendors serving the Singapore market now starting at around S$200 per month for collections under 500,000 files. The more expensive step remains the human curation required after software flags potential duplicates, because context matters in archival work.

Government technology agency GovTech, based at Mapletree Business City in Pasir Panjang, has published guidance on its developer portal recommending that agencies adopt content-addressed storage principles, where each file is identified by a cryptographic fingerprint rather than a file name, to prevent duplicates from entering a repository in the first place. That guidance was updated in May 2026 and is available publicly on the GovTech website.
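The content-addressed principle can be sketched in a few lines of Python. This is a toy in-memory example written for this article, not code from the GovTech guidance. The class and method names are hypothetical, and SHA-256 is assumed as the cryptographic fingerprint.

```python
import hashlib


class ContentAddressedStore:
    """Toy store that keys each blob by the SHA-256 digest of its bytes."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes, one entry per unique content
        self._names = {}  # human-readable name -> digest

    def put(self, name, data):
        """Store `data` under `name` and return its fingerprint."""
        digest = hashlib.sha256(data).hexdigest()
        # If the content already exists, no second copy is kept.
        self._blobs.setdefault(digest, data)
        self._names[name] = digest
        return digest

    def get(self, name):
        """Look up the bytes a name currently points to."""
        return self._blobs[self._names[name]]

    def blob_count(self):
        """Number of unique blobs actually held."""
        return len(self._blobs)


# Hypothetical usage: two names, one underlying image.
store = ContentAddressedStore()
first = store.put("heritage/photo_a.jpg", b"same pixels")
second = store.put("uploads/photo_a_copy.jpg", b"same pixels")
third = store.put("heritage/photo_b.jpg", b"other pixels")

print(first == second)     # True: identical content, identical fingerprint
print(store.blob_count())  # 2: the duplicate never took up extra space
```

Because the fingerprint is derived from the bytes themselves, a second upload of the same file resolves to the existing entry instead of creating a new one, whatever the file is called.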

For the National Archives, the deduplication exercise is expected to run through December 2026. Users accessing the Archives Online portal may notice some image links returning "temporarily unavailable" notices during the review period as records are reconciled. The Archives has advised researchers with time-sensitive queries to contact its Reading Room at Canning Rise directly.

About this article

Published by The Daily Singapore

Covering news in Singapore. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
