Singapore is sitting on a vast and growing backlog of duplicate digital images embedded across public sector databases, property listing platforms, and heritage archives, and the window to act before the problem becomes structurally expensive is narrowing. The issue has quietly escalated over the past three years as digitalisation drives pushed large volumes of scanned documents, HDB flat photographs, and identity-linked imagery into systems that were never designed to deduplicate at scale.
The stakes are practical and financial. Storage costs for government-linked data infrastructure in Singapore have risen sharply since 2023, and the Smart Nation and Digital Government Office has been under internal pressure to audit redundant data assets before the next infrastructure procurement cycle, expected in late 2026. Duplicate imagery is not a trivial subset of that problem. In property and planning systems alone, every HDB resale flat transaction, every Building and Construction Authority submission, and every URA development application carries attached image files, and redundancy rates in legacy archives can run high.
Where the Decisions Are Being Made
Three institutions are at the centre of what happens next. The Housing and Development Board, which manages records tied to more than one million flats across towns from Woodlands to Tampines, has been piloting automated deduplication tools within its resale portal since the second quarter of 2025. The Infocomm Media Development Authority, based at Mapletree Business City in Pasir Panjang, oversees the technical standards that would govern any cross-agency solution. And the National Archives of Singapore, housed at 1 Canning Rise beside Fort Canning Park, faces a parallel challenge in its digitised photographic collections, where duplicate scans of the same physical print can occupy multiple catalogue entries.
The decision these agencies face is not purely technical. It is architectural. They must choose between a centralised deduplication engine — one that sits upstream of all contributing systems and flags redundancy before storage — or a retrospective model that cleans existing databases in batches. The centralised approach costs more upfront but is cheaper over a ten-year horizon. The retrospective model is faster to deploy but risks being outpaced by the rate at which new duplicates are created.
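The difference between the two models can be sketched in a few lines of Python. This is an illustration only, not a description of any agency's system: it assumes duplicates are byte-identical files detected with a SHA-256 digest, and the names `UpstreamDeduplicator`, `ingest`, and `retrospective_scan` are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Optional


def content_digest(data: bytes) -> str:
    """SHA-256 of the raw file bytes; byte-identical files share a digest."""
    return hashlib.sha256(data).hexdigest()


class UpstreamDeduplicator:
    """Centralised model (hypothetical): check each file before it is stored."""

    def __init__(self) -> None:
        # Maps a digest to the id of the first record that carried it.
        self._first_seen: Dict[str, str] = {}

    def ingest(self, record_id: str, data: bytes) -> Optional[str]:
        """Return the id of an earlier identical file, or None if this one is new."""
        digest = content_digest(data)
        existing = self._first_seen.get(digest)
        if existing is None:
            self._first_seen[digest] = record_id
        return existing


def retrospective_scan(archive: Dict[str, bytes]) -> List[List[str]]:
    """Retrospective model (hypothetical): group already-stored records by digest."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record_id, data in archive.items():
        groups[content_digest(data)].append(record_id)
    # Only groups with more than one record are duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]
```

In the upstream version, the duplicate is caught at the moment of submission and never reaches storage. In the batch version, every duplicate is stored first and found later, so a scan that runs less often than new duplicates arrive never catches up.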
Private platforms are watching. PropertyGuru and 99.co, both of which host tens of thousands of active HDB and private property listings at any given time, already run their own image-matching algorithms to remove duplicate listings from the consumer-facing interface. But their backend archives, which retain removed listings for analytics purposes, have grown substantially. Neither platform has publicly committed to a deduplication standard aligned with government specifications.
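Neither platform has published how its image matching works. One widely used technique for catching re-uploaded or lightly edited photos is a perceptual "average hash", sketched below as an assumption rather than as either company's method. The sketch takes an image that has already been decoded and reduced to a small grayscale grid (8 by 8 here); the function names and the `max_distance` threshold of 5 are illustrative.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]  # rows of grayscale values, e.g. 0-255


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as duplicates if their hashes differ in few bits.

    max_distance is an illustrative threshold, not a recommended value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Unlike a byte-level checksum, this comparison tolerates uniform brightness changes and mild recompression, which is why listing platforms tend to favour perceptual methods for user-uploaded photos. The trade-off is a risk of false matches between visually similar but distinct images, such as near-identical flat layouts.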
The Pressure Points Ahead
The timeline is tighter than it looks. Singapore's Public Sector Data Governance Framework, updated in 2024, sets agency-level compliance reviews on a two-year cycle. Agencies that cannot demonstrate active management of redundant data assets — including image files — during their next review cycle face recommendations for remediation, which carry budget implications. For the HDB, whose resale portal processed more than 27,000 transactions in 2024 alone, each carrying multiple image attachments, the arithmetic on storage and retrieval costs is not abstract.
The most consequential near-term decision involves vendor selection. The Government Technology Agency, known as GovTech and headquartered on Maxwell Road, is expected to issue a tender for a whole-of-government digital asset management layer before the end of 2026. Whether duplicate image detection is written into that tender as a mandatory requirement, rather than an optional module, will determine whether this problem gets solved at the infrastructure level or pushed back to individual agencies to manage piecemeal.
For anyone dealing with Singapore's digital property or public records systems in the coming months, the practical implication is this: submissions that include image files should conform to existing IMDA file-naming and metadata standards now, before new deduplication protocols are finalised, to avoid having archived records flagged for manual review later. The agencies have signalled that the cleaning process will prioritise high-volume transactional systems first. Heritage and archival collections are likely to wait until 2027 at the earliest.