Melbourne's public digital archives are carrying tens of thousands of duplicate images (scanned photographs, heritage documents and council records stored multiple times across overlapping databases), and a coordinated push to fix the problem is now underway across three major institutions. The effort invites direct comparison with cleanup programs undertaken in London, Amsterdam and Toronto, which have produced markedly different results.
The duplication problem matters right now because Victoria's state government committed in its 2025–26 budget to accelerating digitisation of public records held at the Public Record Office Victoria, based in North Melbourne. More data going in means old redundancies compound fast. Archivists say duplicate entries inflate storage costs, slow search results for researchers and, in some cases, cause heritage-listed images to be misfiled or lost in the noise of identical copies.
What Melbourne Is Actually Doing
Public Record Office Victoria, which holds more than 100 kilometres of physical records and a growing digital collection, began a deduplication audit in late 2024 using hash-matching software to flag identical files. The State Library of Victoria on Swanston Street launched a parallel review of its Pictures Collection, which spans more than 800,000 items. The City of Melbourne's own digital archive, managed through its Cultural Collections unit, joined the effort in March 2026, focusing initially on post-1950 photographic records of the CBD and inner suburbs including Fitzroy, Carlton and South Yarra.
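Hash matching of the kind described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software any of the three institutions is using: the choice of SHA-256, the function names and the directory layout are all assumptions. It groups files whose bytes are identical, so a re-scanned or re-compressed copy of the same photograph would not be flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large scans never need to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by digest; keep only groups with 2+ files.

    Hypothetical helper: it assumes the archive is a directory tree
    on local disk, which may not match any real system.
    """
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_exact_duplicates(Path("scans"))` on a hypothetical `scans` folder would return one entry per set of byte-identical files. Flagging is only the first step: deciding which copy to keep still depends on the records attached to each one.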
The three bodies are not yet sharing a single platform, which archivists privately regard as the central bottleneck. Each institution runs different content management systems, and a unified deduplication protocol — the kind Amsterdam's Stadsarchief implemented across its 750,000-image digital collection in 2022 — does not yet exist here. Amsterdam centralised its image metadata under a single open-source platform and reported a 23 per cent reduction in storage overhead within 18 months, according to a case study published by the International Council on Archives in 2023. Melbourne has no equivalent published benchmark yet.
Toronto's approach offers a different lesson. The Toronto Public Library and City of Toronto Archives ran a joint deduplication project between 2021 and 2023, covering roughly 400,000 digitised items. The project was praised for its public transparency — monthly progress reports were posted to the library's website — but criticised for its narrow scope. It addressed only photographs, leaving audio-visual and document duplicates untouched. Melbourne's current audit covers photographs and scanned documents but not moving image records held separately by the Australian Centre for the Moving Image in Federation Square.
The London Comparison
London's experience is the most instructive and, for Melbourne, the most sobering. The London Metropolitan Archives on Northampton Road completed a major deduplication sweep of its digital holdings in 2024, cutting its active image library from approximately 1.1 million files to around 890,000 after removing confirmed duplicates. The project took 28 months and required a dedicated team of four archivists working full-time alongside contracted software engineers.
Melbourne's institutions are not resourced at that level. Public Record Office Victoria and the State Library are coordinating largely through existing staff, supplemented by deduplication software licences funded partly through a $2.4 million Victorian government digital infrastructure grant announced in November 2025. Whether that funding covers the full scope of the problem remains an open question that none of the three institutions has answered publicly.
For researchers and heritage professionals, the practical stakes are real. A genealogist searching Trove — the National Library's aggregator that pulls from Victorian collections — can currently retrieve the same photograph three or four times in a single search, each copy filed under slightly different metadata. Fixing that requires not just deleting files but reconciling the descriptive records attached to each one, a manual process that no algorithm fully automates.
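A rough sketch shows why reconciliation resists full automation. The Python below is a hypothetical illustration, not Trove's or any institution's workflow: the function name, field names and sample records are invented. It merges the fields on which every copy agrees and sets aside the rest for an archivist to decide.

```python
from collections import defaultdict


def reconcile_records(records):
    """Merge descriptive records attached to copies of one image.

    Returns (merged, conflicts): fields where all copies agree are
    merged automatically; fields with differing values are listed
    for manual review. Assumes every field value is a string.
    """
    values = defaultdict(set)
    for record in records:
        for field, value in record.items():
            if value:  # ignore empty fields
                values[field].add(value.strip())
    merged = {f: next(iter(v)) for f, v in values.items() if len(v) == 1}
    conflicts = {f: sorted(v) for f, v in values.items() if len(v) > 1}
    return merged, conflicts


# Invented sample: two catalogue entries for the same photograph.
copies = [
    {"title": "Swanston Street, looking north", "date": "1956"},
    {"title": "Swanston St looking north", "date": "1956"},
]
merged, conflicts = reconcile_records(copies)
```

In this sample the date merges cleanly, but the two titles land in `conflicts`. Software can surface the disagreement; choosing the authoritative wording remains a human judgement.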
The next milestone to watch is a cross-institutional working group meeting scheduled for September 2026, where Public Record Office Victoria, the State Library and the City of Melbourne's Cultural Collections unit are expected to table a joint protocol proposal. If that meeting produces a shared platform commitment, Melbourne will move ahead of Toronto's narrowly scoped model. If it stalls, London's 28-month timeline starts to look optimistic by comparison.