The problem did not arrive overnight. Across Melbourne's network of local councils, state cultural bodies and urban planning departments, the same photograph — a heritage facade on Gertrude Street in Fitzroy, say, or a mural on Hosier Lane — might exist in six different folders across four separate systems, each copy carrying slightly different file names, inconsistent licensing tags and no cross-reference to the others. The result: archivists waste hours, budgets blow out and, in the worst cases, the wrong image gets published or a correctly licensed original gets deleted.
Public Record Office Victoria, which holds state government records going back to the 1850s, acknowledged as recently as its 2024–25 annual report that digital asset duplication remains one of the most persistent management challenges facing agencies that deposit records with it. The City of Melbourne, the State Library of Victoria on Swanston Street and Creative Victoria have all separately grappled with overlapping image collections as each institution digitised holdings at different times, using different standards, and without a shared identifier system.
How the Duplication Problem Built Up
The root cause is straightforward: digital storage got cheap before digital discipline got enforced. Through the 2000s and into the 2010s, councils and cultural agencies rushed to scan physical collections, but procurement of different content management systems — often chosen for unrelated administrative reasons — meant assets were siloed from the moment they were created. The City of Yarra digitised its heritage overlays using one platform; the City of Melbourne used another; the State Government ran a third for its own planning imagery. Nobody built a common bridge.
The problem compounded during the COVID-19 period. When institutions like the Melbourne Museum on Nicholson Street in Carlton and the Australian Centre for the Moving Image at Federation Square accelerated their digitisation programs between 2020 and 2022, partly to maintain public access during lockdowns, uploads were prioritised over deduplication audits. A single photograph documenting a Federation-era shopfront might now sit in a council heritage database, a state planning register, a museum accession file and a departmental SharePoint folder simultaneously.
Metadata inconsistency makes automated cleanup harder. An image shot in 1987 on Flinders Lane might be tagged with a date, a location and a photographer credit in one system and carry none of those fields in another. Without reliable metadata to compare, deduplication tools fall back on matching file hash values, which catches exact copies but misses re-saved, re-cropped or re-compressed variants. In practice, those variants constitute the majority of real-world duplicates.
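The gap between exact and near-duplicate matching can be illustrated with a minimal Python sketch. It is not drawn from any institution's tooling: the function names, the toy 4x4 greyscale grids and the distance threshold of 2 are all assumptions made for illustration, and the "average hash" shown is one simple perceptual technique among several.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Cryptographic hash: changes completely if even one byte differs."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Simple perceptual hash over a small greyscale grid (values 0-255).

    Each pixel contributes one bit: 1 if brighter than the grid mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical toy image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 198, 208],
    [14, 24, 202, 212],
]
# Stand-in for a re-compressed copy: every pixel shifted slightly.
recompressed = [[min(255, value + 3) for value in row] for row in original]


def to_bytes(pixels: list[list[int]]) -> bytes:
    return bytes(value for row in pixels for value in row)


# Exact matching treats the two files as unrelated.
exact_match = sha256_hex(to_bytes(original)) == sha256_hex(to_bytes(recompressed))

# Perceptual matching treats them as the same picture (assumed threshold: 2 bits).
near_match = hamming(average_hash(original), average_hash(recompressed)) <= 2
```

Here `exact_match` comes out false and `near_match` comes out true, which is the behaviour described above: a file hash sees two different files, while a perceptual comparison sees one image saved twice. A production system would work on decoded, downscaled images rather than hand-written grids.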
What Remediation Looks Like in Practice
Duplicate image replacement — the systematic process of identifying canonical versions, retiring redundant copies and updating all internal links to point at the authorised file — has become a formal project stream at several Victorian institutions. The State Library launched a structured digital collections review in late 2024, working through roughly 1.2 million image records it holds across its various cataloguing systems. Progress is slow: archivists estimate that meaningful deduplication of a collection that size, without automated tooling, takes three to four years of sustained effort.
There is also the question of what happens to links. When a duplicated image is retired without a redirect or a replacement pointer, every internal page, external partner site and embedded council planning report that referenced it breaks. The City of Port Phillip discovered this during a 2023 website migration, when retiring legacy image files left broken images across dozens of heritage overlay pages that planners and residents relied on for permit reference checks.
The practical lesson coming out of these remediation efforts is consistent: establish the canonical version first, update every reference to it, then and only then retire the duplicate. Doing it in reverse order — deleting the copy before fixing the references — is the single most common mistake agencies make, and it is recoverable only if a backup exists and someone notices quickly.
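That ordering can be enforced in software rather than left to habit. The following Python sketch is purely illustrative: `AssetStore`, its methods and the file and page names are hypothetical, not taken from any agency's system. It refuses to retire a duplicate while any reference still points at it.

```python
from dataclasses import dataclass, field


@dataclass
class AssetStore:
    """Hypothetical in-memory model of image files and the pages that link to them."""
    files: set[str] = field(default_factory=set)
    references: dict[str, str] = field(default_factory=dict)  # page -> image path

    def repoint(self, duplicate: str, canonical: str) -> int:
        """Step 2: update every reference from the duplicate to the canonical file."""
        if canonical not in self.files:
            raise ValueError(f"canonical file not found: {canonical}")
        moved = 0
        for page, target in list(self.references.items()):
            if target == duplicate:
                self.references[page] = canonical
                moved += 1
        return moved

    def retire(self, duplicate: str) -> None:
        """Step 3: delete the duplicate, but only once nothing references it."""
        dangling = [page for page, target in self.references.items() if target == duplicate]
        if dangling:
            raise RuntimeError(f"still referenced by: {sorted(dangling)}")
        self.files.discard(duplicate)


# Step 1: decide which copy is canonical (assumed here).
canonical = "council/hosier_lane.jpg"
duplicate = "planning/IMG_0042.jpg"

store = AssetStore(
    files={canonical, duplicate},
    references={"heritage-overlay-12": duplicate, "mural-register": canonical},
)

# Deleting first is blocked because a page still points at the duplicate.
try:
    store.retire(duplicate)
    blocked = False
except RuntimeError:
    blocked = True

moved = store.repoint(duplicate, canonical)  # fix references first
store.retire(duplicate)                      # then, and only then, retire
```

The guard in `retire` is the design choice that matters: the risky reverse order fails loudly at the moment of deletion instead of surfacing later as broken pages.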
For Melbourne's cultural and planning institutions, the next phase involves adopting persistent identifier standards — stable, format-agnostic codes attached to each image asset regardless of where the file is stored. Several institutions are now consulting with the Australian Research Data Commons, based in Brisbane but with Victorian partner nodes, about integrating those standards into local collection management workflows. The target, according to project documentation circulating among state agency archivists, is a baseline compliance framework in place before the end of the 2026–27 financial year.