Melbourne's public institutions are sitting on vast libraries of duplicated digital imagery (the same photograph filed under different names, stored across multiple servers, indexed twice in the same database), and the cost of doing nothing is climbing. The City of Melbourne, which manages digital assets across departments including planning, heritage and tourism, has been working through a structured deduplication review since early 2026, according to information published on the council's digital strategy page. The problem is not unique to Melbourne, but the city's response offers an instructive comparison with peers in Amsterdam, Toronto and Tokyo.
The stakes are immediate. Australian institutions are estimated to waste hundreds of millions of dollars a year on redundant cloud storage, though precise figures for Victorian government bodies are not publicly broken down by asset type. What is clear is that image duplication sits at the heart of broader data governance failures that have grown worse since the pandemic-era rush to digitise. State Library Victoria, which holds more than two million images in its digital collections, launched a deduplication audit in late 2025 as part of its broader digital preservation strategy. The library has not publicly disclosed how many duplicate records were identified, but the audit is ongoing as of this month.
What Other Cities Are Doing
Amsterdam's Stadsarchief, the city's municipal archive, completed a system-wide duplicate detection sweep in 2024 using perceptual hashing software, a technique that fingerprints what an image looks like rather than its bytes, so it can match visually identical or near-identical images even when file names, formats or metadata differ. The archive publicly documented the project on its institutional blog, noting it recovered meaningful server capacity and reduced licensing confusion across departments. Toronto Public Library ran a similar exercise across its digitised photographic holdings in 2023, partnering with the University of Toronto's information faculty. Tokyo's National Diet Library has embedded automated deduplication into its ongoing digitisation pipeline since 2022, treating it as a standard step rather than a one-off remediation project.
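To make the idea of perceptual hashing concrete, here is a minimal sketch of one simple variant, the average hash. It is an illustration only, not the software any of the institutions above used, which the source does not name. It assumes each image has already been decoded, converted to grayscale and shrunk to a small grid of 0-255 values. The function names and the sample grid are hypothetical.

```python
def average_hash(pixels):
    """Fingerprint a small grayscale image given as rows of 0-255 values.

    Each pixel contributes one bit: 1 if it is brighter than the image's
    mean brightness, 0 otherwise. Decoding and downscaling the image are
    assumed to have happened before this function is called.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(hash_a ^ hash_b).count("1")


# A hypothetical 4x4 image, a uniformly brightened copy, and a negative.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [200, 20, 210, 30],
    [220, 10, 230, 15],
]
brighter = [[value + 10 for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

same = hamming_distance(average_hash(original), average_hash(brighter))
different = hamming_distance(average_hash(original), average_hash(inverted))
```

A byte-level checksum would treat the brightened copy as a completely different file. The average hash gives it the same fingerprint as the original, because every pixel keeps its position relative to the mean, while the inverted image differs in every bit. Production tools typically use larger grids and more robust variants, with a distance threshold tuned to the collection.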
Melbourne, by comparison, has approached the issue piecemeal. Creative Victoria does not publish a unified digital asset management policy on its public website. The Melbourne Museum, operated by Museums Victoria on Nicholson Street in Carlton, uses the KE EMu collections management system, which includes some duplicate-detection functionality, but staff from different curatorial teams have historically uploaded assets independently, a pattern common to large institutions that grew their digital operations organically rather than by design.
The Local Cost and What Comes Next
Cloud storage is not free. Microsoft Azure and Amazon Web Services, which underpin much of the Victorian public sector's digital infrastructure through the Department of Government Services' whole-of-government arrangements, charge by the gigabyte. Duplicate images compound those costs directly. A single high-resolution archival scan can run to 500 megabytes or more; a collection holding ten thousand duplicates of such files carries five terabytes of redundant data (10,000 × 500 MB), and the bill for storing it recurs every month.
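The arithmetic behind that figure can be sketched in a few lines. The file count and file size come from the example above; the price per gigabyte is a placeholder chosen purely for illustration, not a quoted Azure or AWS rate, and the function names are hypothetical.

```python
def duplicate_storage_tb(duplicate_files, mb_per_file):
    """Redundant storage in decimal terabytes (1 TB = 1,000,000 MB)."""
    return duplicate_files * mb_per_file / 1_000_000


def monthly_cost(storage_tb, price_per_gb_month):
    """Recurring monthly cost for a volume of storage at a per-GB price."""
    return storage_tb * 1_000 * price_per_gb_month


# Figures from the article's example: 10,000 duplicates at 500 MB each.
redundant_tb = duplicate_storage_tb(10_000, 500)  # 5.0 TB

# ASSUMPTION: 0.02 per GB per month is an illustrative placeholder only.
placeholder_price = 0.02
cost_per_month = monthly_cost(redundant_tb, placeholder_price)
cost_per_year = cost_per_month * 12
```

Whatever the real contracted rate, the structure of the calculation is the same: the cost scales linearly with the number of duplicates and repeats every billing period until the files are removed.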
Smaller and mid-sized organisations face the same problem. The Gertrude Contemporary art space and the digital media facilities at RMIT University both manage image-heavy archives. RMIT's library has cited deduplication as part of its 2025–2027 digital collections roadmap, published on the university's library strategy page. That kind of institutional planning is exactly what Amsterdam and Toronto embedded years earlier.
The practical path forward for Melbourne institutions involves three things that peer cities have already normalised: adopting perceptual hashing at the point of ingest rather than retrospectively, assigning a single digital asset manager with cross-departmental authority, and publishing an annual data quality report so that the public and funding bodies can hold institutions to account. The approach of Tokyo's National Diet Library, which built deduplication into the digitisation pipeline from day one, is the model most digital archivists now recommend, precisely because remediation at scale is orders of magnitude more expensive than prevention.
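What checking at the point of ingest might look like can be sketched as a small gate that every new upload passes through. This is a hypothetical illustration, not a description of any institution's pipeline. It assumes each incoming image already has an integer perceptual fingerprint computed elsewhere; the class name, the method name and the distance threshold are all invented for the example.

```python
class IngestIndex:
    """Hypothetical ingest gate that flags near-duplicate uploads.

    Fingerprints are assumed to be integer perceptual hashes computed
    before this check runs. Two assets are treated as duplicates when
    their fingerprints differ in at most `max_distance` bits.
    """

    def __init__(self, max_distance=5):
        self.max_distance = max_distance
        self._known = {}  # asset_id -> fingerprint

    def check_and_add(self, asset_id, fingerprint):
        """Return the id of an existing near-duplicate, or None.

        A new asset is stored only when no near-duplicate is found, so
        the index never accumulates the redundant copy.
        """
        for known_id, known_fp in self._known.items():
            differing_bits = bin(fingerprint ^ known_fp).count("1")
            if differing_bits <= self.max_distance:
                return known_id
        self._known[asset_id] = fingerprint
        return None


index = IngestIndex(max_distance=2)
first = index.check_and_add("heritage/0001.tif", 0b11110000)
second = index.check_and_add("tourism/IMG_4410.jpg", 0b11110001)
third = index.check_and_add("planning/site-a.png", 0b00001111)
```

In this example the second upload is flagged as a match for the first, while the third is accepted as new. The linear scan keeps the sketch short; a collection of millions of images would need an indexed lookup, and a flagged match would normally go to a person for review rather than being rejected automatically.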
For Melbourne, the window to get ahead of the problem cleanly is narrowing. The Victorian government's broader data and digital strategy, released in 2021 and updated in 2023, sets out frameworks for data quality but stops short of mandating specific deduplication standards for image archives. Until those standards are locked in, the city will keep paying — in storage costs, in staff hours and in the slower, harder-to-price currency of institutional knowledge that gets lost when no one can tell which version of a photograph is the right one.