Melbourne's public cultural institutions are sitting on a problem that costs real money and real storage. Duplicate digital images, meaning identical or near-identical files copied across servers through years of digitisation drives, migration projects and grant-funded scanning programs, have accumulated across major collections, including those of State Library Victoria on Swanston Street and Museums Victoria's repository systems at the Melbourne Museum in Carlton. The scale of the redundancy, while not yet published in a consolidated audit, is broadly acknowledged by archivists and information managers across the sector.
The timing matters. Victoria's government has committed to expanding digital access to public collections as part of its broader creative industries agenda, and several institutions are mid-cycle on major server infrastructure upgrades. Clearing duplicate image files before migrating to new systems is not optional — it directly affects licensing costs, metadata integrity, and the accuracy of public-facing search tools that tens of thousands of Melburnians use each month.
What Other Cities Are Doing
London's Victoria and Albert Museum completed a structured deduplication program across its digital asset management system in 2023, cutting storage overhead and, critically, resolving a long-running problem where the same object photograph appeared under multiple accession numbers — confusing researchers and complicating rights clearances. Amsterdam's Rijksmuseum, whose Rijksstudio platform has logged more than 700,000 freely downloadable high-resolution works, built deduplication logic directly into its ingest pipeline so duplicates are caught at the point of upload rather than cleaned up retroactively. Toronto Public Library, which digitised roughly 5,000 historical photographs of the city between 2021 and 2024, deployed open-source perceptual hashing tools to flag near-duplicate images before they entered the public catalogue.
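For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines. The example below is a simplified "average hash", one common perceptual hashing technique, and is not a description of the tools any of these institutions actually run. It assumes each image has already been shrunk to a small greyscale grid (real tools do that downscaling step first), and the function names, the sample grids and the `max_distance` threshold are illustrative assumptions.

```python
from typing import List

Grid = List[List[int]]  # greyscale pixel values (0-255) of an already-downscaled image


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits.

    max_distance is an assumed threshold; a real program would tune it.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 4x4 sample: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same picture from a slightly brighter scanning run.
brighter = [[value + 10 for value in row] for row in original]
# A different picture: bright left half, dark right half.
different = [list(reversed(row)) for row in original]

print(is_near_duplicate(original, brighter))   # same light/dark pattern
print(is_near_duplicate(original, different))  # opposite pattern
```

Because the hash records only which pixels are lighter or darker than the average, a rescan at a different brightness lands on the same hash, while a genuinely different picture does not. Byte-for-byte file comparison would miss the first case entirely.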
Melbourne's approach has been more fragmented. State Library Victoria uses a proprietary digital asset management platform, while Museums Victoria operates its own separate system. The two institutions share physical proximity in the Carlton-CBD corridor but have not, as of mid-2026, announced any joint deduplication framework. The City of Melbourne's own digital archive, which holds images related to council infrastructure, public art and events, is managed separately again, through the council's internal IT directorate in the Hoddle Grid precinct.
The Cost of Inaction
Cloud storage is not cheap at institutional scale. Enterprise-tier archival storage from major providers is priced so that even 10 terabytes of redundant files, held across multiple systems, translates into thousands of dollars a year in unnecessary infrastructure spend, before accounting for the staff hours required to manually adjudicate duplicate metadata records. A 2024 report by the Digital Preservation Coalition, based in York, UK, found that unmanaged duplication was among the top five cost drivers in mid-sized cultural institution digital collections globally, alongside format obsolescence and rights management gaps.
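The arithmetic behind that storage figure is simple enough to sketch. The per-gigabyte price and the number of systems below are placeholder assumptions chosen only to show the calculation. They are not quoted rates from any provider or figures from any Melbourne institution.

```python
def annual_redundancy_cost(
    redundant_tb: float,
    price_per_gb_month: float,
    systems: int = 1,
) -> float:
    """Yearly cost of storing redundant files, using decimal units (1 TB = 1,000 GB)."""
    redundant_gb = redundant_tb * 1000
    return redundant_gb * price_per_gb_month * 12 * systems


# ASSUMPTION: 0.02 dollars per GB per month is a hypothetical price, not a real quote.
ASSUMED_PRICE = 0.02

single_system = annual_redundancy_cost(10, ASSUMED_PRICE)
three_systems = annual_redundancy_cost(10, ASSUMED_PRICE, systems=3)

print(f"One system:    ${single_system:,.0f} per year")
print(f"Three systems: ${three_systems:,.0f} per year")
```

Under those assumed inputs the bill is driven by three multipliers: the volume of redundant files, the monthly rate, and the number of systems holding a copy. Deduplication attacks the first and the last.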
The practical consequences in Melbourne are visible to anyone who has searched the State Library's image catalogue and found multiple entries for the same photograph of Flinders Street Station or Federation Square — each with slightly different crop dimensions or resolution settings from separate scanning runs. For a researcher, that is a nuisance. For a publisher seeking licensing clearance, it creates genuine legal ambiguity about which file version carries the authoritative rights record.
Amsterdam and Toronto both point to the same core lesson: deduplication works best as a workflow policy, not a cleanup project. Building the check into the moment of ingest, before a duplicate is ever catalogued, stops the backlog from growing with every new scanning run. London's V&A took the retroactive route and it worked, but the museum spent approximately 18 months on the remediation across a collection of around 1.2 million digital records, according to the institution's published digital strategy documents.
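What an ingest-time check means in practice can be shown with a minimal sketch. It is not the Rijksmuseum's or Toronto's actual pipeline; the `IngestIndex` class and the record identifiers are hypothetical. It uses a SHA-256 checksum from Python's standard library, which catches only byte-identical files, so a real workflow would pair it with a perceptual hash to catch recrops and rescans.

```python
import hashlib
from typing import Dict, Optional


class IngestIndex:
    """Remembers the checksum of every accepted file (hypothetical helper)."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}  # SHA-256 hex digest -> record id

    def ingest(self, record_id: str, data: bytes) -> Optional[str]:
        """Accept a file, or return the id of the record it duplicates.

        Returns None when the file is new and has been registered.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return existing  # duplicate: do not create a second catalogue entry
        self._seen[digest] = record_id
        return None


index = IngestIndex()
first = index.ingest("REC-0001", b"scan of a station photograph")
second = index.ingest("REC-0002", b"scan of a station photograph")
third = index.ingest("REC-0003", b"scan of a different photograph")

print(first)   # None: new file, accepted
print(second)  # REC-0001: rejected as a copy of the first record
print(third)   # None: new file, accepted
```

The design point is that the duplicate is refused before it receives its own catalogue record, so there is never a second metadata entry to reconcile later.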
For Melbourne institutions, the most practical near-term step is to make perceptual hashing checks, for which free open-source tools already exist, a condition of any new digitisation grant awarded through Creative Victoria or Creative Australia (formerly the Australia Council for the Arts). Several archivists across the sector have raised this informally. Whether it becomes formal policy before the next round of infrastructure upgrades is complete will determine whether Melbourne's collections arrive on their new servers clean, or carrying the same accumulated weight all over again.