Melbourne's cultural institutions are sitting on a data mess years in the making. Across organisations from the City of Melbourne's own digital asset library to the State Library Victoria on Swanston Street, duplicate images — sometimes dozens of near-identical copies of the same photograph or artwork scan — have quietly consumed storage, distorted search results and complicated public access to archival collections. The push to systematically replace and consolidate those duplicates is now a live operational priority across the sector, not a future-state ambition.
The question of how we got here matters because the same structural failures keep repeating. From the late 2000s onward, major Victorian cultural bodies began independent digitisation programs, rarely cross-referencing each other's holdings or agreeing on a shared metadata schema. When Museums Victoria accelerated its digitisation push — the organisation holds more than 17 million objects across its network including the Melbourne Museum in Carlton — different internal departments uploaded the same high-resolution scans through separate workflows. The result was fragmentation baked into the foundation.
The Digitisation Boom That Created the Problem
The Commonwealth-backed Picture Australia initiative, which ran through the National Library of Australia from the early 2000s and later merged into Trove, aggregated images from dozens of contributing institutions. Melbourne-based contributors included Public Record Office Victoria and the Royal Historical Society of Victoria, based in William Street in the CBD. Each institution retained its own local copy while also pushing files to the shared platform. No deduplication protocol governed that transfer process. Files multiplied at both ends.
State government digitisation grants under successive Victorian budgets accelerated the upload rate without mandating technical standards. Between 2015 and 2022, the Victorian Government's Creative Victoria arm administered multiple rounds of digital preservation funding to arts organisations. Institutions competed for those grants individually, bought different content management systems, and generated duplicate assets at scale — the same image living in TIFF format in one system and JPEG in another, with slightly different filenames and no shared unique identifier linking them.
The problem became quantifiable once cloud migration began in earnest. Internal audits, the kind institutions do not typically publicise, found duplication rates running as high as 30 to 40 per cent in some collections when organisations moved from on-premise servers to cloud storage around 2021 and 2022. At those rates, a collection nominally containing 100,000 images might hold only 60,000 to 70,000 genuinely distinct assets. Storage costs mount accordingly: with enterprise cloud storage billed at current Australian market rates of roughly $25 to $35 per terabyte per month, every duplicate image is a direct, ongoing budget drain.
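The arithmetic behind those figures can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the average file size is a placeholder assumption that does not come from any audit. An institution would substitute its own measured figures.

```python
def distinct_assets(total_images: int, duplication_rate: float) -> int:
    """Number of genuinely distinct images left once duplicates are removed."""
    duplicates = round(total_images * duplication_rate)
    return total_images - duplicates


def monthly_duplicate_cost(total_images: int,
                           duplication_rate: float,
                           avg_size_mb: float,
                           rate_per_tb: float) -> float:
    """Monthly storage bill attributable to duplicate copies alone.

    avg_size_mb is an ASSUMED average file size, not a sourced figure.
    Uses decimal terabytes (1 TB = 1,000,000 MB).
    """
    duplicates = round(total_images * duplication_rate)
    duplicate_tb = duplicates * avg_size_mb / 1_000_000
    return duplicate_tb * rate_per_tb


# The article's range: 30 to 40 per cent duplication in a 100,000-image collection.
low_end = distinct_assets(100_000, 0.40)   # distinct images at the worst rate
high_end = distinct_assets(100_000, 0.30)  # distinct images at the milder rate
```

Because the cost scales linearly with both file size and duplication rate, collections dominated by large uncompressed master files will feel the effect far more than collections of small web derivatives.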
What Systematic Replacement Actually Involves
Duplicate image replacement is not as simple as deleting one copy and keeping the other. The complication is provenance: different copies carry different metadata, different licensing annotations and different usage histories. Delete the wrong instance and you sever the link between a public record and the digital asset attached to it. That is a legal problem as much as a technical one, particularly for institutions holding Crown copyright material or images subject to Indigenous cultural protocols.
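One way to picture the safer alternative to outright deletion is a merge step that folds every copy's annotations into the surviving record and leaves a redirect behind for each retired identifier. The sketch below is a simplified illustration, not any institution's actual workflow: the `AssetCopy` fields, the `consolidate` function and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AssetCopy:
    # All field names here are hypothetical, chosen for illustration.
    asset_id: str
    file_format: str
    licence_notes: list = field(default_factory=list)
    linked_records: list = field(default_factory=list)
    restricted: bool = False  # e.g. subject to cultural protocols


def consolidate(copies, keep_id):
    """Merge a human-reviewed duplicate group into one surviving copy.

    Returns the merged survivor plus a redirect map from each retired
    asset_id to the survivor, so existing public-record links still resolve.
    """
    by_id = {c.asset_id: c for c in copies}
    keeper = by_id[keep_id]
    survivor = AssetCopy(keeper.asset_id, keeper.file_format,
                         list(keeper.licence_notes),
                         list(keeper.linked_records),
                         keeper.restricted)
    redirects = {}
    for copy in copies:
        if copy.asset_id == keep_id:
            continue
        # Carry over annotations and record links instead of discarding them.
        for note in copy.licence_notes:
            if note not in survivor.licence_notes:
                survivor.licence_notes.append(note)
        for record in copy.linked_records:
            if record not in survivor.linked_records:
                survivor.linked_records.append(record)
        # A restriction on any copy applies to the merged asset.
        survivor.restricted = survivor.restricted or copy.restricted
        redirects[copy.asset_id] = keep_id
    return survivor, redirects
```

The design choice worth noting is that restrictions are combined conservatively: if any copy in the group carries a restriction flag, the merged asset keeps it until a person decides otherwise.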
The Australian Library and Information Association published updated guidance on digital preservation standards in 2024, and institutions in the Southbank arts precinct, along with the Australian Centre for the Moving Image at Federation Square, have been working through revised workflows since late last year. The practical approach involves perceptual hashing, a technique that generates a fingerprint for each image based on its visual content rather than its filename or exact bytes, so that a TIFF master and its JPEG derivative produce matching or near-matching fingerprints. Matches are then flagged for human review before any deletion or merge occurs.
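For readers who want to see the idea in miniature, here is a sketch of one common perceptual hash, the difference hash (dHash), in plain Python. It is not the method any named institution uses, and it makes simplifying assumptions: images arrive as already-decoded grids of grayscale values (decoding real TIFF or JPEG files would need an imaging library), and the function names and the five-bit review threshold are illustrative choices.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]  # grayscale pixels: rows of equal length


def shrink(pixels: Grid, width: int, height: int) -> Grid:
    """Downscale by averaging the source pixels that fall in each output cell."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max(y0 + 1, (y + 1) * src_h // height)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max(x0 + 1, (x + 1) * src_w // width)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels: Grid, size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair of cells
    in a (size + 1) x size thumbnail, giving size * size bits in total."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def flag_for_review(hashes: Dict[str, int],
                    threshold: int = 5) -> List[Tuple[str, str]]:
    """Return pairs of asset ids whose hashes differ by at most `threshold`
    bits. Pairs are only flagged for a person to check; nothing is deleted."""
    ids = sorted(hashes)
    return [(a, b)
            for i, a in enumerate(ids)
            for b in ids[i + 1:]
            if hamming(hashes[a], hashes[b]) <= threshold]
```

Because each bit records only whether one region is brighter than its neighbour, a uniform brightness shift or a change of file format leaves the fingerprint largely intact, while a genuinely different image produces a distant one. The pairwise comparison shown here is quadratic in the number of images, so a large collection would need an indexing strategy on top of it.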
For smaller organisations, including community archives run by Melbourne's migrant cultural groups in suburbs like Footscray and Richmond, the resources to run even basic deduplication tools are often absent. Sector bodies including Public Record Office Victoria have flagged that smaller custodians need direct technical support, not just written guidance, to work through their backlogs responsibly.
Institutions expecting to bid for the next round of Creative Victoria digital preservation funding — applications for which are expected to open in the second half of 2026 — would be well advised to treat duplicate image auditing as a prerequisite, not an afterthought. Funders have been explicit that demonstrated data hygiene is now part of the assessment criteria. The cleanup is overdue. The clock is ticking.