Melbourne's cultural and government institutions are sitting on a problem most people never see. Duplicate images — the same photograph, scan or artwork file stored twice, three times, sometimes dozens of times across different digital systems — have accumulated across public collections to a degree that archivists and data managers say is no longer manageable without deliberate intervention.
The issue has sharpened in 2026 as several Victorian institutions simultaneously push ahead with large-scale digitisation projects, creating fresh pressure to clean up what already exists before adding more. The State Library Victoria on La Trobe Street, which holds more than two million digital objects, and the City of Melbourne's own digital asset management system are both understood to be reviewing their deduplication protocols this year, though neither has publicly announced a formal program.
Why It Matters Now
The timing is not accidental. The Victorian Government's Creative State 2025–2028 strategy, which funds digitisation and public access to cultural collections, has directed millions toward expanding what institutions can store and share online. More material entering the pipeline means existing inefficiencies compound faster. Archivists at institutions including Museums Victoria, which is based in Carlton, have described the challenge in sector publications as one of the defining operational headaches of the current funding cycle.
Digital preservation specialists point to several causes. Legacy database migrations in the early 2010s frequently duplicated files without detection. Staff turnover meant metadata standards were applied inconsistently for years. And when multiple departments within the same institution photograph the same object independently, as happens with collection items that move between exhibitions, there is often no check on whether a file already exists.
The cost is real. Cloud storage for cultural institutions is not free, and the Victorian Government's whole-of-government cloud procurement arrangements, administered through the Department of Government Services, apply consumption-based pricing. Industry benchmarks suggest storage costs for large cultural collections carrying heavy duplicate loads can run 20 to 40 per cent higher than necessary, though the precise figures for individual Victorian agencies are not publicly disclosed.
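To put that benchmark range in terms of storage rather than dollars, the short sketch below converts a cost overhead into the share of stored data that would have to be duplicated. It assumes, purely for illustration, that cost scales linearly with bytes stored; real consumption-based pricing may include tiers and retrieval charges, and the function name is hypothetical.

```python
def duplicate_share(overhead):
    """Share of stored bytes that are duplicates, given the cost overhead.

    Assumes cost is proportional to bytes stored. If d is the duplicate
    share, unique data is (1 - d) of the total, so overhead = d / (1 - d),
    which rearranges to d = overhead / (1 + overhead).
    """
    return overhead / (1 + overhead)


for overhead in (0.20, 0.40):
    share = duplicate_share(overhead)
    print(f"{overhead:.0%} extra cost -> {share:.1%} of storage is duplicates")
```

Under that assumption, paying 20 to 40 per cent more than necessary corresponds to roughly one-sixth to two-sevenths of everything in storage being a redundant copy.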
What the Sector Is Recommending
The Australian Library and Information Association held a professional development session in Melbourne in May 2026 focused specifically on automated deduplication tools, drawing attendees from councils including the City of Yarra and Merri-bek. The consensus from that forum, according to a summary circulated to members, pointed toward perceptual hashing as the most practical first step for mid-sized collections. The technique fingerprints what an image looks like rather than its exact bytes, so it can match visually identical or near-identical images even when file names, formats or resolutions differ.
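For readers unfamiliar with the technique, the sketch below shows one common form of perceptual hashing, a difference hash, in plain Python. It is an illustration only, not a tool endorsed by the forum. It assumes each image has already been decoded to a grid of grayscale values (rows of integers from 0 to 255) by an imaging library, and the function names are hypothetical.

```python
def shrink(pixels, width, height):
    """Downscale a grayscale grid (rows of 0-255 values) by block averaging."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, size=8):
    """Difference hash: one bit per left/right comparison on a small thumbnail."""
    small = shrink(pixels, size + 1, size)  # 9 columns give 8 comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits  # 64-bit integer when size=8


def hamming(a, b):
    """Number of differing bits between two hashes; 0 means visually identical."""
    return bin(a ^ b).count("1")
```

Because the hash is built from the image's coarse brightness pattern, a rescaled or re-saved copy of the same photograph lands at or near a Hamming distance of zero from the original, while an unrelated image typically differs in many bits.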
The National Archives of Australia, which sets guidance that flows down to state-level institutions, updated its digital preservation framework in March 2026 to include explicit guidance on managing duplicate records. The framework recommends institutions conduct a baseline audit before any major digitisation tranche begins.
For local councils, the stakes are also bureaucratic. The Public Record Office Victoria, based in North Melbourne, sets mandatory standards for how Victorian public bodies manage digital records. Duplicates that sit across different record series can create compliance headaches when material is subject to Freedom of Information requests — particularly if two copies of the same image carry different metadata or access restrictions.
Practitioners in the sector suggest any institution starting from scratch should map its storage landscape first, identify which systems feed into which, and quarantine rather than delete suspected duplicates until provenance is confirmed. Deletion of what turns out to be the only surviving copy of a culturally significant image has happened before, and the damage is permanent.
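The quarantine-first approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any institution's actual workflow: the manifest file name, the record fields and both function names are hypothetical.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def quarantine(file_path, quarantine_dir, reason):
    """Move a suspected duplicate into quarantine and log where it came from."""
    src = Path(file_path)
    qdir = Path(quarantine_dir)
    qdir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    # Prefix with part of the digest so different files sharing a name stay apart.
    dest = qdir / f"{digest[:16]}_{src.name}"
    if dest.exists():
        raise FileExistsError(f"{dest} is already in quarantine")
    shutil.move(str(src), str(dest))
    record = {
        "original_path": str(src),
        "quarantined_path": str(dest),
        "sha256": digest,
        "reason": reason,
        "quarantined_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only manifest (hypothetical name) so every move can be traced.
    with open(qdir / "manifest.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


def restore(record):
    """Return a quarantined file to its original location."""
    dest = Path(record["original_path"])
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(record["quarantined_path"], str(dest))
```

Nothing is deleted at any point. A file leaves quarantine only when a reviewer either restores it or, once provenance is confirmed, signs off on removal through whatever disposal process the institution's record-keeping rules require.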
The practical advice circulating in archival circles right now is straightforward: don't wait for a full system overhaul. Run a perceptual hash comparison on the highest-priority collections, flag the results for human review, and build the deduplication check into onboarding workflows for any new material. For institutions with collections in the tens of thousands of items, that process can now be completed in days rather than months using open-source tools that have matured significantly since 2020.
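As a rough sketch of that advice, the Python below takes precomputed 64-bit perceptual hashes keyed by item identifier, lists the pairs close enough to warrant human review, and checks a newly onboarded file against the existing index. The distance threshold of 5 bits, the function names and the sample identifiers are assumptions for illustration, not sector guidance.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def flag_for_review(hashes, max_distance=5):
    """Return (id_a, id_b, distance) for every pair a human should look at."""
    flagged = []
    for (id_a, h_a), (id_b, h_b) in combinations(sorted(hashes.items()), 2):
        distance = hamming(h_a, h_b)
        if distance <= max_distance:
            flagged.append((id_a, id_b, distance))
    # Closest matches first, so reviewers start with the likeliest duplicates.
    return sorted(flagged, key=lambda item: item[2])


def check_incoming(new_hash, hashes, max_distance=5):
    """Onboarding check: existing items that a new file may duplicate."""
    return sorted(
        item_id for item_id, h in hashes.items()
        if hamming(new_hash, h) <= max_distance
    )


# Hypothetical identifiers and hash values, for illustration only.
index = {"H1": 0b1111, "H2": 0b1110, "H3": 0xFFFF0000FFFF0000}
review_queue = flag_for_review(index)      # pairs for a person to confirm
possible_matches = check_incoming(0b1111, index)
```

The pairwise comparison grows quadratically with collection size, which is workable as a sketch but would likely need an indexing structure for larger holdings. The output is a review queue, not a deletion list: the decision about which copy to keep stays with a person.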
The next test will come when the State Library opens its expanded digitisation lab later in 2026. How it handles incoming files from day one will either set the standard or repeat the mistakes, and every smaller Victorian institution will be watching closely.