Melbourne's public institutions are sitting on a growing administrative headache. Duplicate images (the same photograph, artwork scan, or planning document stored under multiple file names across disconnected systems) are cluttering digital archives at the City of Melbourne, the State Library of Victoria, and dozens of smaller councils, straining storage budgets and undermining search reliability. The problem is not new, but pressure to fix it has sharpened in 2026 as heritage digitisation projects accelerate and public records laws tighten.
The timing matters. Victoria's amended Public Records Act obligations, which came into full effect for local government bodies in early 2026, require councils to demonstrate that digital holdings are de-duplicated, correctly licensed, and retrievable within defined timeframes. For institutions that have spent the past decade digitising physical collections at pace — the State Library of Victoria alone has added hundreds of thousands of items to its online catalogue since 2020 — the administrative backlog is now visible in audit reports and internal reviews.
What Melbourne is actually doing
The City of Melbourne's Digital Services team has been piloting a perceptual hashing tool since March 2026, applied first to the council's open data image repository on data.melbourne.vic.gov.au. The technology compares pixel-level fingerprints across files rather than relying on metadata, catching duplicates that were saved under different file names, resolutions, or compression settings. A second phase, expected to roll out to the council's Swanston Street heritage photo archive by September 2026, will extend the same process to roughly 40,000 legacy TIFF files.
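To make the fingerprint idea concrete, here is a minimal sketch of one common perceptual hash, the "average hash". The article does not say which algorithm the council's tool uses, so this is an assumption chosen for illustration. It works on a plain grid of greyscale pixel values rather than real image files, and the function names are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # greyscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging equal blocks.

    Assumes, for simplicity, that width and height are multiples of `size`.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a (size * size)-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two files whose fingerprints differ by only a few bits are likely the same picture saved at a different resolution or compression level. How many differing bits to tolerate is a tuning decision, and borderline cases would still need human review.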
In Carlton, the Software for Good collective — a not-for-profit digital consultancy based on Lygon Street — has been advising three inner-north councils on open-source de-duplication pipelines that avoid the licensing fees attached to enterprise tools from vendors like OpenText or IBM. The organisation argues that smaller Victorian councils cannot justify five-figure annual software contracts for what is fundamentally a batch-processing problem solvable with existing Python libraries.
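As a rough illustration of the batch-processing argument, the sketch below finds byte-for-byte duplicates using only Python's standard library. It is not the collective's pipeline, which the article does not describe. The function names are hypothetical, and it catches only exact copies, not resized or recompressed images.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> List[List[Path]]:
    """Group files under `root` by content and return groups with more than one file."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("archive"))` on a hypothetical `archive` folder would return lists of paths that share identical contents, whatever their file names.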
Meanwhile, the State Library of Victoria, on La Trobe Street in the CBD, is running a separate audit of its Trove-connected image holdings, coordinating with the National Library of Australia in Canberra after it emerged that some digitised Victorian newspaper front pages had been ingested twice through separate batch uploads in 2021 and 2023.
How that compares globally
Amsterdam's Stadsarchief, the city's municipal archive, completed a system-wide de-duplication project in late 2024, removing more than 1.2 million redundant image files from its 15-million-item digital collection, according to a case study published by the International Council on Archives in January 2025. The project took 18 months and cost approximately €340,000, a figure the archive's published report attributed partly to the need for manual review of borderline matches. London's Wellcome Collection finished a comparable exercise in 2023, reporting that around 8 per cent of its publicly accessible image records had at least one exact duplicate.
Melbourne's effort is smaller in scale but arguably more fragmented. Unlike Amsterdam, which centralised its archive governance under a single municipal authority, Victoria distributes records responsibility across 79 councils, state agencies, and statutory bodies, each with its own IT procurement cycle. That structure means no single project can claim to have solved the problem city-wide. The City of Melbourne's pilot covers its own holdings; it has no jurisdiction over, say, the image database at Melbourne Museum on Nicholson Street in Carlton, which is managed by Museums Victoria under a separate government funding framework.
Singapore's National Archives completed an AI-assisted audit in 2025 that identified duplicates across colonial-era photographic holdings in under six weeks — a timeline Melbourne's practitioners say would be impossible here without first standardising metadata fields across institutions, a step that has been discussed at the Public Record Office Victoria for at least three years without a mandated deadline.
For Melburnians who use public digital archives — genealogists searching the Births, Deaths and Marriages index, architects pulling heritage overlays from council portals, journalists filing freedom of information requests — the practical upshot is that search results will remain noisier than they should be until at least mid-2027, when the City of Melbourne's second audit phase is expected to conclude. In the meantime, the advice from digital archivists is direct: if you retrieve a document from a council portal and the file name includes a string like "_v2" or "_FINAL_FINAL", assume there is an earlier version sitting in the same system, indexed separately, and request it explicitly.
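For readers comfortable with a little scripting, the file-name check can be sketched as below. The pattern covers only the two suffix styles quoted above ("_v2" and "_FINAL"). The function names are hypothetical, and a real portal's naming habits may well differ.

```python
import re
from pathlib import PurePath

# One or more trailing markers such as "_v2" or "_FINAL", separated by "_", "-" or a space.
VERSION_SUFFIX = re.compile(r"(?:[_\- ](?:v\d+|final))+$", re.IGNORECASE)


def looks_versioned(file_name: str) -> bool:
    """True if the name (ignoring its extension) ends in a version-style marker."""
    return bool(VERSION_SUFFIX.search(PurePath(file_name).stem))


def base_name(file_name: str) -> str:
    """Strip trailing version markers so related files can be compared by name."""
    return VERSION_SUFFIX.sub("", PurePath(file_name).stem)
```

A name that `looks_versioned` flags is a prompt to ask for the earlier file. It is not proof that one exists.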