The Daily Melbourne

Melbourne Councils and Archives Race to Fix Duplicate Image Problem Swamping Digital Collections

A quiet but costly data headache has this week pushed several Melbourne institutions to overhaul how they manage digitised photo archives.

By Melbourne News Desk · Published 5 July 2026, 6:02 am


Victoria's public archives and local government digital teams are grappling with a surge in duplicate image files clogging their online collections. At least three Melbourne institutions confirmed this week that they have launched dedicated clean-up programs to tackle the problem before it further inflates storage and search costs.

The issue matters now because several councils and collecting institutions have spent the past three years accelerating digitisation projects, converting physical photos, maps and documents to online formats, partly under the digitisation framework set by the Victorian Government's Public Record Office Victoria. That rapid ingestion of material, often from multiple scanning batches of the same original items, has left collections riddled with near-identical image files that inflate storage bills, confuse researchers and degrade search results for anyone trying to access records through public portals.

Which Melbourne Institutions Are Affected

The City of Melbourne's digital collections team, based at the Library at The Dock in Docklands, flagged the duplicate problem in an internal review completed in late June. The library holds more than 180,000 digitised items, and staff found a measurable share of image assets had been ingested more than once during a 2023-24 scanning push that processed the Hodgkinson and Lyle photographic collections. The team is now running automated deduplication software across the full repository, a process expected to take until late August.

The Victorian Archives Centre on Macarthur Street in East Melbourne is dealing with a related but distinct version of the problem. Archivists there have identified cases where high-resolution master files and lower-resolution access copies were catalogued as separate items rather than as versions of the same record, effectively doubling apparent collection size in the public search interface. Staff are working through roughly 40,000 records flagged by an automated audit tool introduced in May.
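
One way to picture the fix is to group files under a shared record identifier, so that a master file and its access copy count as versions of one item rather than two. The sketch below is illustrative only: the field names (record_id, role) and the sample identifiers are assumptions made for this example, not the Victorian Archives Centre's actual schema or audit tool.

```python
from collections import defaultdict

# Hypothetical catalogue rows in which every file was listed as its own item.
# Field names and identifiers are invented for illustration.
files = [
    {"file": "A0001_master.tif", "record_id": "A0001", "role": "master"},
    {"file": "A0001_access.jpg", "record_id": "A0001", "role": "access"},
    {"file": "A0002_master.tif", "record_id": "A0002", "role": "master"},
]

def group_versions(rows):
    """Group file rows by record_id so each record lists its versions."""
    records = defaultdict(dict)
    for row in rows:
        records[row["record_id"]][row["role"]] = row["file"]
    return dict(records)

records = group_versions(files)
print(len(files), "files ->", len(records), "records")
```

Counting records instead of files is what stops the apparent collection size from doubling in a public search interface.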

Merri-bek City Council, formerly Moreland City Council, also confirmed this week that its local history digitisation project, centred on the Coburg Library Local History Collection, encountered duplicate entries when images from two separate volunteer scanning programs were merged into a single database last year. The council is using the open-source CollectiveAccess platform to reconcile those records.

What the Duplication Actually Costs

Cloud storage is not cheap at institutional scale. Microsoft Azure pricing for archive-tier blob storage, the type commonly used for large image repositories, sits around AUD $2.50 to $3.20 per terabyte per month depending on configuration, or roughly AUD $30 to $38 per terabyte per year. Every extra terabyte of redundant image files therefore adds a recurring charge across the financial year. For a mid-sized council, unmanaged duplication provides no public benefit, and the bill can reach thousands of dollars annually once duplicates run to tens of terabytes or sit in costlier access tiers.
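
The storage arithmetic can be sketched in a few lines. The two monthly rates are the ones quoted above. The 5 TB of duplicates is a hypothetical volume chosen for illustration, and the calculation covers the archive-tier storage charge only, not retrieval, transaction or backup costs.

```python
def annual_cost_aud(terabytes, rate_per_tb_month):
    """Annual storage charge in AUD for a given volume and monthly rate."""
    return terabytes * rate_per_tb_month * 12

# Rates quoted in the article (AUD per terabyte per month).
LOW_RATE, HIGH_RATE = 2.50, 3.20

# Hypothetical volume of redundant files, assumed for this example.
duplicate_tb = 5

low = annual_cost_aud(duplicate_tb, LOW_RATE)
high = annual_cost_aud(duplicate_tb, HIGH_RATE)
print(f"{duplicate_tb} TB of duplicates: AUD ${low:.2f} to ${high:.2f} per year")
```

At the quoted archive-tier rates, 5 TB of duplicates works out to about AUD $150 to $192 a year. The total scales linearly with volume and would be higher for files held in costlier access tiers.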

Beyond storage, duplicates distort how collections are counted for funding applications. Grant programs administered through Creative Victoria and the federal government's GLAM (galleries, libraries, archives and museums) sector funding rounds often ask institutions to report on collection size and access statistics. Inflated item counts from duplicates can create compliance headaches when those numbers are later audited or compared with access logs.

Public Record Office Victoria issued revised guidance in March on file naming conventions and ingest procedures designed to prevent new duplicates entering agency collections. Whether that guidance is being consistently applied across the dozens of councils and statutory bodies it covers remains an open question — several smaller councils contacted for this story had not yet received formal training on the updated protocols.

Institutions currently working through their duplicate backlogs say the practical fix is a two-step process: run a perceptual hashing tool across the image library to surface likely matches, then have a human reviewer confirm before deletion. The City of Melbourne team is using an open-source tool called DuplicateDetector alongside manual spot-checking. Merri-bek is pairing CollectiveAccess's built-in deduplication module with volunteer help from the Friends of Coburg Library group, who have contributed archival knowledge to distinguish genuinely different images that look similar.
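
The two-step process can be sketched as follows. This is a minimal illustration, not the implementation used by DuplicateDetector or CollectiveAccess: it uses average hashing, one simple form of perceptual hash, and assumes each image has already been reduced to a small grayscale grid by an imaging library. The file names, pixel values and distance threshold are hypothetical.

```python
from itertools import combinations

def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the image mean.

    `pixels` is a small grayscale grid (a list of rows), assumed to be
    already downscaled, for example to 8x8, by an imaging library.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def review_queue(images, max_distance=1):
    """List pairs whose hashes are close enough to send to a human reviewer."""
    hashes = {name: average_hash(grid) for name, grid in images.items()}
    return [
        (a, b, hamming(hashes[a], hashes[b]))
        for a, b in combinations(sorted(hashes), 2)
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]

# Hypothetical 2x2 thumbnails: scan_b is a slightly brighter rescan of scan_a.
images = {
    "scan_a.tif": [[10, 200], [220, 30]],
    "scan_b.tif": [[14, 205], [224, 33]],
    "map_c.tif": [[240, 20], [15, 230]],
}

for first, second, distance in review_queue(images):
    print(f"review: {first} vs {second} (distance {distance})")
```

The sketch only builds a review list and deletes nothing, which mirrors the human confirmation step the institutions describe. A looser threshold surfaces more candidate pairs at the cost of more false matches for reviewers to reject.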

The clean-up programs at all three institutions are expected to wrap up before the end of the 2026 calendar year, after which staff plan to publish revised collection counts and update their online finding aids. Researchers using the Public Record Office Victoria search portal or council history databases should expect some metadata to shift as corrections are applied over the coming months.

About this article

Published by The Daily Melbourne

This article was produced by The Daily Melbourne editorial desk and covers news in Melbourne. See our editorial standards for how we use AI.
