The Daily Melbourne

Melbourne's digital archivists tackle duplicate image crisis as week brings fresh urgency to collection cleanup

Libraries, councils and cultural institutions across the city are confronting a growing backlog of duplicate digital images — and a new push to fix it landed this week.

By Melbourne News Desk · Published 5 July 2026, 5:06 am

Photo: Costa Karabelas on Pexels

State Library Victoria confirmed this week it is accelerating a structured audit of its digitised photographic holdings after an internal review identified tens of thousands of duplicate image files across its online catalogue — redundant scans, re-uploaded surrogates and near-identical derivatives that have compounded over more than a decade of digitisation projects. The audit, which the library says will run through to the end of 2026, targets collections held across its Swanston Street reading rooms and its off-site storage at Ballarat Road, Maidstone.

The issue matters now because Victoria is deep in a broader push to make its cultural heritage collections genuinely searchable and accessible online. Duplicate records don't just waste server space — they fragment discovery, return garbled search results and create real confusion for researchers trying to establish which version of an image is authoritative. With the state government having committed funding to expand the Library's digital infrastructure as part of Creative Victoria's 2025–2028 cultural investment framework, getting the underlying data right has become a precondition for further development.

From Swanston Street to Southbank: Who's cleaning up

State Library Victoria is not alone. The City of Melbourne's own digital archives team, based at the Melbourne Town Hall annex on Little Collins Street, has been running what it internally calls a "de-duplication sprint" since 2 June, targeting images held in its heritage photographic register, a collection that spans planning records, infrastructure photography and community documentation accumulated since the early 2000s. Council's digital records unit has flagged the problem as partly a legacy of COVID-era digitisation, when remote working meant multiple staff sometimes scanned and uploaded the same physical items independently.

The Australian Centre for the Moving Image at Federation Square has faced a related but distinct version of the problem. ACMI's collection management team has been reconciling still images extracted from digitised film reels: frames that automated systems flagged as unique but that turn out to be near-duplicates from adjacent moments in a sequence. The centre declined to provide specifics on the scale of its backlog, but its collection database, built on the Axiell EMu platform, has been updated with new similarity-detection parameters as of this financial year.

Why duplicates pile up — and what fixing them costs

Digital preservation specialists point to a predictable pattern. Institutions digitise collections in project-based bursts, often under grant funding with fixed end dates, then move on without systematic reconciliation of what was already online. A 2024 survey by the Digital Preservation Coalition, which has member institutions in Australia including the National Library, found that duplicate or near-duplicate records accounted for between 8 and 22 percent of holdings across mid-sized cultural collections — a range wide enough to suggest the problem is common but poorly measured.

Fixing it is not cheap. Software tools for perceptual hashing — the standard technique for identifying visually similar images even when file names or metadata differ — typically cost between $8,000 and $40,000 annually for enterprise-grade licensing, depending on collection size. Open-source alternatives exist but require significant staff time to configure and maintain. For smaller institutions like the Footscray Community Arts Centre, which has been digitising its thirty-year photographic archive with support from the Maribyrnong City Council, the practical answer has been manual review by trained volunteers working through the collection file by file — slower, but less reliant on infrastructure budgets that don't exist.
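For readers curious about what perceptual hashing involves, the sketch below shows one of its simplest forms, an "average hash". It is an illustration only and is not drawn from any tool the institutions named here are using. It assumes images have already been decoded into grids of grayscale values; the function names, the synthetic test grids and the threshold of 5 differing bits are all hypothetical choices made for the example.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed already decoded)


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downscale to size x size by averaging blocks.

    Assumes the image height and width are exact multiples of `size`.
    """
    bh, bw = len(pixels) // size, len(pixels[0]) // size
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [pixels[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, threshold: int = 5) -> bool:
    """Hypothetical rule: treat images as duplicates if few hash bits differ."""
    return hamming(average_hash(a), average_hash(b)) <= threshold


# Synthetic 16x16 "scans": dark left half, bright right half.
scan = [[40] * 8 + [200] * 8 for _ in range(16)]
# The same picture re-scanned slightly brighter: different bytes, same look.
rescan = [[min(255, v + 10) for v in row] for row in scan]
# A genuinely different picture: dark top half, bright bottom half.
different = [[40] * 16 for _ in range(8)] + [[200] * 16 for _ in range(8)]

print(is_near_duplicate(scan, rescan))     # brighter re-scan of same image
print(is_near_duplicate(scan, different))  # unrelated image
```

Because the hash depends on the overall pattern of light and dark rather than on exact pixel values, the brighter re-scan produces the same hash as the original even though the underlying files would differ byte for byte. Production tools use more robust variants and tuned thresholds, which is part of what the configuration effort described above goes toward.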

The Royal Historical Society of Victoria, headquartered on William Street, announced on Tuesday that it would partner with the University of Melbourne's School of Computing and Information Systems to pilot an automated deduplication workflow on a subset of its glass plate negative scans. The pilot, involving roughly 4,000 images, is expected to run for three months and produce a publicly available methodology report by October 2026.

For researchers and members of the public who use these collections, the immediate practical advice is straightforward: when searching state or council image catalogues online, cross-check accession numbers rather than relying on thumbnail previews alone, since duplicates often carry different identifiers. Institutions working through audits have also noted that some records flagged as duplicates will ultimately be retained as deliberate preservation copies, so not every redundant-looking result is an error. The cleanup is ongoing, and the catalogues will remain imperfect through the second half of this year.

About this article

Published by The Daily Melbourne

This article was produced by The Daily Melbourne editorial desk and covers news in Melbourne. See our editorial standards for how we use AI.
