Thousands of artwork records held across Melbourne's public institutions contain duplicate images — the same photograph of the same painting filed under two, three, sometimes four separate catalogue entries. The problem is not new, but it has become impossible to ignore now that major digitisation drives are pulling collections online for the first time.
The trigger is timing. The City of Melbourne and several state-funded galleries accelerated their digital access programs after 2021, when pandemic closures made physical viewing impossible for months on end. That rush to publish collections online exposed what cataloguers had quietly managed for decades on index cards and spreadsheets: a tangle of redundant image files, mismatched metadata, and records duplicated across successive IT system migrations.
How the Backlog Built Up
The roots go back to at least the mid-1990s, when institutions including the Ian Potter Centre: NGV Australia at Federation Square and the Counihan Gallery in Brunswick began their first digital cataloguing efforts using incompatible systems. When those systems were replaced, typically on 8-to-12-year cycles, records were often migrated without deduplication checks. An image captured in 1997 under one filename could arrive in a 2009 database as a fresh entry, with its original record still sitting in a legacy archive.
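A migration-time check of the kind that was skipped can be sketched in a few lines of Python. This is a minimal illustration, not any institution's actual process: the file names and byte strings are hypothetical, and the approach only catches files whose contents are byte-for-byte identical.

```python
import hashlib
from collections import defaultdict


def group_by_content(files):
    """Group file names by the SHA-256 digest of their bytes."""
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    # Keep only digests shared by more than one file name.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical migration batch: file name -> raw image bytes.
batch = {
    "1997_scan_0042.tif": b"\x49\x49\x2a\x00 pixels-A",
    "2009_import_8817.tif": b"\x49\x49\x2a\x00 pixels-A",
    "2009_import_8818.tif": b"\x49\x49\x2a\x00 pixels-B",
}

duplicates = group_by_content(batch)
for names in duplicates:
    print("Same content under different names:", names)
```

A re-scan or a re-encoded copy of the same photograph would produce a different digest, so a check like this is a first pass rather than a complete audit.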
State Library Victoria on Swanston Street, which holds visual collections spanning more than 170 years, has publicly documented the scale of such challenges in its collection management planning documents. Libraries and galleries nationally have grappled with what archivists call "accumulation drift": each digitisation sprint adds volume faster than quality control processes can absorb it.
Council-funded community arts programs compounded the issue. Brunswick's arts precinct, Footscray's West Space, and various inner-north neighbourhood houses each received digitisation grants at different times through Creative Victoria funding rounds. Because those projects used different contractors and different metadata standards, the resulting image files were rarely interoperable when they were eventually aggregated into broader state collections.
What Deduplication Actually Involves
Fixing duplicate image records is not simply a matter of deleting a file. Each entry may carry different provenance notes, donor information, condition reports, or rights statements. Archivists must compare records field by field before any merge or deletion, and that work is labour-intensive. A single collection of 10,000 objects can require six to twelve months of review at standard staffing levels, according to published guidelines from the Collections Council of Australia, which ceased operating in 2010 but whose technical standards remain a reference point for the sector.
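The field-by-field comparison described above can be illustrated with a short Python sketch. The record fields and values here are invented for the example, and a real collection management system would hold far richer data. The point is only that a merge decision starts from a list of the fields on which two entries disagree.

```python
def compare_records(a, b):
    """Return the fields on which two catalogue records disagree."""
    conflicts = {}
    for field in sorted(set(a) | set(b)):
        left, right = a.get(field), b.get(field)
        if left != right:
            conflicts[field] = (left, right)
    return conflicts


# Two hypothetical entries for the same painting.
record_a = {
    "title": "Harbour at Dusk",
    "donor": "Estate of J. Example",
    "rights": "In copyright",
    "condition": "Good, 2003",
}
record_b = {
    "title": "Harbour at Dusk",
    "donor": None,
    "rights": "In copyright",
    "condition": "Fair, 2015",
}

conflicts = compare_records(record_a, record_b)
for field, (left, right) in conflicts.items():
    print(f"{field}: {left!r} vs {right!r}")
```

In this example the two entries agree on title and rights but differ on donor and condition, which is exactly the information an archivist would need to preserve before deleting either record.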
Software tools have improved the process. Platforms like MuseumPlus and EMu, both in use at major Victorian institutions, now include fuzzy-match algorithms that flag probable duplicates for human review. But the software only flags; it does not decide. A curator still needs to sign off on each resolution. At current staffing levels across Victoria's roughly 140 publicly funded collecting institutions, the backlog will not clear quickly.
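The general idea of flagging probable duplicates for review can be sketched with Python's standard library. This is not how MuseumPlus or EMu work internally; the record identifiers, titles, and the 0.85 threshold are all assumptions chosen for illustration. The sketch scores title similarity and queues close pairs for a person to check, without merging anything.

```python
from difflib import SequenceMatcher
from itertools import combinations


def flag_probable_duplicates(records, threshold=0.85):
    """Return pairs of record IDs whose titles are similar enough to review."""
    flagged = []
    for (id_a, title_a), (id_b, title_b) in combinations(records, 2):
        score = SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()
        if score >= threshold:
            # Nothing is merged here; the pair is only queued for a curator.
            flagged.append((id_a, id_b, round(score, 3)))
    return flagged


# Hypothetical catalogue entries: (record ID, title).
records = [
    ("A1", "Harbour at Dusk"),
    ("B7", "Harbor at Dusk"),
    ("C3", "Portrait of a Woman"),
]

review_queue = flag_probable_duplicates(records)
for id_a, id_b, score in review_queue:
    print(f"Review {id_a} and {id_b} (similarity {score})")
```

A lower threshold catches more spelling variants but sends more false matches to the review queue, which is where the staffing constraint bites.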
The financial dimension matters here. Creative Victoria's 2024-25 budget allocated funding across collection digitisation and access initiatives, and institutions have been under pressure to demonstrate public-access outcomes from that spending. Publishing duplicate-riddled records online undermines those outcomes: a collection that appears to hold 25,000 works may, after deduplication, prove to hold closer to 18,000 distinct objects, a drop of about 28 per cent.
For Melbourne institutions, the practical path forward involves three steps that sector bodies have been pushing for several years. First, adopt a single shared metadata standard: Dublin Core remains the baseline, but SPECTRUM 5.0 is increasingly the benchmark for Australian collecting institutions. Second, run deduplication audits before, not after, any new public-facing database launch. Third, seek recurrent rather than project-based funding for collection management, so the work is not abandoned the moment a grant period ends.
The NGV's ongoing collection redevelopment project, tied to the broader NGV Contemporary build planned for Southbank, has given that institution an opportunity to address its own duplicate records as part of a wholesale system upgrade. Smaller organisations in Fitzroy, Collingwood, and Footscray are watching closely, since they face the same problem with a fraction of the budget.