It is time for open access to move on from institutional repositories

The British Library cyber-attack underlines that HE and research libraries’ technologies and policies put us at too much risk, says Fiona Greig

January 18, 2024

I know that this is not going to be a universally popular view across the sector, but stick with me.

The situation at the British Library is a real dagger to the heart for those of us who support open access. Many of the library’s digital services have been offline for weeks following November’s devastating cyber-attack. Particularly galling is the ongoing inaccessibility of the EThOS archive of 600,000 doctoral dissertations.

EThOS was a catalyst for change when it was launched in 2009, unlocking a significant part of the “research journey” that had been hidden. This landmark service showcases the UK as a nation rich with exceptional thinkers and dedicated to opening knowledge to everyone.

It also demonstrates the importance of national leadership on big digital infrastructure projects. EThOS was the first central UK open initiative that caught the public imagination, and it made an amazing difference to scholars.

We don’t know yet if the actual works have been stolen or “just” the user data. Luckily, the material is electronically available in multiple locations, so the database could be reassembled. Still, what the attack underlines is that higher education and research libraries’ technologies and policies place us at more risk than we can afford.

The “bad guys” are, of course, after IP from universities in general, but what sells best is individual or financial information. Our requirement that people set up accounts before they can use services like EThOS or university repositories means we have tens/hundreds of thousands of “people” records, with registered email addresses and passwords that customers are likely to have used elsewhere, too. So we find ourselves the accidental custodians of extremely valuable information – but we have never really managed it (Shush! Don’t tell anyone!).

Why do we ask for all this? Academics tell us they must know who is using their materials, yet very few look at the unaggregated data. With EThOS, registration is required to digitise and print a thesis, but, again, very few individual citizens request this service: it tends to be institutions who fund digitisations.

Will the sale of our user data on the dark web stop people accessing our open access materials because they have lost trust in us? Have we opened up our institutions to “interest” from the Information Commissioner’s Office? For me, those are the greatest potential impacts of this security breach.

As well as collecting information we don’t need, the technologies used to deliver most UK open access repositories are still running on open source and community developed/supported tools. That is because they date from 15-25 years ago, when the government (via Jisc) put millions of pounds into establishing them. Subsequently, however, institutions have been left to develop, maintain and secure those tools alone. But the “bright minds” who created them have long since moved on, and newly-minted developers are uninterested in learning out-of-date skills. Moreover, I can tell you, as someone who straddles both library and IT, that updating these tools is absent from institutional priority lists and investment strategies.

At the British Library itself, however, there has been investment and technology change, so such a devastating cyber-attack must be a wake-up call to the whole sector and our funders. We are now understood among cyber-criminals to be very soft targets with a lot of profitable information basically just lying around.

It is notable that Richard Poynder, who has advocated the “idealist”, non-commercial view of open access, has recently said the movement has failed because it hasn’t solved issues around affordability and equity. I have always been a pragmatist: I am fully committed to getting our material “out there” but have questioned the way it has been approached technologically and the institutional risks being carried. I am not going to go into details here; I really don’t want to be the one opening the already unlocked back door. But the risks of unsupported and obsolete operating systems and databases are manifold in this sector.

OA is not dead: we can’t allow it to be! But we may need to accept that the golden age of open software, built around institutional cottage industries, is over. And that may prompt us to revisit the underpinning rationale.

UK Research and Innovation keeps talking about creating “transformational infrastructure” to reshape the research journey, including making research open. Why not be brave? Why not move from institutional repositories to a UK knowledge store, developed – Shock! Horror! – with a commercial partner, using Software as a Service (SaaS) tools built to maximise the power of modern technologies and hosted in safe and secure data centres?

Jisc has tried in the past few years to deliver “next-generation repositories, research data repositories and digital archiving”, but, unfortunately, it has failed. It is too hard and too expensive to deliver the existing vision and OA mindset.

If the cyber-attack on the British Library has inadvertently allowed the institution to perform its principal duty and lead the sector to a fundamental shift, then something truly positive will have come out of this awful situation.

Fiona Greig is director of knowledge and digital services at the University of Winchester. She writes in a personal capacity.

Reader's comments (4)

This is not a surprise. You would not ask an amateur electrician to rewire your library electricals. Nor would you ask an amateur builder to repair your roof. So why did you expect someone who writes programs as a hobby at home to build secure, reliable systems? The fact that some open source communities have built widely used tools is no guarantee that the next tool will be secure and robust. Yet, like so many others, you [the BL] seem to have bought into a mantra that OS is "good" and that, because it's only software, it could be made safely. Commercial software is not defect-free; however, if you have a contract in place, you can at least investigate and stipulate requirements, including security and robustness.
You mean like the Post Office's contract with Fujitsu? Um, yeah... no.
The majority of open source developers are not hobbyists. Most of your servers and many research computing systems attest to that. From the description, the failure here is one of investment: JISC put in some money and then expected that universities would pay for ongoing development but they did not. Following the example above, I submit that that is like paying the deposit to the builders and expecting them to finish the rest of the job for free. Or, more relevantly given the suggestions here, assuming that your commercial software supplier will only take your money once and then offer free updates for life. Of course FOSS doesn’t solve every problem but the idea that a commercial tool is de facto better because it is paid for is ridiculous: look no further than many of the tools you use every day.
The central argument of this article is entirely flawed. The proposed solution, a SaaS architecture developed by a commercial partner as a non-open source solution, would be an expensive endeavour that would undo two decades of progress in this space. The key issue is not the infrastructure of repositories, it is not open source, it is not that repositories cannot run on new up-to-date software and use new tools. The whole issue is about the insufficient investment that goes into open scholarly infrastructures. Infrastructure developed on open source can be significantly cheaper, more secure, robust, reliable and appealing, but one cannot expect that it is entirely free. We should be asking ourselves the question of why people like Fiona Greig, the author of this post, argue for something so entirely ill-informed instead of bringing to light that the spending on open scholarly infrastructure doesn't account even for 10% of library budgets.
