The US National Science Foundation is making a major new effort to improve academic research into artificial intelligence, pulling major technology companies into a partnership that it hopes will soften their longstanding reluctance to share their vast and valuable datasets.
The project, known as the National Artificial Intelligence Research Resource (NAIRR), is the opening stage of the NSF’s effort to comply with President Joe Biden’s call last October for an aggressive government-led campaign to figure out ways the nation can fully exploit the astounding potential of AI while guarding against its staggering risks.
The NAIRR initiative also comes amid a history of antagonisms between major companies such as Facebook and Google – which have built globally dominant multibillion-dollar industries based on their massive collections of detailed data on individuals and their thoughts and behaviours – and the academic researchers who want access to that information to precisely identify new opportunities and threats.
Among recent examples, the social media platform long known as Twitter sued a group of independent researchers who published an investigation into its tolerance of hate speech. Facebook disabled all accounts associated with New York University’s Ad Observatory Project, including those of faculty, after the project persisted with automated attempts to monitor the site. And one of the nation’s top media disinformation experts, Joan Donovan, accused Harvard University of ousting her to please Facebook after its owner offered the Ivy League institution a $500 million (£400 million) donation.
Such instances of corporate obstruction have been accompanied by a drumbeat of expert warnings about AI’s potential to bring the world untold magnitudes of benefit and danger. That has left Mr Biden determined to quickly boost cooperation on research into AI. “We need to govern this technology,” the president said in ordering a government-wide response in October. “There’s no other way around it.”
As the US government’s chief funder of academic research outside medical fields, the NSF was tasked under Mr Biden’s plan with creating NAIRR and assembling an array of academic and private-sector partners for it. Those agreeing to join include industry titans Google, Facebook, Microsoft, Intel and IBM.
At its initial stage, however, NAIRR participation does not require – and the NSF did not request – that the companies make any new commitments on data sharing.
NSF officials said that is largely because they have just begun establishing the outlines of NAIRR and did not feel it was necessary or even wise to jump immediately into debates over controversial details of data sharing. Instead, NSF officials said in interviews, they hope that the NAIRR framework will bring corporate and academic partners together in a way that gradually builds up their trust to the point where such questions become easier to solve.
“That is going to be a continuing area” of attention for NAIRR, said one NSF official. “It’s absolutely key to make sure that researchers are getting access to the datasets that can serve their work.”
Some researchers involved in such work said they were eager for better access to corporate datasets but understood the NSF’s position at this point.
One of them, Laura Edelson, an assistant professor of computer sciences at Northeastern University, was at NYU in 2021 when she was among the researchers penalised by Facebook over the Ad Observatory Project. She has been pushing for federal legislation that would require such companies to share more of their data, but she said she recognised the caution that the NSF must show in matters that could anger the ever-more-partisan assembly of lawmakers who vote on its budget.
“In an ideal world, this wouldn’t be a politicised issue,” Professor Edelson said. “But as to who should push back – the NSF really can’t.”
Deen Freelon, a professor of communication at the University of Pennsylvania, said he, too, sees that a firm demand for data access at this stage of the industry-university collaboration could short-circuit NAIRR’s chances of long-term success. “I don’t doubt that a hard line on data sharing might have resulted in the companies walking away,” he said.
And the urgency of any such action in the US may be diminishing in any case, said Kate Cell, the senior climate campaign manager at the Union of Concerned Scientists. That’s because the European Union has a new policy that requires major online companies to make data available to outside researchers, Ms Cell said. The companies, therefore, hopefully will “have to do this anyway”, she said.
The expansion of data access beyond the companies is critical, said Brandi Geurkink, executive director of the Coalition for Independent Technology Research, which represents academics and others defending the right to study the effects of technology on society.
“Tech corporations have time and time again either resisted or sabotaged voluntary efforts to provide researchers with data,” Ms Geurkink said.