100 leading AI scientists map route to more ‘trustworthy, reliable, secure’ AI

The debate over the risks and harms of artificial intelligence often focuses on what governments can or should do. However, just as important are the choices that AI researchers themselves make.
This week, in Singapore, more than 100 scientists from around the world proposed guidelines for how researchers should approach making AI more “trustworthy, reliable, and secure.”
Also: A few secretive AI companies could crush free society, researchers warn
The recommendations come at a time when the giants of generative AI, such as OpenAI and Google, have increasingly reduced disclosures about their AI models, so the public knows less and less about how the work is conducted.
The guidelines grew out of an exchange among the scholars last month in Singapore, held in conjunction with one of the most prestigious conferences on AI, the International Conference on Learning Representations, which was held in Asia for the first time.
The document, “The Singapore Consensus on Global AI Safety Research Priorities,” is posted on the website of the Singapore Conference on AI, a second AI conference taking place this week in Singapore.
Among the luminaries who helped draft the Singapore Consensus are Yoshua Bengio, founder of Canada’s AI institute Mila; Stuart Russell, distinguished professor of computer science at UC Berkeley and an expert on “human-centered AI”; Max Tegmark, head of the think tank the Future of Life Institute; and representatives from the Massachusetts Institute of Technology, Google’s DeepMind unit, Microsoft, the National University of Singapore, and China’s Tsinghua University and Chinese Academy of Sciences, among others.
Presenting the work, Singapore’s Minister for Digital Development and Information, Josephine Teo, made the case that research must have guidelines by noting that people cannot vote for the kind of AI they want.
“In democracies, general elections are a way for citizens to choose the party that forms the government and to make decisions on their behalf,” said Teo. “But in AI development, citizens do not get to make a similar choice. However democratising we say the technology is, citizens will be at the receiving end of AI’s opportunities and challenges, without much say over who shapes its trajectory.”
Also: Google’s Gemini continues the dangerous obfuscation of AI technology
The paper lays out three categories for researchers to consider: how to identify risks, how to build AI systems in ways that avoid risks, and how to maintain control over AI systems, meaning ways to monitor them and to intervene when concerns arise.
“Our goal is to enable more impactful R&D efforts to rapidly develop safety and evaluation mechanisms and foster a trusted ecosystem where AI is harnessed for the public good,” the authors write in the preface to the report. “The motivation is clear: no organisation or country benefits when AI incidents occur or malicious actors are enabled, as the resulting harm would damage everyone collectively.”
On the first score, assessing potential risks, the scholars advised the development of “metrology,” the measurement of potential harm. They write that there is a need for “quantitative risk assessment tailored to AI systems to reduce uncertainty and the need for large safety margins.”
There’s a need to allow outside parties to monitor AI research and development for risk, the scholars note, balanced against protecting corporate intellectual property. That includes developing “secure infrastructure that enables thorough evaluation while protecting intellectual property, including preventing model theft.”
Also: Stuart Russell: Will we choose the right objective for AI before it destroys us all?
The development section concerns how to make AI trustworthy, reliable, and secure “by design.” To do so, there’s a need to develop “technical methods” that can specify what is intended from an AI program and also rule out what should not happen, the “undesired side effects,” the scholars write.
The training of neural networks then needs to advance so that the resulting AI programs are “guaranteed to meet their specifications,” they write. That includes training aimed at, for example, “reducing confabulation” (often called hallucination) and “increasing robustness against tampering,” such as attempts to crack an LLM with malicious prompts.
Finally, the control section of the paper covers both how to extend current computer security measures and how to develop new techniques to avoid runaway AI. For example, conventional computer controls, such as off-switches and override protocols, need to be extended to handle AI programs. Scientists also need to design “new techniques for controlling very powerful AI systems that may actively undermine attempts to control them.”
The paper is ambitious, which is appropriate given rising concern about the risk from AI, particularly agentic AI, as it connects to more and more computer systems.
Also: Multimodal AI poses new safety risks, creates CSEM and weapons info
As the scientists acknowledge in the introduction, research on safety won’t be able to keep up with the rapid pace of AI unless more investment is made.
“Given that the state of science today for building trustworthy AI does not fully cover all risks, accelerated investment in research is required to keep pace with commercially driven growth in system capabilities,” write the authors.
Writing in Time magazine, Bengio echoes the concerns about runaway AI systems. “Recent scientific evidence also demonstrates that, as highly capable systems become increasingly autonomous AI agents, they tend to display goals that were not programmed explicitly and are not necessarily aligned with human interests,” he writes.
“I’m genuinely unsettled by the behavior unrestrained AI is already demonstrating, in particular self-preservation and deception.”