
With GPT-4, OpenAI opts for secrecy versus disclosure



A long-standing tradition in artificial intelligence research is to disclose the technical details of software in research papers so that others in the field can understand and learn from the programs.

That tradition was upended on Tuesday with the release of OpenAI’s GPT-4 program, the latest in a line of programs that form the heart of the wildly popular ChatGPT chatbot.


In the GPT-4 technical report published Tuesday, alongside the blog post by OpenAI, the firm states that it is refraining from offering technical details because of competitive and safety considerations. 

“Given both the competitive landscape and the safety implications of large-scale models like GPT-4,” it writes, “this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.”

The term “architecture” refers to the fundamental construction of an AI program, meaning how its artificial neurons are arranged, and it is the essential element of any AI program. The “size” of a program is the number of neural “weights,” or parameters, it uses, a key element distinguishing one program from another.
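To make “size” concrete, here is a minimal sketch of how parameters are counted in a plain, fully connected neural network. The layer widths and the helper name `count_parameters` are illustrative assumptions; they describe neither GPT-4 nor any other OpenAI program, whose size the report does not disclose.

```python
def count_parameters(layer_widths):
    """Count weights and biases in a plain fully connected network.

    layer_widths is a hypothetical list such as [4, 8, 2]: 4 inputs,
    one hidden layer of 8 neurons, and 2 outputs.
    """
    total = 0
    for n_in, n_out in zip(layer_widths, layer_widths[1:]):
        total += n_in * n_out  # one weight per connection between layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


if __name__ == "__main__":
    print(count_parameters([4, 8, 2]))  # 4*8 + 8 + 8*2 + 2 = 58
```

The same bookkeeping, applied to far wider and deeper arrangements of neurons, is what produces the parameter counts that labs normally report, which is why withholding the number leaves outsiders unable to compare programs.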


Without such details, the GPT-4 program is a complete enigma. The research paper in that sense discloses none of the research.

The paper offers only two sentences describing in very broad terms how the program is constructed. 

“GPT-4 is a Transformer-style model pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers. The model was then fine-tuned using Reinforcement Learning from Human Feedback.” 

Neither sentence offers anything a casual observer wouldn’t already have guessed about the program.
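For readers unfamiliar with the phrase, “predict the next token” can be illustrated with a deliberately simple stand-in. The sketch below is not a Transformer and is not OpenAI’s method; it is a word-pair frequency counter, with hypothetical function names and a made-up training sentence, that shows only the shape of the task: given what came before, guess what comes next.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    tokens = text.split()  # whitespace split stands in for a real tokenizer
    follow_counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        follow_counts[current][following] += 1
    return follow_counts


def predict_next(follow_counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = follow_counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran"
    model = train_bigram_model(corpus)
    print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A program like GPT-4 replaces the frequency table with a large neural network and looks at far more than one preceding word, but how it does so is exactly what the report declines to describe.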


The lack of disclosure is a break with the habits of most researchers in AI. Other research labs often post not only detailed technical information but also source code, so that other researchers can duplicate their results.

It is at odds, moreover, with even OpenAI’s own limited disclosure habits.

GPT-4, as its name suggests, is the fourth version of what’s known as a “generative pre-trained transformer,” a program built to manipulate human language. When the very first version of the program debuted in 2018, OpenAI did not offer source code. The company did, however, describe in detail how it composed the various working parts of GPT-1, that is, its architecture.

That technical disclosure allowed many researchers to reason about the functioning of the program even if they couldn’t duplicate its construction.

Figure: GPT-1 system diagram (image: OpenAI). GPT-1 was explained in 2018 in a system diagram that let researchers understand key properties of the program. No such description appears in GPT-4’s technical paper, nor do any descriptive phrases give away its architecture or other key features.

With GPT-2, released on February 14, 2019, OpenAI not only didn’t offer source code, it also restricted distribution of the finished program. The company argued that the program was too capable to risk a release that could let malicious parties use it for harmful ends.

“Due to our concerns about malicious applications of the technology, we are not releasing the trained model,” said OpenAI. 


While not releasing code or trained models, OpenAI researchers Alec Radford and team did describe, in somewhat less detail than they had for the prior version, how they had modified the first GPT.

In 2020, when OpenAI released the third GPT, Radford and team again withheld source code, and they did not provide any downloads of the program at all, instead resorting to a cloud service with a waitlist. They did so, they said, both to limit GPT-3’s use by bad actors and to make money by charging a fee for access.

Despite that restriction, OpenAI did provide a set of technical specifications that afforded others insight into how GPT-3 was a big advance over the prior two versions. 

Set against that track record, the GPT-4 paper marks a new milestone in lack of disclosure. The decision not only to withhold source code and withhold the finished program, but also to withhold technical details that would allow outside researchers to guess about the composition of the program, is a new kind of omission.


While lacking in technical detail, the GPT-4 paper, 98 pages long, is novel in a different way. It breaks ground in acknowledging the enormous resources marshaled to make the program operate. 

In place of the usual author citations on the front page, the technical report contains three pages of attribution at the end, citing hundreds of contributors, including everyone at OpenAI right down to the finance department:

“We also acknowledge and thank every OpenAI team member not explicitly mentioned above, including the amazing people on the executive assistant, finance, go to market, human resources, legal, operations and recruiting teams. From hiring everyone in the company, to making sure we have an amazing office space, to building the administrative, HR, legal, and financial structures that allow us to do our best work, everyone at OpenAI has contributed to GPT-4.”


There is a suggestion in the paper that OpenAI may offer further disclosure at some unspecified time, and is perhaps still committed to scientific advancement through transparency:

“We are committed to independent auditing of our technologies, and shared some initial steps and ideas in this area in the system card accompanying this release. We plan to make further technical details available to additional third parties who can advise us on how to weigh the competitive and safety considerations above against the scientific value of further transparency.”


