Breakthrough suggests that the technology underlying ChatGPT and Bard can generate knowledge that goes beyond what humans currently know.
Researchers in artificial intelligence say they have made the world's first scientific discovery using a large language model, a development that shows the technology underpinning ChatGPT and related programs can generate information that goes beyond existing human knowledge.
The discovery came from Google DeepMind, where researchers are investigating whether large language models, which power modern chatbots such as OpenAI's ChatGPT and Google's Bard, can do more than repackage information learned in training and instead provide genuinely new insights.
“There was no indication when we started the project that it would produce something genuinely new,” said Pushmeet Kohli, DeepMind’s head of AI for science. “As far as we know, this is the first time that a genuine, new scientific discovery has been made by a large language model.”
LLMs, or large language models, are powerful neural networks that learn patterns of language, including computer code, from massive volumes of text and other data. Since its hurried launch last year, ChatGPT has debugged faulty software and cranked out everything from college essays and trip itineraries to Shakespeare-style poems about climate change.
However, while chatbots are highly popular, they do not generate new knowledge and are prone to confabulation, producing answers that, like those of the finest pub bores, are fluent and plausible but deeply flawed.
To create "FunSearch," short for "searching in the function space," DeepMind used an LLM to write candidate solutions to problems in the form of computer programs. The LLM is paired with an "evaluator" that ranks the programs by how well they perform. The best programs are then fed back to the LLM for improvement, so the system gradually evolves weak programs into stronger ones capable of discovering new knowledge.
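The evolve-and-evaluate loop described above can be sketched in a few lines. This is only an illustrative toy, not DeepMind's implementation: the `propose` function stands in for the LLM (here it merely nudges a number), and the evaluator scores each candidate so the pool of "programs" improves over time.

```python
import random

def funsearch_loop(propose, evaluate, seeds, rounds=200, pool_size=4):
    """Minimal FunSearch-style loop (a sketch): keep a pool of the
    best-scoring candidates, ask the proposer (the LLM's role) to
    improve one of them, and re-score the result."""
    pool = [(evaluate(p), p) for p in seeds]
    for _ in range(rounds):
        pool.sort(key=lambda t: t[0], reverse=True)
        pool = pool[:pool_size]              # keep only the top candidates
        parent = random.choice(pool)[1]      # sample a strong parent
        child = propose(parent)              # the LLM would rewrite a program here
        score = evaluate(child)
        if score is not None:                # discard invalid candidates
            pool.append((score, child))
    return max(pool)                         # (best_score, best_candidate)

# Toy stand-ins: "programs" are numbers, the "LLM" nudges them, and the
# evaluator rewards proximity to a target value of 10.0.
random.seed(0)
best_score, best = funsearch_loop(
    propose=lambda p: p + random.uniform(-1, 1),
    evaluate=lambda p: -abs(p - 10.0),
    seeds=[0.0, 5.0],
)
```

The key design choice, as in FunSearch itself, is that selection pressure comes entirely from the evaluator: the proposer never needs to know what a good solution looks like, only how to vary an existing one.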
The researchers assigned FunSearch to two puzzles. The first was the cap set problem, a long-standing and somewhat obscure challenge in pure mathematics. It asks for the largest set of points in a particular kind of space such that no three of the points lie on a straight line. FunSearch produced programs that generate new large cap sets, exceeding the best that mathematicians had previously found.
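In the finite setting this problem is posed in (vectors with entries 0, 1, 2, i.e. the space F_3^n), three distinct points lie on a line exactly when they sum to zero in every coordinate mod 3, which makes the cap set property easy to verify mechanically. A minimal checker, as a sketch:

```python
from itertools import combinations

def is_cap_set(points):
    """Return True if no three distinct points lie on a line of F_3^n,
    i.e. no three of them sum to zero in every coordinate mod 3."""
    for a, b, c in combinations(points, 3):
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

# In F_3^2 these four points form a cap set...
cap = [(0, 0), (0, 1), (1, 0), (1, 1)]
# ...but adding (2, 2) completes the line through (0, 0) and (1, 1).
print(is_cap_set(cap))               # True
print(is_cap_set(cap + [(2, 2)]))    # False
```

Automatic checks like this are what let FunSearch's evaluator score candidate programs without human review.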
The second problem was bin packing, which involves finding the best way to place items of various sizes into containers. The same logic applies to physical objects, such as the most efficient way to arrange boxes in a shipping container, but also to other domains, such as scheduling computing jobs in datacenters. Typically, the problem is tackled with simple heuristics: place each item into the first bin with room for it, or into the bin with the least remaining space where the item still fits. According to findings published in Nature, FunSearch discovered a superior technique that avoided leaving small gaps that were unlikely ever to be filled.
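The two classic heuristics mentioned above are usually called first fit and best fit. The sketch below shows those baselines, not the evolved heuristic FunSearch found:

```python
def first_fit(items, capacity):
    """Place each item into the first bin with enough remaining space,
    opening a new bin when none fits. Returns the number of bins used."""
    bins = []  # remaining free space in each open bin
    for item in items:
        for i, free in enumerate(bins):
            if item <= free:
                bins[i] -= item
                break
        else:
            bins.append(capacity - item)
    return len(bins)

def best_fit(items, capacity):
    """Place each item into the feasible bin with the least space left,
    which tends to avoid leaving awkward small gaps."""
    bins = []
    for item in items:
        candidates = [i for i, free in enumerate(bins) if item <= free]
        if candidates:
            tightest = min(candidates, key=lambda i: bins[i])
            bins[tightest] -= item
        else:
            bins.append(capacity - item)
    return len(bins)

items = [4, 8, 1, 4, 2, 1]
print(first_fit(items, 10), best_fit(items, 10))  # both pack these into 2 bins
```

Because both heuristics are greedy one-pass rules, a learned heuristic has room to beat them by anticipating which gaps will ever be fillable.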
“In the last two or three years, there have been some exciting examples of human mathematicians collaborating with AI to obtain advances on unsolved problems,” said Sir Tim Gowers, a Cambridge University professor of mathematics who was not involved in the research. “This work could provide us with another very interesting tool for such collaborations, allowing mathematicians to efficiently search for clever and unexpected constructions. Even better, these constructions are interpretable by humans.”
Researchers are now investigating the range of scientific challenges that FunSearch can address. A fundamental constraint is that the problems must have solutions that can be verified automatically, which rules out many questions in biology, where hypotheses are frequently validated through lab experiments.
Computer programmers may feel the most direct impact. Over the last 50 years, coding has improved greatly as humans have developed increasingly specialized algorithms. “This is actually going to be transformational in how people approach computer science and algorithmic discovery,” Kohli said in a statement. “For the first time, we’re seeing LLMs not taking over, but definitely assisting in pushing the boundaries of what is possible in algorithms.”
“What I find really exciting, even more than the specific results we found, is the prospects it suggests for the future of human-machine interaction in math,” said Jordan Ellenberg, professor of mathematics at the University of Wisconsin-Madison and co-author on the article.
“Rather than producing a solution, FunSearch produces a program that finds the solution. A solution to one problem may not give me any insight into how to solve other, related problems. A program that finds the solution, on the other hand, is something a human can read and understand, hopefully generating ideas for the next challenge, and the next, and the next.”