Network and data connection on a dark blue background.

An international team of scientists, including from the University of Cambridge, have launched a new research collaboration that will leverage the same technology behind ChatGPT to build an AI-powered tool for scientific discovery.

While ChatGPT deals in words and sentences, the team’s AI will learn from numerical data and physics simulations from across scientific fields to aid scientists in modelling everything from supergiant stars to the Earth’s climate.

The team launched the initiative, called Polymathic AI, earlier this week, alongside the publication of a series of papers on the arXiv.org open access repository.

“This will completely change how people use AI and machine learning in science,” said Polymathic AI principal investigator Shirley Ho, a group leader at the Flatiron Institute’s Center for Computational Astrophysics in New York City.

The idea behind Polymathic AI “is similar to how it’s easier to learn a new language when you already know five languages,” said Ho.

Starting with a large, pre-trained model, known as a foundation model, can be both faster and more accurate than building a scientific model from scratch. That can be true even if the training data isn’t obviously relevant to the problem at hand.

“It’s been difficult to carry out academic research on full-scale foundation models due to the scale of computing power required,” said co-investigator Miles Cranmer, from Cambridge’s Department of Applied Mathematics and Theoretical Physics and Institute of Astronomy. “Our collaboration with Simons Foundation has provided us with unique resources to start prototyping these models for use in basic science, which researchers around the world will be able to build from – it’s exciting.”

“Polymathic AI can show us commonalities and connections between different fields that might have been missed,” said co-investigator Siavash Golkar, a guest researcher at the Flatiron Institute’s Center for Computational Astrophysics. “In previous centuries, some of the most influential scientists were polymaths with a wide-ranging grasp of different fields. This allowed them to see connections that helped them get inspiration for their work. With each scientific domain becoming more and more specialised, it is increasingly challenging to stay at the forefront of multiple fields. I think this is a place where AI can help us by aggregating information from many disciplines.”

The Polymathic AI team includes researchers from the Simons Foundation and its Flatiron Institute, New York University, the University of Cambridge, Princeton University and the Lawrence Berkeley National Laboratory. The team includes experts in physics, astrophysics, mathematics, artificial intelligence and neuroscience.

Scientists have used AI tools before, but they’ve primarily been purpose-built and trained using relevant data. “Despite rapid progress of machine learning in recent years in various scientific fields, in almost all cases, machine learning solutions are developed for specific use cases and trained on some very specific data,” said co-investigator Francois Lanusse, a cosmologist at the Centre national de la recherche scientifique (CNRS) in France. “This creates boundaries both within and between disciplines, meaning that scientists using AI for their research do not benefit from information that may exist, but in a different format, or in a different field entirely.”

Polymathic AI’s project will learn using data from diverse sources across physics and astrophysics (and eventually fields such as chemistry and genomics, its creators say) and apply that multidisciplinary savvy to a wide range of scientific problems. The project will “connect many seemingly disparate subfields into something greater than the sum of their parts,” said project member Mariel Pettee, a postdoctoral researcher at Lawrence Berkeley National Laboratory.

“How far we can make these jumps between disciplines is unclear,” said Ho. “That’s what we want to do – to try and make it happen.”

ChatGPT has well-known limitations when it comes to accuracy (for instance, the chatbot says 2,023 times 1,234 is 2,497,582 rather than the correct answer of 2,496,382). Polymathic AI’s project will avoid many of those pitfalls, Ho said, by treating numbers as actual numbers, not just characters on the same level as letters and punctuation. The training data will also use real scientific datasets that capture the physics underlying the cosmos.
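
The distinction between numbers as characters and numbers as values can be illustrated with a short Python sketch. This is a toy example and not the project's actual encoding scheme, which the article does not describe. The function names `char_tokens` and `numeric_tokens` are hypothetical and introduced here only for illustration.

```python
# Toy illustration only: this is NOT Polymathic AI's encoding scheme.
# It contrasts text split into characters with text whose numbers are
# parsed into real numeric values.

def char_tokens(text: str) -> list[str]:
    """Split text into single characters, so digits sit alongside
    letters and punctuation with no numeric meaning."""
    return list(text)

def numeric_tokens(text: str) -> list[object]:
    """Split on whitespace and parse number-like words into ints."""
    tokens: list[object] = []
    for word in text.split():
        digits = word.replace(",", "")  # drop thousands separators
        tokens.append(int(digits) if digits.isdigit() else word)
    return tokens

expression = "2,023 times 1,234"

as_chars = char_tokens(expression)       # 17 separate characters
as_numbers = numeric_tokens(expression)  # [2023, 'times', 1234]

# With real numeric values, the product is computed exactly.
product = as_numbers[0] * as_numbers[2]
print(product)            # 2496382
print(2497582 - product)  # 1200, the size of the error quoted above
```

In the character view, "2,023" is five unrelated symbols. In the numeric view, it is a single quantity that supports exact arithmetic.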

Transparency and openness are a big part of the project, Ho said. “We want to make everything public. We want to democratise AI for science in such a way that, in a few years, we’ll be able to serve a pre-trained model to the community that can help improve scientific analyses across a wide variety of problems and domains.”



The text in this work is licensed under a . Images, including our videos, are Copyright © University of Cambridge and licensors/contributors as identified. All rights reserved. We make our image and video content available in a number of ways – as here, on our main website under its Terms and conditions, and on a range of channels including social media that permit your use and sharing of our content under their respective Terms.