arxiv:2502.16457

Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge

Published on Feb 23

Submitted by heegyu on Feb 24


Authors:

Heegyu Kim ,

Abstract

Materials synthesis is vital for innovations such as energy storage, catalysis, electronics, and biomedical devices. Yet, the process relies heavily on empirical, trial-and-error methods guided by expert intuition. Our work aims to support the materials science community by providing a practical, data-driven resource. We have curated a comprehensive dataset of 17K expert-verified synthesis recipes from open-access literature, which forms the basis of our newly developed benchmark, AlchemyBench. AlchemyBench offers an end-to-end framework that supports research in large language models applied to synthesis prediction. It encompasses key tasks, including raw materials and equipment prediction, synthesis procedure generation, and characterization outcome forecasting. We propose an LLM-as-a-Judge framework that leverages large language models for automated evaluation, demonstrating strong statistical agreement with expert assessments. Overall, our contributions offer a supportive foundation for exploring the capabilities of LLMs in predicting and guiding materials synthesis, ultimately paving the way for more efficient experimental design and accelerated innovation in materials science.


Community

heegyu (paper author, paper submitter), about 21 hours ago

Towards Fully-Automated Materials Discovery: A New Era in Materials Science

We are thrilled to share our latest research, "Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge", which represents a significant step forward in the field of materials science.

The Challenge

Materials synthesis is the backbone of innovations in energy storage, catalysis, electronics, and biomedical devices. However, the traditional trial-and-error approach is time-consuming and heavily reliant on expert intuition. This inefficiency has long hindered progress in the field.

Our Solution

To address these challenges, we introduce Open Materials Guide (OMG), a dataset of 17,000+ high-quality, expert-verified synthesis recipes extracted from open-access literature. Building on this dataset, we developed AlchemyBench, the first end-to-end benchmark for evaluating machine learning models on materials synthesis prediction.

Key features of AlchemyBench include:

Raw Materials & Equipment Prediction: Models predict essential components for synthesis.
Synthesis Procedure Generation: Automated generation of step-by-step synthesis workflows.
Characterization Outcome Forecasting: Models forecast the characterization results expected from a synthesis.
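To make the three tasks concrete, here is a minimal sketch of how one recipe could be split into per-task reference answers. The `SynthesisRecipe` class, its field names, the `split_for_tasks` helper, and the placeholder values are all hypothetical illustrations, not the actual OMG schema or AlchemyBench API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SynthesisRecipe:
    """Hypothetical record layout; field names are assumptions, not the OMG schema."""
    target_material: str
    raw_materials: List[str] = field(default_factory=list)
    equipment: List[str] = field(default_factory=list)
    procedure: List[str] = field(default_factory=list)
    characterization: List[str] = field(default_factory=list)


def split_for_tasks(recipe: SynthesisRecipe) -> Dict[str, object]:
    """Map one recipe onto the three benchmark tasks as reference answers."""
    return {
        "raw_materials_and_equipment": {
            "raw_materials": recipe.raw_materials,
            "equipment": recipe.equipment,
        },
        "procedure_generation": recipe.procedure,
        "characterization_forecasting": recipe.characterization,
    }


# Toy placeholder values, not taken from the dataset.
example = SynthesisRecipe(
    target_material="material X",
    raw_materials=["precursor A", "solvent B"],
    equipment=["furnace"],
    procedure=["mix precursor A with solvent B", "heat in furnace"],
    characterization=["measurement result placeholder"],
)
tasks = split_for_tasks(example)
print(sorted(tasks))
```

Keeping the reference answers keyed by task makes it easy to score a model on each task separately while still evaluating it on the same underlying recipe.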

A Breakthrough Framework: LLM-as-a-Judge

Our research also introduces the LLM-as-a-Judge framework, which uses large language models (LLMs) to assess synthesis predictions. This framework demonstrates strong statistical agreement with expert evaluations, offering a scalable alternative to costly human assessments.
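One common way to quantify agreement between an LLM judge and human experts is a rank correlation over scores assigned to the same predictions. The sketch below computes Spearman's rank correlation in plain Python. The choice of Spearman is an assumption made for illustration, since this post does not name the statistic used in the paper, and the scores are toy values rather than results from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy scores for five predictions (illustrative only).
expert_scores = [4.0, 2.5, 3.0, 5.0, 1.0]
judge_scores = [3.5, 2.0, 3.0, 4.5, 1.5]
rho = spearman(expert_scores, judge_scores)
print(f"Spearman rho = {rho:.3f}")
```

A rank correlation only checks that the judge orders predictions the way experts do. It does not check that the absolute scores match, so it is often reported alongside other agreement measures.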

Why It Matters

Our contributions pave the way for:

Accelerated experimental design.
Enhanced reproducibility in materials research.
A deeper understanding of how AI can transform scientific discovery.

Open Access for Collaboration

To foster collaboration and innovation, we have made our dataset and code openly available to the research community. You can explore them here: GitHub Repository.

Join Us in Shaping the Future

This work marks a pivotal moment in materials science, blending AI and data-driven approaches to revolutionize how we discover and synthesize new materials. We invite researchers, scientists, and enthusiasts to join us in exploring the potential of fully automated materials discovery.
