Retrieval of Synthesis Parameters of Polymer Nanocomposites Using LLMs
Abstract
Automated materials synthesis requires historical data, but extracting detailed data and metadata from publications is challenging. We developed initial strategies for using large language models for rapid, autonomous data extraction from materials science articles in a format curatable by a materials database. We used the sub-domain of polymer nanocomposites as our example use case and demonstrated a proof-of-concept case study via manual validation. We used Claude 2 chat and the OpenAI GPT-3.5 and GPT-4 APIs to extract characterization methods and general information about the samples, utilizing zero-shot and few-shot prompting to elicit more detailed and accurate responses. We achieved the best results, an F1 score of 0.88 on the sample extraction task, using Claude 2 chat. Our findings demonstrate the utility of language models for more effective and practical retrieval of synthesis parameters from literature.
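For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score for an extraction task can be computed against a manually validated set. It assumes exact-match comparison between extracted and reference items, which may differ from the matching criteria used in the paper; the function name `f1_score` and the sample descriptors are hypothetical and serve only as illustration.

```python
def f1_score(predicted: set[str], gold: set[str]) -> float:
    """F1 between extracted items and a manually validated reference set.

    Assumes exact string match; the paper's matching rule may differ.
    """
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of extractions that are correct
    recall = true_positives / len(gold)          # share of reference items recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample descriptors, not taken from the paper.
gold = {"epoxy/silica 1 wt%", "epoxy/silica 5 wt%", "neat epoxy"}
predicted = {"epoxy/silica 1 wt%", "neat epoxy", "epoxy/alumina 2 wt%"}

score = f1_score(predicted, gold)
print(f"F1 = {score:.2f}")
```

Here two of the three extracted samples match the reference set, so precision and recall are both 2/3 and so is the F1 score.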
Cite
Text
Circi et al. "Retrieval of Synthesis Parameters of Polymer Nanocomposites Using LLMs." NeurIPS 2023 Workshops: AI4Mat, 2023.
Markdown
[Circi et al. "Retrieval of Synthesis Parameters of Polymer Nanocomposites Using LLMs." NeurIPS 2023 Workshops: AI4Mat, 2023.](https://mlanthology.org/neuripsw/2023/circi2023neuripsw-retrieval/)
BibTeX
@inproceedings{circi2023neuripsw-retrieval,
title = {{Retrieval of Synthesis Parameters of Polymer Nanocomposites Using LLMs}},
author = {Circi, Defne and Khalighinejad, Ghazal and Badhwar, Shruti and Dhingra, Bhuwan and Brinson, L.},
booktitle = {NeurIPS 2023 Workshops: AI4Mat},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/circi2023neuripsw-retrieval/}
}