Autonomous chemical research with large language models

  • Brown, T. et al. in Advances in Neural Information Processing Systems Vol. 33 (eds Larochelle, H. et al.) 1877–1901 (Curran Associates, 2020).

  • Thoppilan, R. et al. LaMDA: language models for dialog applications. Preprint at https://arxiv.org/abs/2201.08239 (2022).

  • Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).

  • Hoffmann, J. et al. Training compute-optimal large language models. In Advances in Neural Information Processing Systems 30016–30030 (NeurIPS, 2022).

  • Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 1–113 (2022).

  • Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  • Luo, R. et al. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform. 23, bbac409 (2022).

  • Irwin, R., Dimitriadis, S., He, J. & Bjerrum, E. J. Chemformer: a pre-trained transformer for computational chemistry. Mach. Learn. Sci. Technol. 3, 015022 (2022).

  • Kim, H., Na, J. & Lee, W. B. Generative chemical transformer: neural machine learning of molecular geometric structures from chemical language via attention. J. Chem. Inf. Model. 61, 5804–5814 (2021).

  • Jablonka, K. M., Schwaller, P., Ortega-Guerrero, A. & Smit, B. Leveraging large language models for predictive chemistry. Preprint at https://chemrxiv.org/engage/chemrxiv/article-details/652e50b98bab5d2055852dde (2023).

  • Xu, F. F., Alon, U., Neubig, G. & Hellendoorn, V. J. A systematic evaluation of large language models of code. In Proc. 6th ACM SIGPLAN International Symposium on Machine Programming 1–10 (ACM, 2022).

  • Nijkamp, E. et al. CodeGen: an open large language model for code with multi-turn program synthesis. In Proc. 11th International Conference on Learning Representations (ICLR, 2022).

  • Kaplan, J. et al. Scaling laws for neural language models. Preprint at https://arxiv.org/abs/2001.08361 (2020).

  • OpenAI. GPT-4 Technical Report (OpenAI, 2023).

  • Ziegler, D. M. et al. Fine-tuning language models from human preferences. Preprint at https://arxiv.org/abs/1909.08593 (2019).

  • Ouyang, L. et al. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems 27730–27744 (NeurIPS, 2022).

  • Granda, J. M., Donina, L., Dragone, V., Long, D.-L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).

  • Caramelli, D. et al. Discovering new chemistry with an autonomous robotic platform driven by a reactivity-seeking neural network. ACS Cent. Sci. 7, 1821–1830 (2021).

  • Angello, N. H. et al. Closed-loop optimization of general reaction conditions for heteroaryl Suzuki–Miyaura coupling. Science 378, 399–405 (2022).

  • Adamo, A. et al. On-demand continuous-flow production of pharmaceuticals in a compact, reconfigurable system. Science 352, 61–67 (2016).

  • Coley, C. W. et al. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science 365, eaax1566 (2019).

  • Burger, B. et al. A mobile robotic chemist. Nature 583, 237–241 (2020).

  • Auto-GPT: the heart of the open-source agent ecosystem. GitHub https://github.com/Significant-Gravitas/AutoGPT (2023).

  • BabyAGI. GitHub https://github.com/yoheinakajima/babyagi (2023).

  • Chase, H. LangChain. GitHub https://github.com/langchain-ai/langchain (2023).

  • Bran, A. M., Cox, S., White, A. D. & Schwaller, P. ChemCrow: augmenting large-language models with chemistry tools. Preprint at https://arxiv.org/abs/2304.05376 (2023).

  • Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 195 (2021).

  • Bai, Y. et al. Constitutional AI: harmlessness from AI feedback. Preprint at https://arxiv.org/abs/2212.08073 (2022).

  • Falcon LLM. TII https://falconllm.tii.ae (2023).

  • Open LLM Leaderboard. Hugging Face https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard (2023).

  • Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 248 (2023).

  • Reaxys https://www.reaxys.com (2023).

  • SciFinder https://scifinder.cas.org (2023).

  • Yao, S. et al. ReAct: synergizing reasoning and acting in language models. In Proc. 11th International Conference on Learning Representations (ICLR, 2022).

  • Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems 24824–24837 (NeurIPS, 2022).

  • Long, J. Large language model guided tree-of-thought. Preprint at https://arxiv.org/abs/2305.08291 (2023).

  • Opentrons Python Protocol API. Opentrons https://docs.opentrons.com/v2/ (2023).

  • Tu, Z. et al. Approximate nearest neighbor search and lightweight dense vector reranking in multi-stage retrieval architectures. In Proc. 2020 ACM SIGIR on International Conference on Theory of Information Retrieval 97–100 (ACM, 2020).

  • Lin, J. et al. Pyserini: a Python toolkit for reproducible information retrieval research with sparse and dense representations. In Proc. 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2356–2362 (ACM, 2021).

  • Qadrud-Din, J. et al. Transformer based language models for similar text retrieval and ranking. Preprint at https://arxiv.org/abs/2005.04588 (2020).

  • Paper QA. GitHub https://github.com/whitead/paper-qa (2023).

  • Robertson, S. & Zaragoza, H. The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retrieval 3, 333–389 (2009).

  • Data Mining. Mining of Massive Datasets (Cambridge Univ., 2011).

  • Johnson, J., Douze, M. & Jegou, H. Billion-scale similarity search with GPUs. IEEE Trans. Big Data 7, 535–547 (2021).

  • Vechtomova, O. & Wang, Y. A study of the effect of term proximity on query expansion. J. Inf. Sci. 32, 324–333 (2006).

  • Running experiments. Emerald Cloud Lab https://www.emeraldcloudlab.com/guides/runningexperiments (2023).

  • Sanchez-Garcia, R. et al. CoPriNet: graph neural networks provide accurate and rapid compound price prediction for molecule prioritisation. Digital Discov. 2, 103–111 (2023).

  • Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint at https://arxiv.org/abs/2303.12712 (2023).

  • Ramos, M. C., Michtavy, S. S., Porosoff, M. D. & White, A. D. Bayesian optimization of catalysts with in-context learning. Preprint at https://arxiv.org/abs/2304.05341 (2023).

  • Perera, D. et al. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science 359, 429–434 (2018).

  • Ahneman, D. T., Estrada, J. G., Lin, S., Dreher, S. D. & Doyle, A. G. Predicting reaction performance in C–N cross-coupling using machine learning. Science 360, 186–190 (2018).

  • Hickman, R. et al. Atlas: a brain for self-driving laboratories. Preprint at https://chemrxiv.org/engage/chemrxiv/article-details/64f6560579853bbd781bcef6 (2023).
