AstroMLab

AstroMLab is a diverse group of researchers dedicated to advancing the application of Large Language Models (LLMs) in astronomy. Our team includes:

Leading astronomers, astrophysicists, and cosmologists
Natural language processing experts
Frontier arXivists from the NASA Astrophysics Data System

Objectives

Develop specialized LLMs for astronomy
Create open-source models for advanced research
Facilitate LLM-driven end-to-end research in astronomy

Current Work

Our ongoing projects include:

Curation of an astronomy-based benchmarking dataset
Development of specialized astronomy LLMs
Performance evaluation of models on astronomical tasks

Models and Performance

We have developed several models, including AstroSage-8B, AstroLLaMA-2-70B, and AstroLLaMA-3-8B. Scores on astronomy Q&A tasks are listed below; AstroSage-8B reaches 77.2% (Ting et al. 2024; Pan et al. 2024):

Model	Score (%)
AstroSage-8B (AstroMLab)	77.2
LLaMA-3.1-8B	73.7
AstroLLaMA-2-70B (AstroMLab)	72.3
Gemma-2-9B	71.5
Qwen-2.5-7B	70.4
Yi-1.5-9B	68.4
InternLM-2.5-7B	64.0
Mistral-7B-v0.3	63.9
ChatGLM3-6B	50.4
AstroLLaMA-2-7B (UniverseTBD)	44.3

AstroSage-8B, our lightweight model, currently achieves the highest score in this comparison.
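
For readers who want to work with these numbers, the following is a minimal sketch that ranks the models and reports each score as a margin, in percentage points, over a chosen baseline. The scores are copied from the table above. The helper name margins_over and the choice of LLaMA-3.1-8B as the baseline (it is simply the next-highest entry) are illustrative assumptions, not part of the AstroMLab evaluation code.

```python
# Scores copied from the table above (percent correct on astronomy Q&A).
scores = {
    "AstroSage-8B": 77.2,
    "LLaMA-3.1-8B": 73.7,
    "AstroLLaMA-2-70B": 72.3,
    "Gemma-2-9B": 71.5,
    "Qwen-2.5-7B": 70.4,
    "Yi-1.5-9B": 68.4,
    "InternLM-2.5-7B": 64.0,
    "Mistral-7B-v0.3": 63.9,
    "ChatGLM3-6B": 50.4,
    "AstroLLaMA-2-7B": 44.3,
}


def margins_over(baseline, table):
    """Hypothetical helper: each model's score minus the baseline's score,
    in percentage points, rounded to one decimal place."""
    base = table[baseline]
    return {name: round(score - base, 1) for name, score in table.items()}


# Rank models from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)

# Baseline is an illustrative choice: the next-highest entry in the table.
margins = margins_over("LLaMA-3.1-8B", scores)

for name in ranked:
    print(f"{name:20s} {scores[name]:5.1f} {margins[name]:+5.1f}")
```

With this baseline, AstroSage-8B sits 3.5 points above LLaMA-3.1-8B and AstroLLaMA-2-70B sits 1.4 points below it.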

Support and Resources

Our research benefits from:

Access to the Frontier nodes at Oak Ridge Leadership Computing Facility
Support from Microsoft's Accelerating Foundation Models Research (AFMR) program

Contact

For inquiries or collaboration opportunities, please contact: [email protected]