# AstroMLab: Advancing Astronomy with AI

## Who We Are

AstroMLab is a diverse group of researchers dedicated to advancing the application of Large Language Models (LLMs) in astronomy. Our team includes:

- Leading astronomers, astrophysicists, and cosmologists
- Natural language processing experts from Oak Ridge National Laboratory and Argonne National Laboratory
- Frontier arXivists from the NASA Astrophysics Data System
- Early career researchers bridging astronomy and AI

## Objectives

- Develop specialized LLMs for astronomy
- Create open-source models for advanced research
- Facilitate LLM-driven end-to-end research in astronomy

## Current Work

Our ongoing projects include:

- Curation of a benchmarking dataset for astronomy
- Development of specialized astronomy LLMs
- Performance evaluation of models on astronomical tasks

## Models and Performance

We have developed several models, including AstroSage-8B, AstroLLaMA-2-70B, and AstroLLaMA-3-8B. The table below compares AstroSage-8B and AstroLLaMA-2-70B with other models on astronomy Q&A tasks:

| Model | Score (%) |
|-------|-----------|
| AstroSage-8B (AstroMLab) | 77.2 |
| LLaMA-3.1-8B | 73.7 |
| AstroLLaMA-2-70B (AstroMLab) | 72.3 |
| Gemma-2-9B | 71.5 |
| Qwen-2.5-7B | 70.4 |
| Yi-1.5-9B | 68.4 |
| InternLM-2.5-7B | 64.0 |
| Mistral-7B-v0.3 | 63.9 |
| ChatGLM3-6B | 50.4 |
| AstroLLaMA-2-7B (UniverseTBD) | 44.3 |

AstroSage-8B, our lightweight model, achieves the highest score in this comparison (77.2%), 3.5 percentage points ahead of LLaMA-3.1-8B, the next-best model in the table.

## Support and Resources

Our research benefits from:

- Access to Frontier nodes at the Oak Ridge Leadership Computing Facility
- Support from Microsoft's Accelerating Foundation Models Research (AFMR) program

## Open Science

We are committed to open science principles. Our models are available on [Hugging Face](https://huggingface.co/AstroMLab).
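
As a minimal sketch of how one of these models might be loaded with the `transformers` library: the repository id `AstroMLab/AstroSage-8B`, the prompt wording, and the helper names `build_prompt` and `answer` are assumptions for illustration, not confirmed details of our releases. Check the organization page for the actual repository names and any recommended chat template.

```python
# Hypothetical repository id; verify the real name on the AstroMLab organization page.
MODEL_ID = "AstroMLab/AstroSage-8B"


def build_prompt(question: str) -> str:
    """Wrap an astronomy question in a plain prompt (illustrative format only)."""
    return (
        "Answer the following astronomy question concisely.\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


def answer(question: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Download the model weights and generate an answer to the question."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `answer("What is the Chandrasekhar limit?")` would download the weights on first use, so it needs network access and enough memory for a model of this size.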

## Contact

For inquiries or collaboration opportunities, please contact: astromachinelearninglab@gmail.com