
Evo 1.5

About

Evo is a biological foundation model capable of long-context modeling and design.

Evo uses the StripedHyena architecture to enable modeling of sequences at a single-nucleotide, byte-level resolution with near-linear scaling of compute and memory relative to context length. Evo has 7 billion parameters and is trained on OpenGenome, a prokaryotic whole-genome dataset containing ~300 billion tokens.
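To make "single-nucleotide, byte-level resolution" concrete, here is a minimal sketch of byte-level tokenization in which every nucleotide character becomes exactly one token. The function names `encode_nucleotides` and `decode_tokens` are hypothetical and are not part of the Evo codebase; the tokenizer shipped with the model may differ in details such as special tokens.

```python
def encode_nucleotides(seq: str) -> list[int]:
    """Map each character to its single-byte value, one token per nucleotide.

    Illustrative only: an assumption about what byte-level tokenization
    looks like, not Evo's actual tokenizer.
    """
    return list(seq.encode("ascii"))


def decode_tokens(tokens: list[int]) -> str:
    """Invert encode_nucleotides by turning byte values back into characters."""
    return bytes(tokens).decode("ascii")


seq = "ACGTTGCA"
tokens = encode_nucleotides(seq)
print(tokens)                  # [65, 67, 71, 84, 84, 71, 67, 65]
print(decode_tokens(tokens))   # ACGTTGCA
```

Because there is one token per base, the token count of a sequence equals its length in nucleotides, which is why context length can be read directly as a number of bases.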

Evo 1.5 extends the pretraining of the Evo 1 model pretrained at 8k context (evo-1-8k-base) on 50% more training data, for a total of 450 billion tokens.

| Checkpoint Name | Description |
| --- | --- |
| evo-1.5-8k-base | A model pretrained with 8,192 context, obtained by extending the pretraining of evo-1-8k-base to process 50% more training data. |
| evo-1-8k-base | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. |
| evo-1-131k-base | A model pretrained with 131,072 context using evo-1-8k-base as the initialization. We use this model to reason about and generate sequences at the genome scale. |
| evo-1-8k-crispr | A model fine-tuned from evo-1-8k-base on CRISPR-Cas systems. We use this model to generate Cas9/12/13 systems. |
| evo-1-8k-transposon | A model fine-tuned from evo-1-8k-base on transposons. We use this model to generate IS200/IS605 elements. |
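The context lengths listed for the base checkpoints above determine how much sequence fits in a single forward pass. The following sketch, which assumes one token per nucleotide, splits a long sequence into non-overlapping windows sized to a checkpoint's context. The `CONTEXT_LENGTHS` table and the `split_into_windows` helper are hypothetical names introduced for illustration, and non-overlapping windows are a simplification rather than the authors' procedure.

```python
# Context lengths as stated in the checkpoint descriptions (tokens = nucleotides).
CONTEXT_LENGTHS = {
    "evo-1.5-8k-base": 8192,
    "evo-1-8k-base": 8192,
    "evo-1-131k-base": 131072,
}


def split_into_windows(seq: str, checkpoint: str) -> list[str]:
    """Split seq into consecutive, non-overlapping windows of at most
    the checkpoint's context length. Hypothetical helper, not Evo API."""
    n = CONTEXT_LENGTHS[checkpoint]
    return [seq[i:i + n] for i in range(0, len(seq), n)]


genome_fragment = "ACGT" * 5000  # 20,000 nucleotides of toy data
windows_8k = split_into_windows(genome_fragment, "evo-1.5-8k-base")
windows_131k = split_into_windows(genome_fragment, "evo-1-131k-base")
print([len(w) for w in windows_8k])    # [8192, 8192, 3616]
print([len(w) for w in windows_131k])  # [20000]
```

A 20,000-nucleotide fragment needs three windows at 8,192 context but fits in one at 131,072, which matches the description of the 131k checkpoint as the one suited to genome-scale sequences.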

How to use Evo

Example usage is provided in the standalone repo.

Cite

@article{nguyen2024sequence,
   author = {Eric Nguyen and Michael Poli and Matthew G. Durrant and Brian Kang and Dhruva Katrekar and David B. Li and Liam J. Bartie and Armin W. Thomas and Samuel H. King and Garyk Brixi and Jeremy Sullivan and Madelena Y. Ng and Ashley Lewis and Aaron Lou and Stefano Ermon and Stephen A. Baccus and Tina Hernandez-Boussard and Christopher Ré and Patrick D. Hsu and Brian L. Hie},
   title = {Sequence modeling and design from molecular to genome scale with Evo},
   journal = {Science},
   volume = {386},
   number = {6723},
   pages = {eado9336},
   year = {2024},
   doi = {10.1126/science.ado9336},
   URL = {https://www.science.org/doi/abs/10.1126/science.ado9336},
}